Project Summary/Abstract
Transcription factors (TFs) are sequence-specific DNA-binding proteins that dictate cell fate by controlling
selective usage of genomic information. Not surprisingly, malfunctioning TFs result in a broad range of
diseases including cancer, autoimmune diseases, diabetes, and congenital heart defects. Mutations in cardiac
transcription factors (TBX5, NKX2-5, and GATA4) have been linked to congenital heart diseases (CHDs). CHDs
represent the most common form of birth defects and are diagnosed in nearly 40,000 births every year in the
US. Cardiac TFs (TBX5, NKX2-5, and GATA4) are part of an intricate gene regulatory network that controls
heart development. The long-term goal of the proposed work is to uncover the molecular mechanisms by
which transcription factors (TFs) decode genomic information, and how genetic variation modulates TF-
genome interactions in ways that lead to disease. In Specific Aim 1, we will uncover the DNA-sequence
specificity of multi-TF complexes involved in heart development and function. Using in vitro selection
experiments coupled to next-generation sequencing (SELEX-Seq), we will determine the DNA-binding
specificity of TBX5, NKX2-5, and GATA4 complexes. Our experiments will reveal the complex grammar of TF
binding site (TFBS) arrangements used by cooperative TF complexes to regulate gene expression. Given their
essential role in heart development and cardiac function, non-synonymous mutations of TBX5, NKX2-5, and
GATA4 have been discovered in patients with a range of congenital heart defects (e.g., Tetralogy of Fallot,
Holt-Oram syndrome, atrial and ventricular septal defects). Several of these missense mutations have been
mapped to the DNA-binding domain of the cardiac TFs. We hypothesize that missense mutations in the DNA-
binding domains of cardiac TFs result in altered DNA-binding specificity landscapes, and subsequently
reshape the gene regulatory networks controlling normal heart development and function. In Specific Aim 2,
we will measure the comprehensive protein-DNA interactomes of missense mutants of TBX5, NKX2-5, and
GATA4 using SELEX-Seq, and use bioinformatics tools to predict changes in gene targets between wild-type
and mutant TFs. Moreover, over 93% of genetic variants associated with cardiovascular diseases occur in non-
coding regions of the genome. These variants can potentially alter how TFs interact with the genome, either by
disrupting existing TF binding sites or by creating novel ones, and thereby rewire gene regulatory pathways controlling cardiac
homeostasis. In Specific Aim 3, we will integrate the intrinsic DNA-binding specificity of cardiac TFs with non-coding
genetic variants associated with cardiovascular disease to identify putative TF binding sites that are disrupted
or created. Disruption or creation of TFBSs of cardiac TFs will be validated by in vitro binding experiments.
Completion of the proposed project will enhance our understanding of how genetic variation (coding and non-
coding) contributes to normal heart development and to congenital heart diseases at the molecular level.