PROJECT SUMMARY
Congenital heart disease (CHD) is a group of severe birth defects that collectively represent the leading cause
of birth defect-associated illness and death. Despite the extensive use of clinical genetic testing and whole exome
sequencing (WES), fewer than a third of CHD cases can currently be accounted for by mutations in protein-coding
genes. Many of the remaining, currently unexplained cases are assumed to be due to non-coding sequence
variants that alter the expression of genes essential for cardiac development. To uncover non-coding variants in
CHD patients, the National Heart, Lung, and Blood Institute's Bench to Bassinet (B2B) and TOPMed programs
are using whole genome sequencing (WGS) on large CHD patient cohorts, principally for probands whose prior
WES failed to uncover a likely causative coding variant. WGS data for 1,831 patient-parent trios from the B2B cohort
are currently available, and several hundred additional trios are being sequenced. Initial analyses of ~750
probands have already identified over 2,000 de novo variants in predicted fetal human heart enhancers, along
with a statistically significant excess of genetic loci (27 genes versus 3.7 expected, p = 1 × 10⁻⁵) at which the
neighboring human fetal heart enhancers showed multiple de novo variants in cases. This suggests that CHD
risk may be conferred through dysregulation of the target genes of these enhancers. However, the causality
of these variants in CHD, as well as the molecular underpinnings of their potential pathogenicity, remain to be
demonstrated. Building on our extensive previous work in mapping and characterizing cardiac enhancers at
scale, we propose to perform systematic in vivo functional validation of de novo sequence variants from CHD
patients that reside in predicted heart enhancers to reveal enhancer mutations that contribute to the etiology of
CHD. We will 1) use a combination of comprehensive maps of predicted human heart enhancers, genetic and
epigenomic analysis tools, and massively parallel reporter assays in cardiomyocytes differentiated from induced
pluripotent stem cells (iPSC-CMs) to identify and prioritize cardiac enhancers harboring de novo variants from
CHD patients, 2) use our world-class mouse transgenesis pipeline in combination with novel single-cell
characterization methods to test the reference and variant alleles of 200 prioritized enhancers (400 alleles in
total) at appropriate stages of cardiac development to assess how the risk alleles alter enhancer function in vivo
at cellular resolution, and 3) use CRISPR/Cas9 genome engineering to generate 20 knock-in mouse models for
human CHD variant alleles that alter enhancer activity and matched human reference alleles to assess their
impact on the structure and function of the heart using a combination of single-cell transcriptomics and cardiac
phenotyping. Successful completion of the proposed studies will provide foundational insights into the role of
non-coding regulatory sequences in the most common severe human birth defect, identify specific examples of
human enhancer variants conclusively implicated in disease, and yield initial mechanistic insights into their
respective modes of action, opening new avenues for exploring future therapeutics.