Project Summary
Despite significant progress by the NHGRI Centers for Mendelian Genetics (CMG) toward deciphering the genetic
causes of many rare disease phenotypes, more than half of the genes underlying Mendelian diseases
remain undiscovered. However, remarkable developments in genomics technologies and the aggregation of
massive reference datasets are poised to advance Mendelian gene discovery, provided these technologies can
be exploited using sophisticated analytic tools in large and diverse cohorts. Importantly, these methods and
datasets can catalyze gene discovery only if they are rapidly shared with the community, and enabling such sharing
has been a primary focus of our Broad Institute CMG. Here, we bring together an extraordinary team of
investigators with diverse expertise, complementary technologies, novel analytic methods, an established
recruitment network, and platforms that we have developed for data sharing to explore the genetic underpinnings
of Mendelian disease, and to further enable therapeutic development for rare diseases.
The Broad Institute Mendelian Genomics Research Center (MGRC) builds upon the world-class track record of
Mendelian gene discovery, methods development, and data sharing set by the Broad CMG. Our team has
invested considerable effort to develop widely adopted tools and platforms empowering variant analysis, such
as GATK, gnomAD, and seqr, and to facilitate open sharing of variants, data, and analysis tools. Over the last
four years, we have generated and shared data for over 15,000 samples from 7,600 families. In the process, we
have uncovered 256 novel disease-gene relationships, with 473 additional genes undergoing follow-up.
Our MGRC roadmap will rely on exome sequencing and rapid data sharing as the most efficient frontline
approach, given that the vast majority of CMG discoveries are derived from coding variants, with genome
sequencing then applied to cases that remain unsolved (Aim 1). Complementary approaches to discover variation not captured by
conventional methods will include emerging sequencing technologies, reference-free assembly, improved
annotation of evolutionary constraint, large-scale data aggregation, and novel analytic methods. Transcriptome
sequencing, epigenetic profiling, and CRISPR editing, as well as in vitro and in vivo functional modeling, will then
inform functional interpretation and mechanistic dissection (Aim 2). Finally, we will use our platforms to create
new tools and approaches for transformative data sharing across the MGRCs and the broader community (Aim
3). At their conclusion, these studies will significantly contribute to completing the catalog of genes underlying
Mendelian disease, providing new biological insights into their functional mechanisms, and openly sharing the
data, tools, and discoveries that we produce.