Abstract
The last four decades have produced an enormous catalog of human genomic variants, which has the potential
to revolutionize human genetics. Among the variants identified in the human “variome” so far, some appear
benign, i.e. they do not seem to confer any particular phenotype, and a significant proportion are associated or
potentially associated with one or more genetically inherited disorders. An even greater percentage of
observed human variants, 99% of missense variants, remain uninterpreted or annotated as variants of unknown
significance (VUSs). To translate this huge amount of genetic information into general principles underlying
genotype-phenotype relationships as well as molecular mechanisms responsible for the development of inherited
disease, there is an urgent need for large-scale, systematic, high-throughput “functional characterization”
projects such as those envisioned within the new “Impact of Genomic Variation on Function” (IGVF) Consortium
proposed by NHGRI.
Although most monogenic Mendelian disorders are individually rare, collectively these diseases affect 20
million Americans. The ClinVar database describes over 260,000 missense variants within 3,671 Mendelian
disease genes, classified as pathogenic, benign, or VUSs. We currently lack strong and comprehensive
evidence to systematically analyze coding variants across the spectrum of human Mendelian diseases.
We propose to functionally characterize ~100,000 variants across most of the known Mendelian disease-
associated genes by comparing wild-type, or “reference”, gene products and their corresponding variants for a
rich array of fundamental protein properties and phenotypic impacts, including protein stability (expression),
subcellular localization, cell viability, cell morphology, and the ability to mediate macromolecular interactions with
protein partners.
Our Variant Characterization Across the Mendelian Proteome (VarChAMP) Center will generate a searchable
and widely available catalog of these variant effects via the IGVF Data and Administrative Coordinating Centers
(DACCs), and assist in the “Predictive Modeling Projects” to carry out variant effect predictive modeling using
these data. In addition to providing a rich source of functional information on tens of thousands of genomic variants
in the next five years, all concepts, technologies, and resources generated during this project are exportable
and will be shared to enable others, both inside and outside the IGVF consortium, to leverage our approach in
their own studies and expand the catalog.