A new reference panel to boost African American genotype imputation - Project Summary
Modern genetic studies have been conducted predominantly in cohorts of individuals of European
ancestry. By 2010, there were approximately ten times as many published genome-wide association
studies (GWAS) in people of European ancestry as in people of all other ancestries combined.
This research disparity has led to an uneven understanding of the genetic basis underlying disease in
Europeans and non-Europeans.
23andMe's web-based, large-scale research model is well suited to expanding genetics research within
non-European populations and thereby bringing more parity to genetics research. Our database is
composed of genotypes and phenotypes of over 1,000,000 consenting customers, including over 200,000
individuals of non-European ancestry. The data derived from non-European individuals represent a
particularly valuable resource for genetic discovery of novel variants that may not be found in the European
population. However, research studies in non-European populations are limited by the lack of
large-scale reference datasets and, in particular, genotype imputation panels.
Genotype imputation is a statistical methodology that uses observations of genotypes in a large reference
panel to infer unobserved genotypes in a target dataset. This methodology is widely used within GWAS,
and allows novel genetic associations to be identified and refined. Because of this utility, very large reference
panels have been constructed, containing thousands or tens of thousands of whole genome sequences.
Unfortunately, the largest imputation panels are composed of predominantly European genomes, reflecting
the modern bias towards European studies in GWAS.
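
To make the idea of imputation concrete, the following is a minimal toy sketch, not the method proposed here. It assumes haplotypes are encoded as lists of 0/1 alleles with None marking an unobserved site, and it fills each gap by copying from the single closest reference haplotype. The function name impute_from_panel and the example data are hypothetical; production imputation software typically relies on probabilistic haplotype models rather than this nearest-match rule.

```python
from typing import List, Optional

Haplotype = List[int]
PartialHaplotype = List[Optional[int]]


def impute_from_panel(target: PartialHaplotype, panel: List[Haplotype]) -> Haplotype:
    """Fill unobserved alleles (None) in `target` using the closest panel haplotype.

    Toy nearest-match rule: choose the reference haplotype with the fewest
    mismatches at the observed sites, then copy its alleles into the gaps.
    """
    def mismatches(ref: Haplotype) -> int:
        # Count disagreements only at sites that were actually observed.
        return sum(1 for t, r in zip(target, ref) if t is not None and t != r)

    best = min(panel, key=mismatches)
    # Keep observed alleles; borrow the reference allele where data are missing.
    return [r if t is None else t for t, r in zip(target, best)]


# Hypothetical reference panel of three haplotypes over five variant sites.
reference_panel = [
    [0, 0, 1, 0, 1],
    [1, 1, 0, 1, 0],
    [0, 1, 1, 0, 0],
]

# Target haplotype genotyped at sites 0, 2 and 4 only.
target = [1, None, 0, None, 0]

imputed = impute_from_panel(target, reference_panel)
print(imputed)
```

The sketch also illustrates why panel composition matters: if no haplotype in the panel resembles the target, the copied alleles are more likely to be wrong, which is the limitation a population-matched panel is meant to reduce.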
This proposal aims to address this imbalance by constructing an imputation panel specifically for the
African American population. In doing so, we will expand 23andMe’s ability to perform genetic discovery in
non-European populations, and improve the understanding of global genetic variation underlying diseases
and traits. Key commercial outcomes of the research include the identification of novel genetic targets for
internal and external therapeutic development. The long-term aim is to improve understanding of disease in
minority populations, which we hope may eventually lead to improved treatments of disease in these
historically medically understudied groups.