Linking microbiome genetic variants with cardiovascular phenotypes in 50,000 individuals - PROJECT SUMMARY / ABSTRACT
The human body is home to a complex community of microorganisms (“microbiome”) that differs in composition
between people, and these differences have numerous correlates with cardiovascular disease (CVD). Any two people
harbor different strains of a given species, whose genomes can differ more than those of a human and a
chimpanzee, with fewer than 60% of their genes shared. Even within a single person, each microbiome species
may be a complex mixture of strains
with different genomes and functional capabilities. This striking within-species genetic diversity has functional
consequences for CVD, because gene loss and gain modify how strains process our diet, metabolize drugs, and
stimulate inflammation. Hence, a population genetic approach is essential for revealing causal links between the
microbiome and CVD.
We have compiled a deeply phenotyped cohort of ~50,000 individuals with metagenomic sequencing of their gut
microbiomes. This dataset includes ~8,000 people with atherosclerosis, thousands with measurements of heart
function and metabolic health, and hundreds with acute coronary syndrome. This cohort is a unique and ideal
setting to perform a well-powered CVD metagenome-wide association study (MWAS).
Several barriers must be overcome before MWAS can be deployed at this scale. First, we must reduce the
currently prohibitive computational cost of genotyping thousands of microbiome species across ~50,000 people.
Second, to ensure that association tests do not have high false positive rates, we need statistical models
that adjust for microbial population structure within and across hosts. The goal of this proposal is to create a
research toolbox that addresses these challenges and to identify putative mechanistic links between the
microbiome and CVD. We will develop data structures and query algorithms for accelerated genotype estimation,
as well as mixed-effects models for accurate association tests. All code and methods will be open source and designed
to be easily extended to other microbiome cohorts.
Applying these tools to our cohort, we aim to identify specific microbial genes and pathways responsible for
known associations between microbes and CVD. We also expect to discover new associations that were missed
because previous cohorts were too small or were analyzed with methods that ignore differences in gene content
across strains. These findings will be used to identify microbial biomarkers for CVD diagnosis and personalized
treatment, and to design microbiome-targeted drugs, prebiotics, and probiotics to treat heart disease.