High-throughput cellular genetics to connect noncoding variants to coronary artery disease genes - ABSTRACT
Despite effective therapies to control blood pressure and cholesterol, coronary artery disease (CAD) remains
the leading cause of death in the United States and the world. Recent genome-wide association studies (GWAS)
have identified >200 genetic loci significantly associated with CAD. The majority of these loci are not associated
with hypertension or hyperlipidemia, but contain genes expressed in vascular cells, suggesting the presence of
undiscovered CAD mechanisms operating through cells of the blood vessel wall. Identifying the biologic
mechanisms of these loci is difficult because most of the common variants associated with CAD are noncoding,
likely functioning through enhancers that can regulate multiple genes at great distances and can be highly cell-type
specific. This has slowed discovery, so that only a few such loci have been ‘solved’, preventing the realization of
the full potential of genetic studies for the development of much-needed new classes of therapies for CAD.
What is needed are much higher-throughput, unbiased methods to systematically dissect these loci and identify
molecular and cellular mechanisms of disease. To accomplish this goal, we have developed and validated two
novel high-throughput approaches with the ability to quantify the molecular and morphological effects of
thousands of genes in CAD loci: optimized Perturb-seq (to link genes to transcriptional phenotypes) and pooled
perturbation Cell Painting (to link genes to morphological effects).
We propose to use these technologies, together with new computational approaches, to: 1) analyze
the transcriptomic effects of perturbing CAD-locus genes in vascular endothelial cells (ECs) subjected to four
stimuli associated with atherosclerosis, 2) quantify the morphological effects of perturbing all CAD-locus genes
in ECs, 3) integrate these data with epigenomic and genetic data to nominate causal genes and build hypotheses
connecting variants to genes, and genes to disease-associated cellular phenotypes, and then test the top five
hypotheses of each type using variant-editing and EC-functional assays. These include several prioritized
hypotheses about causal variants that regulate the EC shear stress response upstream of KLF2 expression.
The completion of these studies will accelerate understanding of the links between CAD genetics and disease,
and provide insights into EC functions in CAD that may inform the development of new classes of therapies.
More broadly, this study will establish new scalable experimental and computational methods to link noncoding
disease variants to genes and cellular phenotypes, which could dramatically accelerate variant-to-function
studies for cardiovascular and other common diseases.