NHGRI U24: ATLAS OF REGULATORY VARIANTS IN DISEASE (ARVID)
Genome-wide association studies (GWAS) have identified thousands of single nucleotide
polymorphisms (SNPs) linked to risk of developing specific non-cancerous polygenic diseases,
including ischemic heart disease, chronic obstructive pulmonary disease, Alzheimer’s dementia,
type 2 diabetes, and ischemic stroke. These disease-linked SNPs concentrate in regulatory DNA
that is active in disease-relevant cell types and that may mediate disease risk by modulating genes
(eGenes) whose expression levels may be important in pathogenesis. These disease-linked expression SNPs (eSNPs)
commonly alter transcription factor (TF) DNA binding motifs, indicating they may affect regulatory
DNA activity by changing gene regulator binding. This U24 proposal aims to generate a genomic
resource, the Atlas of Regulatory Variants in Disease (ARVID), containing 3 broad
categories of information: 1) the individual disease-linked human eSNPs with differential gene
regulatory function in relevant cell types; 2) the target genes (eGenes) that these eSNPs
dysregulate; and 3) the gene regulators whose DNA association these disease eSNPs alter.
First, we will use massively parallel reporter assays (MPRA) to identify the specific functionally
altered eSNPs among those linked to index SNPs identified by GWAS in the 5 widespread human
diseases noted above. A resulting subset of 300 top disease risk and non-risk eSNP pairs will
then be deeply characterized in isogenic cells generated by gene editing to identify directly and
indirectly dysregulated target genes. This effort will produce a Genomic Compendium of a) the
disease-linked eSNPs that quantitatively impact regulatory DNA function in disease-relevant cell
types and b) the eGenes for the 300 top disease eSNPs.
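As an illustration only, the idea of comparing reporter activity between the risk and non-risk allele of each eSNP and then keeping the top pairs can be sketched as follows. This is a minimal sketch under stated assumptions, not the project's analysis pipeline: the function names, the pseudocount, the log2(RNA/DNA) activity measure, and ranking by absolute effect size are all hypothetical choices made for the example. A real MPRA analysis would typically also use a replicate-aware statistical test.

```python
import math

def activity(rna_counts, dna_counts, pseudocount=1.0):
    """Mean log2(RNA/DNA) across replicates for one allele's reporter construct.

    The pseudocount (an assumption of this sketch) avoids division by zero.
    """
    ratios = [
        math.log2((rna + pseudocount) / (dna + pseudocount))
        for rna, dna in zip(rna_counts, dna_counts)
    ]
    return sum(ratios) / len(ratios)

def allelic_skew(risk, nonrisk):
    """Difference in log2 activity, risk allele minus non-risk allele.

    Each argument is a (rna_counts, dna_counts) tuple of per-replicate counts.
    """
    return activity(*risk) - activity(*nonrisk)

def rank_pairs(pairs, top_n=300):
    """Return the top_n (name, skew) tuples ordered by absolute allelic skew.

    `pairs` is an iterable of (name, risk, nonrisk) tuples (hypothetical layout).
    """
    scored = [(name, allelic_skew(risk, nonrisk)) for name, risk, nonrisk in pairs]
    scored.sort(key=lambda item: abs(item[1]), reverse=True)
    return scored[:top_n]
```

For example, `rank_pairs(pairs, top_n=300)` would return the 300 allele pairs with the largest activity difference from a list of per-replicate counts; the cutoff of 300 mirrors the number of pairs named above, while the ranking criterion itself is an assumption of this sketch.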
Second, we will identify the specific gene regulators whose DNA association is altered at the 300
disease risk eSNPs above, compared to matched non-risk alleles. To do this, we will use a live-cell
proteomics approach termed DNA Protein Interaction Detection (DAPID). Quantitative mass
spectrometry using isobaric tagging will be complemented by quantitative chromatin
immunoprecipitation (ChIP) assays using isogenic, disease-relevant cells that differ only at the
single eSNP nucleotide of interest. This effort will produce a Proteomic Atlas of differential regulator
binding at 300 reference-disease eSNP pairs.
This NHGRI U24 will generate a genomic resource defining the DNA variants, target genes, and
gene regulators involved in inherited risk for 5 common non-cancerous polygenic human diseases.