Ischemic stroke is the fourth leading cause of death in the U.S. and a major cause of disability. The etiology
of stroke is multifactorial and poorly understood. Genetics is a potentially powerful tool for better understanding
disease etiology as it can highlight biological mechanisms underlying disease and point the way to improved
prevention, treatment, and outcome. Large genome-wide association studies (GWAS) of ischemic stroke (IS)
populations have successfully identified stroke-risk-associated loci with small effect sizes. However,
the role of copy number variation (CNV) in stroke susceptibility has yet to be explored; it is the
premise of our proposal. Studies of CNV have revealed important insights into numerous other complex diseases.
Further, we recently demonstrated that a higher CNV burden genome-wide is associated with poorer stroke
outcome at 3 months. We therefore hypothesize that CNV analyses of existing GWAS and exome data will be
a highly effective and cost-efficient methodology to identify novel associations illuminating stroke mechanisms,
treatment targets, and outcome drivers. We further speculate that these analyses will identify CNVs of large
effect size in ischemic stroke, as suggested by the numerous monogenic, syndromic, and complex
diseases associated with CNV, and that CNV may help explain the ‘missing heritability’ known to exist in stroke.
For this application, we have already assembled over 24,500 well-phenotyped IS cases, including IS
subtypes, and over 43,500 controls, all with readily available genotyping on GWAS and exome arrays, and with
stroke outcome measures for cases. To evaluate CNV-associated stroke risk and stroke outcome, we will: 1)
perform Risk Discovery using several analytic approaches to identify CNVs that are associated with the risk of
IS and its subtypes across the age, sex, and ethnicity spectrums; 2) perform Risk Replication and Extension
to determine whether the identified stroke-associated CNVs replicate in the ethnically diverse TOPMed
Consortia and then, using existing TOPMed and GeneStroke Consortium biomarker data (e.g., methylation,
proteomic, RNA, and miRNA), evaluate how the identified CNVs exert their effects on stroke risk; and
3) perform outcome-based Replication and Extension analyses of our recent findings that a higher CNV
burden is associated with poorer stroke outcome at 3 months (modified Rankin Scale, mRS) in these additional
datasets, and then determine the key CNV drivers responsible for these associations using existing biomarker data.
Our study will leverage the numerous advantages of existing case-control data sets to explore the
relationships between CNV and IS, its subtypes, and outcome at 3 months across the sex, age, and
ethnicity spectrums. The proposed study creates a new training network for junior investigators and establishes
a unique resource for the continued study of the genetic basis of IS. The successful identification of novel
genes, pathways, and drug targets has the potential to transform our understanding of stroke
pathophysiology, leading to more effective prevention, treatment, and outcome strategies.