Discovery of rare splicing variants in whole blood and lung - PROJECT SUMMARY

Genetic factors have been shown to contribute to susceptibility to chronic obstructive pulmonary disease (COPD), and genome-wide association studies (GWAS) have identified 82 loci associated with COPD risk. We have previously found that many COPD risk variants identified by GWAS are associated with alternative splicing; specifically, we have observed that genetically driven splicing of genes including NPNT, FBXO38, and BTC contributes to COPD risk. However, common variants associated with disease have modest effects on disease risk, and the genetic loci identified to date through GWAS explain only 5-10% of the heritability of COPD or measures of lung function. While GWAS effectively identifies common genetic variants associated with disease, rare variants also contribute to COPD risk, as has been found for other complex diseases; examples include the variants underlying alpha-1 antitrypsin deficiency and cutis laxa, and variants in the cystic fibrosis transmembrane conductance regulator and telomere-related genes. Recent large-scale genome and exome sequencing studies have identified vast numbers of novel variants and have provided an opportunity to investigate the impact of rare coding variants on complex human traits and diseases. However, a major challenge in interpreting rare variants is the currently limited ability to infer their functional and clinical impact. As with common variants, it is likely that many rare variants influence the regulation and expression of causal genes. Variants that alter splicing are of particular interest, as they can produce drastically altered RNA isoforms, for example through frameshifts or loss of functionally important domains. Our central hypothesis is that rare genetic variants contribute to COPD by causing aberrant splice events with large effects.
This work will build directly on Dr. Saferali’s (PI) K01-funded project, which identified common genetic variants that contribute to COPD by modulating splicing. In this proposal, we will use resources from two large cohort studies: the Lung Tissue Research Consortium (LTRC) and the Genetic Epidemiology of COPD Study (COPDGene). In Aim 1, we will discover outlier splicing events using RNA-seq data from LTRC lung tissue and COPDGene whole blood, identify the rare variants that cause those events, and test for association between those variants and COPD and related phenotypes. In Aim 2, we will use long-read sequencing to identify full-length isoform sequences in individuals with outlier splicing, and then predict the protein-level impact of each splice event. In summary, this proposal from a productive early-stage investigator leverages data and skills acquired during the PI’s K01 funding period, together with substantial resources from ongoing cohort studies, to explore a new approach to identifying genes and mechanisms that contribute to COPD and related phenotypes. Importantly, the proposed work will yield urgently needed insights into the role of rare variants in COPD risk while also generating preliminary data that will form the basis of the PI’s future independent research program.