Statistical methods and analyses to study genetic variants and their roles in diseases leveraging functional genomics data.

Project Summary

Common genetic variation plays an important role in human diseases. A central goal of human genetic studies is to identify the genetic variants that cause disease and to understand the mechanisms by which they act. Many genome-wide association studies (GWAS) have been performed to identify associations between genetic variants and a myriad of human diseases. However, moving from GWAS results to the identification of causal variants, and to a mechanistic understanding of how those variants elicit disease, remains a major challenge in the field. This is the challenge that the PI’s research program aims to address. Recent research efforts have generated large-scale functional genomic datasets, in particular single-cell transcriptomic and epigenomic data. Because most common variants are located in noncoding genomic regions and their functional effects are mostly unknown, such datasets have the potential to provide important information about a variant’s functional role: for example, whether the variant has a gene regulatory effect, in which cell or tissue type it has that effect, and whether the effect is related to disease. However, the methods and analyses currently used in the field are unable to extract this information from existing data, so critical gaps remain in connecting variants to diseases. The overall goal of the PI’s research program is to develop the new statistical methods and analyses needed to leverage functional genomics data, in conjunction with GWAS data, to understand variants’ functional effects and their roles in diseases. This goal will be achieved by advancing three key areas: (i) Identification of response expression quantitative trait loci (eQTLs), which are genetic variants that are associated with gene expression only under certain conditions.
A powerful response QTL mapping pipeline will be established and used to study the properties of response QTLs and their relevance to diseases. (ii) Identification of disease-critical cell states using single-cell chromatin accessibility profiling data. Single-cell chromatin accessibility profiling data provide a high-resolution view of cellular regulatory landscapes; novel methods will be established to assess the relevance of these different cellular states to diseases. (iii) Identification of the effect context for individual causal variants. A variant may affect a disease through one or a few cell or tissue types relevant to that disease, but which ones is often unknown. Work in this third research area will establish a statistical model that leverages multiple types of functional genomics datasets to address this question. The PI’s work in these areas will yield critical insights into the effects of genetic variation and into disease etiology. New approaches and open-source tools for studying common genetic variants and disease genetics will be established. These tools are greatly needed by the research community to make full and effective use of the fast-accumulating functional genomics and GWAS datasets.