Project Summary
Novel statistical and computational tools have enabled the broad adoption of genomics technologies and served
as the foundation for the modern age of human genomics. Indeed, the recent advent and popularity of single cell
genomics platforms, most notably single cell RNA sequencing (scRNA-seq), have led to a proliferation of single-
cell data processing, QC, and analysis frameworks. However, to fully realize the promise of single cell genomics
approaches, we need to connect cell-type level regulatory phenotypes with complex disease. The most promising
approach to this challenge is to identify functionally relevant genetic variation by mapping quantitative trait loci
(QTLs). Indeed, studies identifying regulatory variation associated with a number of regulatory phenotypes,
including but not limited to gene expression (eQTLs), DNA methylation (meQTLs), chromatin accessibility
(caQTLs), and protein (pQTLs), have been carried out extensively in bulk samples. These studies have
advanced our understanding of the molecular underpinnings of complex disease, but the lack of granularity
provided by bulk analyses continues to hinder progress. As we move these approaches towards the cell-type
level analyses enabled by single cell genomics, it has become clear that the methods developed for bulk samples
are not well suited to handle the complexity and specific characteristics of single cell data, or indeed to take full
advantage of its resolution and richness. Here we propose developing, validating, and
deploying methods for mapping QTLs using data from single cell 'omics technologies. We believe it is critically
important to build these methods using relevant data obtained from primary human tissue in a disease state.
Thus, we will jointly collect scRNA-seq, scATAC-seq, and single cell protein data from two tissue types, lung
and peripheral blood, obtained from patients with pulmonary fibrosis (PF) and from healthy controls. This data collection
will be facilitated by our existing biorepository and build upon our expertise building tools for analyzing genomic
data, mapping QTLs for regulatory phenotypes, and analyzing scRNA-seq data collected from lung tissue from
patients with PF. Using these data, we will build methods for univariate single cell QTL (scQTL) mapping, multi-omic scQTL
mapping, the identification of context-specific scQTLs, and the integration of scQTL results with results from
GWAS. These methods will be released as open-source software packages, enabling broad adoption by
the field. Our team brings together unique expertise in statistical genomics, computational biology, functional
genomics, and single cell genomics, along with disease-specific expertise in PF, making us particularly well suited to carry
out this work.