Novel Statistical Methods for Multi-omics Data Integration in Alzheimer's Disease
Principal Investigators: Chong Wu, Ph.D. (contact); Jonathan Bradley, Ph.D.
Summary
A fundamental public health need is to understand the genetic basis of Alzheimer’s disease (AD) to enable
enhanced screening and preventive therapies. To date, genome-wide association studies (GWAS) have
identified more than 30 late-onset AD risk loci. These risk loci explain only a modest proportion of the
late-onset AD heritability, which motivates the development of transcriptome-wide association studies (TWASs).
TWASs test for associations between predicted gene expression and a trait by leveraging individual-level gene
expression reference data, and they have successfully enhanced the discovery of genetic risk loci for many complex
traits, including late-onset AD. Even though many TWAS-related methods have been proposed and investigated, critical gaps
remain. First, existing TWAS methods predominantly require individual-level expression reference panels, which can be
limited in sample size and create a challenge when developing prediction models. Second, while many
expression prediction models have been built, no statistical method has been proposed to build a new
expression prediction model that combines existing expression prediction models. These critical gaps in
knowledge limit the statistical power of TWAS to detect gene-trait associations, which is ultimately needed
to gain potentially transformative insight into the genetic basis of AD. In response to PAS-19-391, this project’s
overall objective is to develop statistical methods and software for improving the power of TWAS and offering
biological insights into AD. Our central hypothesis is that the power of TWAS can be maximized either by using eQTL
summary data with much larger sample sizes as the expression reference panel or by leveraging existing
expression prediction models. To test our central hypothesis, we will 1) develop expression prediction models
by leveraging eQTL summary data, 2) develop expression models by integrating existing expression prediction
models, and 3) develop open-source, cross-platform, publicly available, easy-to-use software to implement the
proposed methods. The proposed work will not only enhance the power of the widely used TWAS approach
but also offer biological insights into AD pathology. The proposed new methods can also be applied
to other complex diseases.