Project Summary
Early disease prevention, detection, and intervention are fundamental goals for advancing human health. Meanwhile, genetic
risk is, for all intents and purposes, the earliest significant contributor to common, heritable disease risk. Thus, in theory,
genetic profiling should be the ideal tool for early disease prevention. Yet, genetic factors are rarely used directly to predict
future disease risk. Rather, genetic information is typically relegated to phenotype-first scenarios: providing or confirming
diagnoses for individuals with overt disease or clarifying the genetic risk for individuals with a strong family history of
disease. For modern genomics to make a significant impact on disease prevention, the use of genomic information must
transition to a genotype-first approach: prediction of genetic disease risk in otherwise healthy individuals. A major barrier
to this transition is our limited ability to predict the precise array of risks and likely phenotypic expression of disease
in an individual from genetic and other risk factors. The degree of disease risk and phenotypic expression conveyed to any
single individual by genetic factors is a result of a complex interplay between direct and indirect genetic effects, other
unmodifiable risk factors (age, gender, ancestry, family history), and intermediate modifiable risk factors (environment,
behavior, laboratory values, health status, therapy status, etc.), many of which have their own direct genetic mediators. New
approaches are required to dissect this interplay in order to personalize and contextualize preventative actions that most
effectively reduce overall disease risk. The overarching goal of this proposal is the development of innovative deep learning
and machine learning approaches to integrate baseline genetic risk predictions with the measurement of traditional risk
factors in order to provide more accurate and actionable predictions of disease risk. By tying genetic risk to traditional risk
factors, especially modifiable risk factors, we will make these predictions actionable in two ways: by determining which
preventative actions may be especially effective because they offset genetic risk, and by identifying modifiable risk
factors that should be monitored and controlled proactively given increased genetic predisposition. To accomplish this goal,
we propose to develop methods to: (1) infer the likely phenotypic expressivity of monogenic risk variants via a spatial
covariance machine learning approach, (2) predict prevalent disease cases and the expected value of intermediate modifiable
risk factors from polygenic and other unmodifiable risk factors, and finally (3) predict prevalent disease cases through
interactions between baseline genetic expectations and observed (measured) intermediate modifiable risk factors in a deep
learning framework. Adjusting age and modifiable risk factors in these trained models would then allow for the interactive
projection of future disease risk and the identification of modifiable risk factors that, when manipulated, lead to the greatest
change in future disease risk. We focus on the development of methods for coronary artery disease given its public health
importance, the known utility of polygenic risk estimation, and the current evidence for polygene-by-environment
interactions. In addition, the approach we propose integrates directly with current clinical decision support tools for coronary
artery disease management. However, we will build a general framework that can be extended to any common heritable
adult-onset condition, especially those with known heritable, traditional risk factors.