ABSTRACT
Alzheimer's disease (AD) is the most common form of dementia, affecting more than six million people in the United
States. The complex genetic risk and the lack of disease-specific biomarkers for AD are among the greatest
challenges that investigators and clinicians face in early prediction, diagnosis, prevention, and intervention.
There is an urgent need for early identification of individuals at higher risk before the onset of symptoms. With
the rapid accumulation of genetic data, researchers have developed high-performance genetic models to predict
complex diseases including AD. For example, the polygenic risk score (PRS), which estimates individual genetic
liability by integrating large GWAS summary statistics with individual genotype data, has shown potential
value for predicting diseases such as AD. Recent advances in artificial intelligence (AI), coupled with machine learning (ML)
techniques, have been shown to yield meaningful insights when applied to “Big Data.” Convolutional neural
network (CNN), a machine learning algorithm widely used in image and object classification, has produced
informative results in medical image analysis. However, the application of CNNs to non-
image data such as genetic data remains limited. Recently, our group developed an artificial image objects (AIOs)
method to transform tabular data into images. Uniquely, our AIO technique allows us not only to adapt CNN
algorithms to classify disease but also to identify biomarkers associated with the disease. Our preliminary results are
encouraging: 1) CNN with single nucleotide variant (SNV)-transformed AIOs improves disease classification in
schizophrenia; 2) CNN with RNA-seq data-transformed AIOs facilitates biomarker discovery in breast cancer;
3) CNN with PRS-transformed AIOs from multiple genetically correlated traits outperforms the conventional
logistic regression model with PRSs from AD alone in AD prediction. We hypothesize that
CNN models with PRSs from multiple genetically correlated traits (mtPRS) can improve AD classification and identify
biomarkers for early prediction as well as therapeutic targets. To test this hypothesis, we propose the following aims:
Aim 1: To build and validate a prediction model for AD classification using CNN algorithms and mtPRS-
transformed AIOs. Aim 2: To identify and validate biomarkers specific to AD by integrating multi-omics data
and CNN algorithms. The approach is innovative in that we are the first to transform PRS and SNV genetic
data into AIOs and apply AI/CNN for AD classification and biomarker identification. We are also the first to
integrate PRSs from multiple comorbid traits for AD prediction. The application is significant because it will
promote AD prevention through a high-performance prediction model that identifies high-risk individuals at an
earlier stage and will identify disease-specific biomarkers for drug discovery. The overall objectives of this R15 AREA
grant are to 1) develop a CNN model to identify individuals at higher risk for AD; 2) identify biomarkers for
future drug development; and 3) accelerate research activities at UNLV through collaboration among faculty members
and students that will enhance career development for both students and investigators.