Alzheimer's disease (AD) is a progressive neurological disorder that affects millions of people worldwide. While
significant efforts have been made to develop treatments targeting specific neuropathological conditions, a considerable
number of individuals maintain normal cognitive function despite the presence of these AD-related neuropathological
conditions. This suggests the existence of "cognitive resilience" factors that preserve cognitive function independently of this pathology.
Meanwhile, perturbation screening techniques (such as CROP-seq and Perturb-seq) have made it possible to validate
molecular function through high-throughput perturbation profiling. More importantly, such resources provide opportunities
for in silico modeling of the synergistic effects of unseen combinations, which will exponentially expand the space of
perturbation effects. However, the high dimensionality of single-cell features requires advanced computational
approaches to efficiently extract biologically meaningful information. Additionally, batch effects and technical variability
can introduce confounding factors during data integration. Conventional statistical models and machine learning
methods fall short in handling this complexity, while current deep learning models often lack biological interpretability.
To bridge these gaps, we propose to uncover cognitive resilience factors by developing an interpretable
generative transcriptional program (iGTP) model to investigate the genes and pathways associated with cognitive resilience and
identify potential, highly correlated synergistic perturbations, offering a new approach to developing therapies for the
prevention and early clinical intervention of AD.
We propose two specific aims to fulfill these goals. Aim 1: To engineer a deep learning framework that constructs
embedding layers with biologically interpretable transcriptional programs (TPs) to model cognitive resilience factors.
First, we will leverage multimodal (genetic, transcriptomic, and clinical) data to identify "highly resilient" individuals
and explore the molecular signatures of resilience groups. Then, we will construct our iGTP model to
correct batch effects and project cells into a latent space (Z) composed of pre-defined TPs, where biological
relevance is reflected by the weights in each TP dimension. Aim 2: To model gene perturbation in the unified TP
space and predict synergistic effects that may enhance cognitive resilience. First, we will use a state-of-the-art
bioinformatics pipeline to quantify perturbation profiles from single- and multiple-sgRNA perturbations.
Next, our iGTP framework will harmonize the perturbed cells into the same unified embedding space (Z). Our iGTP
model will then predict synergistic effects from single perturbations and validate these predictions against the
corresponding multiple-sgRNA perturbations. Lastly, we will predict perturbation combinations that potentially enhance the resilience factors.
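As a purely illustrative sketch of the two ideas above, and not the proposed iGTP implementation, the following Python code shows (a) a decoder in which each latent dimension is restricted to the genes of one pre-defined TP, and (b) an additive baseline in latent space against which a measured double perturbation can be compared. The linear decoder, the additive assumption, and all function and variable names are hypothetical choices made for this example.

```python
import numpy as np

def masked_decode(z, weights, mask):
    """Map TP activities z (cells x TPs) back to gene space (cells x genes).

    mask[g, k] is 1 only if gene g belongs to TP k (assumed binary membership),
    so each latent dimension can influence only the genes of its own program.
    """
    return z @ (weights * mask).T

def additive_combination(z_ctrl, z_a, z_b):
    """Additive baseline for a double perturbation: control mean plus both single shifts."""
    base = z_ctrl.mean(axis=0)
    shift_a = z_a.mean(axis=0) - base
    shift_b = z_b.mean(axis=0) - base
    return base + shift_a + shift_b

def synergy(z_ctrl, z_a, z_b, z_ab):
    """Per-TP deviation of the observed double perturbation from the additive baseline."""
    return z_ab.mean(axis=0) - additive_combination(z_ctrl, z_a, z_b)
```

Under these assumptions, a nonzero entry in the output of `synergy` marks a TP dimension where the combined perturbation departs from the sum of the single-perturbation effects; a learned, nonlinear model could replace the additive baseline without changing this comparison.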
The successful completion of our project will provide (1) an interpretable model that facilitates the exploration of AD
resilience along biologically meaningful dimensions and supports in silico assessment of counterfactual predictions
linking a desired shift in cellular state (such as cognitive resilience) to the synergistic effects of multiple perturbations, and
(2) immediate implications for the AD research community, with the potential to generalize to other diseases.