Project Summary/Abstract
Deep learning (DL) has been widely applied across the life sciences to construct predictive models. However, it
relies on the assumption that training samples are independent and identically distributed. This assumption is
frequently violated in practice, where data are grouped by measurements from the same sample (patient, cell,
tissue), by the same observer, or at the same site. This leads to clusters of correlated data (random effects), and
when models are fit to such data, parameter estimates can be severely biased, leading to type I and type II
errors. Proper accounting for such dependencies in DL models remains unsolved. The objective of this proposal
is to develop DL modifications that separately model global fixed effects and cluster-specific random effects,
increasing model interpretability and performance for precise, unbiased predictions related to human disease.
Our proposal is based on a novel, model-agnostic framework to transform conventional DL models into proper
mixed effects DL (MEDL) models. This affords the capabilities of statistical linear mixed effects models, including
the separation of cluster-invariant fixed effects from cluster-specific random effects, while preserving the ability
of DL to learn data-driven nonlinear associations. The core premise is that proper MEDL models 1) are more
resilient to confounding effects and more attentive to true predictive features, 2) can capture, quantify, and
visualize random effects to enhance interpretability, and 3) attain better generalization to new clusters. We
propose to incorporate MEDL into three of the most important DL model types: dense feed-forward
neural networks (DFNNs), convolutional neural networks (CNNs), and autoencoders. Our preliminary results
demonstrate multiple advantages of MEDL over conventional DL in both accuracy and interpretability. MEDL
outperforms previous clustered-data approaches, including domain-adversarial models, meta-learning, and the
inclusion of cluster membership as an input covariate. We developed an ME-DFNN to predict conversion from
mild cognitive impairment to Alzheimer’s Disease (AD) from tabular data, an ME-CNN to diagnose AD from MRI,
and an ME-autoencoder to compress and classify live-cell images. Across these test cases, MEDL models best
discriminated between known confounders and true predictive features, quantified or visualized the random
effects, and outperformed other models on clusters both seen and unseen during training. This proposal
further develops the methods to handle complex architectures and hierarchical effects, with external validation,
through these aims: 1) Develop ME-DFNNs for classification and regression. 2) Develop 3D ME-CNNs and multi-
modal 3D ME-CNNs for medical image classification. 3) Develop convolutional and vector ME-autoencoders for
image and omics data. We describe the innovative incorporation of an adversarial classifier to constrain the base
model to learn fixed effects, a Bayesian random effects subnetwork, and an approach to apply random effects
to unseen clusters. All these solutions will be released as open-source software that improves existing DL models
to ultimately support precision biomedicine for the study and treatment of human disease.
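The separation of cluster-invariant fixed effects from cluster-specific random effects at the heart of MEDL can be illustrated with a minimal sketch. This is not the proposal's method (which uses adversarial classifiers and a Bayesian random-effects subnetwork); it is a toy alternating estimator on synthetic clustered data, with all names and parameters hypothetical:

```python
# Illustrative sketch only: separating a shared (fixed-effect) slope from
# cluster-specific random intercepts on synthetic grouped data.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic clustered data: y = w_true * x + u_true[cluster] + noise
n_clusters, n_per = 5, 200
w_true = 2.0
u_true = rng.normal(0.0, 1.5, n_clusters)           # random intercepts per cluster
cluster = np.repeat(np.arange(n_clusters), n_per)   # cluster membership labels
x = rng.normal(size=n_clusters * n_per)
y = w_true * x + u_true[cluster] + 0.1 * rng.normal(size=x.size)

# Alternating estimation: fit the shared slope w with cluster effects
# removed, then re-estimate each cluster's intercept from the residuals.
w, u = 0.0, np.zeros(n_clusters)
for _ in range(50):
    resid = y - u[cluster]
    w = (x @ resid) / (x @ x)                        # OLS for the fixed-effect slope
    resid = y - w * x
    for c in range(n_clusters):
        u[c] = resid[cluster == c].mean()            # mean residual per cluster

print(w)  # shared slope, disentangled from the cluster intercepts
```

Ignoring the cluster structure and pooling all samples would fold the intercepts u into the noise, biasing inference; the alternating step recovers a slope close to w_true and intercepts close to u_true, mirroring (in miniature) what the proposed MEDL subnetworks do for nonlinear DL models.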