Flexible multivariate models for linking multi-scale connectome and genome data in Alzheimer's disease and related disorders

Project Summary/Abstract

In the field of Alzheimer's disease and related dementias (ADRD), multimodal data collection is common, yet most approaches still analyze each modality separately and combine the results afterward. Far less work has addressed methods that jointly decompose multimodal neuroimaging and genomic data, despite their considerable promise. Most imaging genomics studies have relied either on candidate gene approaches or on univariate methods to link neuroimaging and genomic data (e.g., polygenic risk scores), and even genome-wide studies still depend largely on massive univariate analyses. Multivariate approaches provide a powerful tool for analyzing the data in the context of genomic and connectomic networks (i.e., weighted combinations of voxels and genetic variables). Imaging and genomic data are high dimensional and include complex relationships that are poorly understood. Our prior work has shown that incorporating multimodal models that perform joint (i.e., symmetric) data fusion can reveal otherwise hidden information and greatly increase sensitivity to ADRD. However, existing multivariate data fusion models suffer from three key limitations. First, most models assume a single subspace, i.e., they do not explicitly allow for coupled sets of multimodal or unimodal components. Second, most models require the data dimensionality to be matched (i.e., the spatial or temporal scales must be similar), which greatly limits our ability to combine data with 'mismatched dimensionality' (such as fMRI with sMRI or genomic data) or to perform analyses that span multiple spatial and temporal scales. Finally, existing models typically assume linear relationships, despite evidence of important nonlinearity in brain imaging and genomic data.
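To make the idea of symmetric fusion concrete, the sketch below shows one simple linear instance of a joint decomposition, canonical correlation analysis computed via SVD on toy data. This is only an illustrative example of the general technique, not the proposed models; the array sizes and feature labels are hypothetical stand-ins for imaging and genomic measures.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100                             # subjects
X = rng.standard_normal((n, 50))    # toy "imaging" features (hypothetical, e.g., connectivity edges)
Y = rng.standard_normal((n, 30))    # toy "genomic" features (hypothetical, e.g., SNP dosages)

# Center each modality.
Xc = X - X.mean(axis=0)
Yc = Y - Y.mean(axis=0)

# Whiten each block via its SVD; a further SVD of the cross-product of the
# whitened scores yields the canonical weights. Each joint component is a
# weighted combination of features from BOTH modalities, estimated
# symmetrically rather than by predicting one modality from the other.
Ux, Sx, Vxt = np.linalg.svd(Xc, full_matrices=False)
Uy, Sy, Vyt = np.linalg.svd(Yc, full_matrices=False)
Uj, Sj, Vjt = np.linalg.svd(Ux.T @ Uy)          # Sj = canonical correlations

k = 5                                            # number of joint components
Wx = Vxt.T @ np.diag(1.0 / Sx) @ Uj[:, :k]       # imaging weight vectors
Wy = Vyt.T @ np.diag(1.0 / Sy) @ Vjt.T[:, :k]    # genomic weight vectors

# Subject scores on the joint components; the per-component correlation of
# the two modalities' scores equals the corresponding singular value Sj[i].
scores_x, scores_y = Xc @ Wx, Yc @ Wy
for i in range(k):
    r = np.corrcoef(scores_x[:, i], scores_y[:, i])[0, 1]
    assert np.isclose(r, Sj[i])
```

The symmetry is the point: neither modality is treated as the outcome, which is the "joint (i.e., symmetric) data fusion" contrasted above with univariate or asymmetric analyses. Note that this formulation already exhibits the limitations described in the text: a single shared subspace, matched subject dimension, and purely linear weights.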
To address these challenges, we propose a family of approaches covering a range of flexible modeling, from linear to complex nonlinear relationships. The proposed models will also offer the ability to capture subspaces (groupings of unimodal or multimodal components), to incorporate constraints or priors, and to identify subgroups of individuals along a dimension of interest. Within this family of models, we propose a novel framework called progressive fusion, which allows us to smoothly span spatio-temporal scales while allowing for multimodal subspaces. We will apply the developed models to a large longitudinal dataset of individuals at various stages of cognitive impairment and dementia. Using follow-up outcomes, we will evaluate the predictive accuracy of a joint analysis compared to a unimodal analysis, as well as its ability to characterize various clinical etiologies, including those driven by vascular effects (e.g., subcortical ischemic vascular dementia) versus those that are more neurodegenerative. We will also evaluate the single-subject predictive power of these profiles in independent data to maximize generalization. All methods and results will be shared with the community. The combination of advanced algorithmic approaches and large-N data promises to advance our understanding of Alzheimer's disease and related disorders, in addition to providing new tools that can be widely applied to other studies of complex disease.