Enhancing PALM with novel statistical methods, cloud integration, parallelization, and imaging standards
PROJECT SUMMARY
Statistics is a staple of all biomedical research. Yet most existing statistical methods and software packages do not readily extend to datasets of the unprecedented size and level of detail now being collected and made publicly available by the BRAIN Initiative and by the NIH Blueprint for Neuroscience Research. These datasets include the Adolescent Brain Cognitive Development (ABCD) Study and the Human Connectome Project (HCP), among others. Dense phenotyping in these studies creates many opportunities to generate and test hypotheses about brain structure and function in health and disease. However, flexible statistical methods that accommodate large-scale inference are scarce, and suitable implementations are often lacking even when such methods exist. Both gaps hinder our ability to integrate data across the different domains, scales, and representations used in these studies. This project will enhance the free and open-source analysis tool PALM (Permutation Analysis of Linear Models), allowing researchers to explore large volumes of data using optimal and valid statistical methods, with minimal assumptions, and with a fast implementation that makes rigorous permutation tests computationally accessible. PALM already offers several important approaches and features not available in any other software, and has become a popular tool in brain imaging.
The goals of this project are to: (1) develop and implement a general approach for analyzing repeated-measures (longitudinal) and genetic (familial) data using permutation tests, with an emphasis on BRAIN Initiative data; (2) develop and implement novel test statistics that are sensitive to spatial and temporal processes across the brain, even for data measured in different domains, scales, and representations; (3) expand PALM’s multivariate capabilities to uncover statistically independent latent factors from multiple dimensions (e.g., cellular, behavioral, genetic, or imaging) that can span multiple disease categories; (4) disseminate existing and novel statistical methods in fast and efficient software that uses parallelization when multithreading, graphics processing units (GPUs), and/or high-performance computing (HPC) systems are available; (5) ensure that PALM is compatible with recently developed data models and standards used by the BRAIN Initiative; and (6) ensure that PALM can be integrated with cloud repositories that host large datasets. The development of novel theory and algorithms will take into account computational efficiency, standardized data models, and the hardware architectures available.