Integrating Genetic and Environmental Data to Predict Autism Susceptibility and Heterogeneity in the SPARK Cohort

Abstract

Although autism is known to be highly heritable, the lack of 100% concordance in monozygotic twins and the higher concordance in dizygotic twins than in full siblings offer strong evidence for environmental susceptibility factors. There is accumulating evidence for several non-genetic factors, including air pollution, pesticides, heavy metals and pregnancy-related complications. However, studies of environmental susceptibility have been limited by modest sample sizes, indirect measurement and incomplete characterization of exposures, and the lack of a multi-dimensional approach that accounts for the interplay between genetic and environmental factors that affect the probability of autism as well as its heterogeneous trajectories. The infrastructure already in place in SPARK, a large, US-based cohort of >150,000 recontactable people with autism, offers a unique opportunity to implement a large-scale integrative study design to identify prenatal and early life exposures that affect autism and to determine how these exposures interact with genetic risk. Our long-term goal is to identify exposures that influence autism and to understand the molecular mechanisms by which these exposures affect the genome, epigenome, and metabolome. The objective of this proposal is to characterize prenatal and early life exposures in >20,000 children with autism and to identify exposures that affect the probability of autism, developmental trajectories in autism, and response to behavioral intervention. We will also investigate the interaction between environmental exposures and genomic risk. In our first aim, we will use multivariate regression and machine learning methods to identify geospatial exposures associated with social communication abilities and response to educational/behavioral intervention in thousands of individuals with autism.
In the second aim, we will evaluate gene-environment interactions using the geospatial exposome and the entire distribution of autism genetic risk variants, including common and rare variants, in thousands of individuals with autism in SPARK. In the third aim, we will directly measure exogenous exposures and endogenous biological responses in the perinatal period by performing untargeted high-resolution exposomics on residual newborn blood spots from a subset of the SPARK cohort. We will also perform long-read DNA sequencing to assess DNA methylation signatures associated with environmental exposures. Ultimately, our goal is to build a model that can help clinicians and families assess the genetic and non-genetic probability of autism and predict response to treatment, to maximize the potential of individuals with autism. The findings will be significant because this will be the first large-scale investigation to comprehensively evaluate the interplay between exposomic and genomic risk factors in autism, and it will yield novel insights into the mechanisms that drive risk and resilience in this complex condition.

Figure 1: Study overview. Residential history will be collected for 20,000 children with autism and siblings born after 2000. We will investigate environmental interactions with the genome, epigenome and metabolome.