Computational Population Genetics - Project Summary
The revolution in genome sequencing technologies over the past 15 years has produced an explosion of population genomic data, but it has left a gap in our ability to make sense of data at this scale. This is in part because traditional population genetic models do not reflect the genomic and geographic processes that produce the tremendously diverse data now being collected. To capitalize on this flood of information we need new methods and modes of analysis. In recent years our group has made great strides in using supervised machine learning for population genomic analysis (reviewed in part in Schrider and Kern 2018). In particular, we have pioneered the development and application of deep learning techniques for a wide variety of tasks, including detecting selection (Kern and Schrider 2018; Xue et al. 2021), localizing introgression tracts in the genome (Schrider et al. 2018), characterizing the landscape of recombination (Adrion et al. 2020), predicting geographic origin (Battey et al. 2020), and visualizing population genetic data (Battey et al. 2021). A particular focus of our efforts during the last funding period has been understanding the evolution of Anopheles gambiae in response to vector control efforts underway in sub-Saharan Africa; we have therefore developed, and continue to develop, statistical methods with these important data in mind. Our work on Anopheles has reinforced for us the importance of studying spatial variation, particularly in the context of adaptation. In this proposal we build upon ideas developed during the previous funding period and discuss three facets of our ongoing research program.
The proposal has three sections. 1) We will continue our work on spatial population genetics, developing methods for inferring dispersal parameters directly from population genomic data and improving our understanding of the ways in which spatial structure can impact GWAS and related techniques. 2) We will develop methods to further characterize the population genomics of adaptation. Here we are particularly interested in deep learning methods that discover beneficial alleles by accounting for the geographic spread of an allele relative to its surrounding pattern of genomic variation. In addition, we will develop methods for detecting selection that build upon recent improvements in our ability to infer population-scale genealogies (Kelleher et al. 2019; Speidel et al. 2019). 3) Finally, we propose new avenues of development for a community resource that our group has been leading, the stdpopsim project, which aims to provide an open-source, highly reproducible, and accessible means of performing population genetic simulation in a number of common study systems. We outline plans for more realistic simulation of genomes under selection, with a particular focus on implementing previously published estimates of selective parameters. Moreover, we will use the stdpopsim library to benchmark commonly used methods for demographic and selective inference.