Population genomics in laboratory and outbred mouse populations - PROJECT SUMMARY
The level and patterning of genetic diversity within populations reflect the interplay of mutation, recombination, natural selection, and demographic history. Knowledge of the intensity, frequency, distribution, and genetic regulation of these elemental evolutionary processes is therefore essential for understanding the evolutionary origins of disease-associated variation and predicting the evolutionary fate of genomes. This research program will yield new basic biological insight into the evolutionary mechanisms that give rise to genomic variation, with a particular focus on repeat-rich genomic loci. Our work will draw on the strengths of the mouse model system, including its translational relevance to humans, genetically diverse inbred and outbred strain resources, large whole genome sequence datasets from pedigreed populations, tools for functional discovery, and museum-archived samples from wild-caught mice. The proposed program is organized into three major research areas. The first research focus will pursue in-depth investigations of repeat-rich functional chromatin domains responsible for chromosome segregation, genome stability, and fertility. Using state-of-the-art long-read sequencing methods, we will generate high-quality de novo assemblies for diverse inbred mouse strains and pursue focused comparative genomic studies of two highly repetitive genomic loci: the Y chromosome and centromeres. We will complement these genomic investigations with hypothesis-driven studies to assess the functional consequences of diversity at these loci for male fertility and the fidelity of chromosome segregation, respectively. Second, we will retrospectively mine laboratory mouse reference mapping populations as controlled evolution experiments.
By monitoring the flow of genetic information over multiple generations, we aim to characterize variation in germline mutation rates, map mutation rate modifiers, and identify phenotypic correlates of mutation rate heterogeneity. Third, we will lead the development and analysis of a large-scale whole genome sequence resource comprising ~1000 wild mouse genomes. Recognizing that deleterious variants are maintained at low frequency in the wild by the action of natural selection, we will identify variants in inbred mouse strains that are individually rare in wild mouse populations, and use allele frequencies in wild mice as a criterion for prioritizing likely deleterious causal alleles within legacy mouse QTL mapping datasets. Together, these investigations will make substantial inroads into our understanding of the mechanisms of genome diversity at structurally complex functional chromatin domains and the genetic and environmental controls on mutation rate. In addition, this program will create a new genome sequence resource that will empower biomedical and basic discovery in the mouse model system and solidify the PI’s leadership in the mouse genomics community. Crucially, the proposed program will also create meaningful hands-on research training opportunities in population and evolutionary genomics, computational biology, and mouse genetics at the undergraduate, postbaccalaureate, graduate, and postdoctoral levels.