PROJECT SUMMARY/ABSTRACT
Centromeres and pseudoautosomal regions (PARs) are highly specialized chromatin domains that are
essential for proper chromosome segregation. Centromeres provide chromosomal points of attachment to the
cellular segregation machinery, linking chromosomes to the proteins that pull them to the cell poles during both
somatic and germline cell divisions. The PAR is a region of conserved sequence identity between the X and Y
chromosomes over which the meiotic program of pairing, synapsis, and recombination unfolds to ensure
correct sex chromosome segregation. Mutations that disrupt centromere integrity or reduce homology between
X- and Y-linked PARs can lead to chromosome segregation errors and constitute important genetic
mechanisms for cancer, cellular senescence, and infertility. Despite their fundamental significance for
chromosome transmission and genome stability, little is known about the levels and patterns of genetic
diversity across centromeres and the PAR or the biological impacts of this variation. The repetitive sequence
content of these regions poses a major barrier to their molecular analysis, and the PAR and centromeres
remain unassembled or incompletely assembled in many of the highest-quality reference genomes. My group
has recently developed experimental and bioinformatic tools that will allow us to catalog variation across the
PAR and centromeres, setting the stage for subsequent investigations into the functional consequences of
genetic variation across these loci. Over the next five years, we will combine these analytical tools with diverse
mouse models, cytogenetic investigations of chromosomes, and evolutionary analyses to address three critical
questions. First, what is the extent of DNA sequence variation across these chromatin domains? We
will combine targeted long-read sequencing, re-analysis of genomic data in public archives, and analyses of
the frequency of specific nucleotide “words” in collections of shotgun-sequenced reads to catalog PAR and
centromere diversity in a mammalian model system, including variation in size, genomic architecture,
nucleotide sequence, and repeat content. Second, how do allelic differences in PAR and centromere
sequences impact their intrinsic chromatin-dependent functions in chromosome segregation and
fertility? We will test explicit hypotheses about how variation at the PAR and centromeres influences fertility
and biases chromosome transmission to quantify relationships between DNA sequence diversity and function.
Third, what mechanisms safeguard the chromatin-based functions of these loci in the face of their
rapid sequence-level evolution? We will explore possible resolutions to this perplexing duality by elucidating
how naïve DNA sequence acquires chromatin-dependent functions using mouse models with spontaneous
PAR expansions. Overall, the success of this project will significantly advance our understanding of diversity,
evolution, and function at two loci with critical biological roles in chromosome segregation that arise not from
products of their DNA sequence, but rather from the intrinsic properties of their chromatin.