PROJECT ABSTRACT
Changes in gene expression are a critical force driving the evolution of form and function. As a result, the
mechanisms by which gene expression evolves have become a subject of intense interest. A number of
investigators have measured evolutionary changes in gene expression at the chromatin, mRNA, and protein
levels, leading to new insights about how species evolve. However, studies of different stages in gene expression
have not always been in agreement. For example, the extremely rapid rates of evolutionary change that have
been widely observed at enhancers appear to conflict with the slower rates observed for mRNA abundance. We
recently completed a major comparative study in primates showing that this disparity between enhancer and
mRNA evolution reflects, in part, extensive compensation among enhancers that jointly determine transcription at
target genes. Likewise, related findings have demonstrated that post-transcriptional changes buffer protein
abundance against the relatively more common differences in mRNA expression. Together, these recent findings
demonstrate that evolutionary changes affecting multiple stages of transcriptional regulation often have
interdependent effects on gene expression.
Here we propose to determine how the early stages of transcriptional regulation work in concert, either
to conserve gene expression through compensatory changes across stages or, in rare cases, to change mRNA
abundance in ways that alter organismal phenotypes. Our central hypothesis is that interactions between stages are common,
especially over long evolutionary timescales. To test this hypothesis, we propose an ambitious plan to collect rich
genomic data profiling distinct stages of gene expression in two cell types and three tissues from nine mammalian
species. We have focused our study on several early rate-limiting steps in mRNA production, using molecular
assays selected to provide orthogonal sources of information about chromatin architecture (Hi-C/Hi-ChIP),
accessibility (ATAC-seq), transcription (PRO-seq), and mRNA levels (RNA-seq). This project will produce the
largest resource of genomic data uniformly collected across mammals to date. We will integrate these data
using a suite of new computational tools, which together will reveal how distinct
stages of gene expression work in concert during regulatory evolution.