Transfer RNAs (tRNAs) are fundamental to life due to their vital roles in protein translation. tRNAs are needed
in large abundance for normal cellular function, which requires that tRNA genes be among the most highly transcribed
loci in the genome. Because of their rigidly defined structure and interactions with many other molecules, each
base within tRNA genes is highly conserved. Pre-tRNA transcripts also include leader and trailer sequences,
which have minimal functionality in most cases and are quickly processed out as part of the tRNA maturation
process. Nonetheless, the high levels of transcription at tRNA loci can lead to extremely high levels of sequence
variation, likely due to transcription-associated mutagenesis (TAM).
TAM at tRNA loci has important implications. tRNA genes exist in many copies throughout the genome, and
while many of these genes are constitutively transcribed, epigenomic data show that a majority appear to be
completely inactive. Variation in tRNA gene expression within and between species makes annotation of
expression essential for interpreting the potential effects of natural variants in populations. Greater understanding
of tRNA locus variation could enable prioritization of risk loci in genome-wide association studies, as variants in
active tRNA genes could have pronounced fitness consequences. However, annotation of tRNA expression
levels is difficult for many reasons, including post-transcriptional modifications that impede RNA sequencing, as
well as their high levels of redundancy at the gene level. I will develop a predictive classifier that uses only
DNA sequence data to infer and annotate tRNA gene expression across mammals.
There are strong evolutionary implications of increased tRNA transcription as well. Virtually all eukaryotic
genomes contain upwards of 200 tRNA genes. Theory predicts that duplicated genes will quickly diverge in
function and sequence, generally by neo- or sub-functionalization. However, these predictions assume low and
equal germline mutation rates among genes. Elevated mutation rates at tRNA loci violate this assumption and may
instead drive the maintenance of hundreds of functionally redundant genes. I will develop an individual-based
population genetic simulation framework, using estimates of the per-locus mutation rates at tRNA genes, as well as
their duplication and deletion rates. I will then compare simulation results to the observed distribution of human
tRNA genes to quantitatively test each component of this model.
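
As a rough illustration of the kind of individual-based simulation described above, the following minimal Python sketch tracks the number of active and inactive tRNA gene copies per individual under mutation, duplication, and deletion. It is a toy under stated assumptions, not the proposed framework: the haploid, asexual Wright-Fisher design, the threshold fitness rule (`min_active`), the function names, and all rate values are hypothetical placeholders rather than estimates from data.

```python
import random


def count_events(rng, n_copies, rate):
    """Count how many of n_copies are hit by an event with per-copy probability rate."""
    return sum(rng.random() < rate for _ in range(n_copies))


def simulate_trna_copies(pop_size=200, generations=300, init_active=10,
                         mu_inactivate=1e-3, dup_rate=1e-3, del_rate=1e-3,
                         min_active=5, seed=0):
    """Toy haploid Wright-Fisher simulation of tRNA gene copy number.

    Each individual is a tuple (active, inactive) of gene copy counts.
    All rates are per gene copy per generation and are placeholders.
    Returns the final population, or an empty list if no individual is viable.
    """
    rng = random.Random(seed)
    population = [(init_active, 0)] * pop_size
    for _ in range(generations):
        # Selection (assumed threshold model): only individuals that keep
        # at least min_active functional copies can reproduce.
        viable = [ind for ind in population if ind[0] >= min_active]
        if not viable:
            return []
        offspring = []
        for _ in range(pop_size):
            active, inactive = rng.choice(viable)
            # Inactivating mutations convert active copies to inactive ones.
            lost = count_events(rng, active, mu_inactivate)
            active, inactive = active - lost, inactive + lost
            # Duplication and deletion act on every copy, active or not.
            new_active = (active + count_events(rng, active, dup_rate)
                          - count_events(rng, active, del_rate))
            new_inactive = (inactive + count_events(rng, inactive, dup_rate)
                            - count_events(rng, inactive, del_rate))
            offspring.append((new_active, new_inactive))
        population = offspring
    return population


if __name__ == "__main__":
    final = simulate_trna_copies(seed=1)
    if final:
        mean_active = sum(a for a, _ in final) / len(final)
        print(f"mean active copies per individual: {mean_active:.2f}")
```

In a sketch like this, raising `mu_inactivate` relative to `dup_rate` and `del_rate` is the knob that would correspond to elevated mutation at highly transcribed loci. The simulated copy-number distribution could then be compared against the observed one.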
Adding further complexity, modifications to the wobble base position on mature tRNAs often alter the tRNAs’
decoding repertoire and are essential for proper translation. Differences in tRNA modification between species
may lead to differences in wobble potential, and thus change codon usage bias. For example, several closely
related Drosophila species exhibit drastic shifts in codon preference despite no changes in tRNA gene copy
number. To investigate the evolutionary influence of anticodon base pairing, I will analyze the relative
contributions of tRNA modification enzymes to codon preference shifts, using Drosophila as a model.
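
Codon preference within a synonymous family can be quantified as the fraction of in-frame codons that use each synonymous alternative. The short Python sketch below shows one simple way to compute this for a single coding sequence. The function name and the toy sequence are hypothetical, and a real analysis would aggregate over many genes per species.

```python
from collections import Counter


def codon_fractions(cds, synonymous_codons):
    """Return the fraction of each synonymous codon among the in-frame codons of cds.

    Codons outside synonymous_codons are ignored; trailing bases that do not
    form a complete codon are dropped.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in synonymous_codons)
    total = sum(counts.values())
    if total == 0:
        # No codon from this family occurs in the sequence.
        return {c: 0.0 for c in synonymous_codons}
    return {c: counts[c] / total for c in synonymous_codons}


if __name__ == "__main__":
    # Made-up sequence for illustration only; GCT/GCC/GCA/GCG all encode alanine.
    alanine = ["GCT", "GCC", "GCA", "GCG"]
    print(codon_fractions("GCTGCCGCTAAA", alanine))
```

Comparing these fractions between species for the same amino acid gives a direct readout of a shift in codon preference.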