PROJECT SUMMARY/ABSTRACT
Mammalian cells expend large amounts of energy on generating enzyme-mediated RNA chemical
modifications that can change base pairing, RNA structure, or the recruitment of RNA-binding proteins, among
other elusive roles. Pseudouridine (Ψ)-modified mRNAs are more thermodynamically stable, more resistant to
RNase-mediated degradation, and have the potential to modulate immunogenicity and enhance translation in
vivo. However, Ψ detection is extremely challenging: Ψ modifications do not affect Watson-Crick base pairing
and are indistinguishable from uridine when using hybridization-based methods. Further, since Ψ is an isomer
of uridine, detection using mass spectrometry requires non-quantitative chemical derivatization methods. While
recent studies have shown that RNA modifications can be detected through direct RNA nanopore sequencing
by monitoring basecalling errors, we have recently shown that the accuracy and fidelity of this approach are
relatively low and sequence dependent. Our team has recently used a ligation approach to produce synthetic
mRNA controls that contain single Ψ sites within relevant transcripts from mammalian cells. Using these synthetic
controls we performed nanopore-based RNA sequencing and developed computational tools that increase the
accuracy of Ψ-calling to 90+%, depending on the specific sequence. We are basing our work on our recent
finding that achieving Ψ quantification requires sequence-specific training using unique signal parameters. The
initial success of our team has laid the foundation to 1) generate an expanded set of barcoded synthetic RNA
constructs that contain single Ψ sites, 2) obtain a rigorous set of quadruplicate nanopore runs with ~50,000
single-molecule reads per construct, and 3) develop computational tools that allow highly accurate sequence-specific
Ψ-calling. We will develop a gold-standard set of synthetic mRNA transcripts as a molecular training set for
quantitative Ψ profiling in direct RNA nanopore sequencing of human transcriptomes. The molecular set will
allow quantitative profiling of hundreds of putative Ψ sites across mammalian samples.
This proposal will serve an unmet need by addressing a critical bottleneck: the lack of available RNA
modification gold standards, i.e., RNA molecules that contain a site-specific and structure-specific modification.
In this collaborative project we will develop a complete pipeline for synthesis of gold-standard molecules; use
these molecules to measure the nanopore signals that Ψ modifications produce; develop a machine-learning
tool to accurately quantify these modifications; and profile site-specific Ψ modifications in various cell lines to obtain
Ψ-maps that can be used to assess relationships of Ψ modifications with phenotypes.