Project Summary
Aberrant RNA modifications, especially methylations and pseudouridylations, have been correlated with major
diseases such as breast cancer, type 2 diabetes, and obesity, each of which affects millions of Americans. Despite
their significance, the available tools to reliably identify, locate, and quantify RNA modifications are very limited.
As a result, the functions of only a few of the more than 100 identified RNA modifications are known.
Mass spectrometry (MS) is an essential tool for studying protein modifications, where
peptide fragmentation produces “ladders” that reveal the identity and position of modifications. However, a similar
approach is not yet feasible for RNA because in situ fragmentation techniques that provide satisfactory sequence
coverage do not exist. One way to circumvent this issue is to chemically degrade the RNA beforehand so that well-defined
mass ladders are formed before the sample enters the spectrometer. However, the structural uniformity of
ladder sequences generated by the prerequisite RNA degradation is unsatisfactory, complicating downstream
data analysis. We have spearheaded the development of a two-dimensional LC/MS-based de novo RNA
sequencing tool by taking advantage of predictable regularities in LC separation of optimized RNA digests to
greatly simplify the interpretation of complex MS data. This method can simultaneously sequence up to three
distinct RNAs of up to 30 nucleotides, as well as identify, locate, and quantify a broad spectrum of modifications
in the RNA sample. We hypothesize that this MS-based RNA sequencing method could be further optimized to
become a robust, easy-to-use, and broadly applicable de novo sequencing approach, and that such a platform
would be a highly useful and innovative tool that can complement existing next-generation RNA sequencing
protocols for in-depth functional study of chemical modifications carried by endogenous RNAs. In this application,
we propose to (a) reduce the RNA loading amount to a minimum threshold at which de novo sequencing of
endogenous RNAs becomes practicable (Aim 1), (b) develop a streamlined data analysis/sequence generation
algorithm that will enhance the robustness of our sequencing method (Aim 2), and (c) provide proof-of-concept
examples of the method’s usage in de novo sequencing of endogenous RNA samples (Aim 3). The proposed
work is significant because it will bring the power of MS-based laddering technology to RNA, thus providing a
method, comparable to the analysis of peptide modifications in proteomics, that can reveal the identity and position
of various RNA modifications. This project is highly innovative because successful completion of the proposed
work will 1) allow the MS-based platform to routinely and automatically sequence cellular RNA in a de novo
fashion, 2) broaden its utility across a wide range of applications from research to biotech industries, and 3)
eliminate the need for complementary DNA strand synthesis and permit the establishment of a complete,
unambiguous spatiotemporal and quantitative profile for a wide variety of structural modifications in RNA
samples.