Project Summary/Abstract
About 2% of the human genome encodes proteins, and the vast majority of cellular RNA is non-coding
(ncRNA). Mounting evidence indicates that ncRNA can fold into complex structures and play critical roles
in cellular physiology and disease. After depletion of abundant ncRNA such as transfer RNA and
ribosomal RNA, many cellular ncRNAs were found to be 3’ modified with a polyA tail (pA ncRNA), similar to
messenger RNA. However, a significant portion of cellular ncRNA was found to lack the polyA tail (non-pA
ncRNA) or is considered to be bimorphic (existing in both pA and non-pA forms). The sequence, structure and
biological function of non-pA ncRNA and bimorphic ncRNA remain largely unknown, and technical limitations
are the main roadblocks preventing scientists from exploring this largely uncharted fraction of the human transcriptome.
In this proposal, Dr. Shaw aims to develop a novel nanopore sequencing technique that enables the direct
sequencing and secondary structure detection of full-length non-pA ncRNA. This technique will utilize novel
bacterial and eukaryotic reverse transcriptases (RTs) to capture native non-pA ncRNA, and discretely thread
the captured RNA through a Mycobacterium smegmatis porin A (MspA) nanopore sequencer for concurrent
RNA sequence and RNA secondary structure detection. The development and application of this technique
will further extend the current RNA sequencing toolbox and bring scientists one step closer to fully
understanding the function of the human transcriptome. The specific aims of this proposal are: (Aim 1) Dr.
Shaw will establish the experimental framework for single-molecule characterization of RTs and
nanopore sequencing of RNA, along with the computational tools needed to accurately convert nanopore ion
current into RNA sequence. (Aim 2) Dr. Shaw will screen bacterial and eukaryotic RTs in search
of the most robust RT for ratcheting RNA through the nanopore with minimal ratcheting defects and optimal
sensitivity to RNA secondary structures. (Aim 3) In his independent research phase, Dr. Shaw will first
validate the robustness of his novel technique by sequencing RNA mixtures with well-defined sequences and
secondary structures. He will then apply his technique to profile the sequence and structure of non-pA ncRNA
extracted from the HeLa-S3 cell line. Finally, he will further adapt his technique to be compatible with chemical
methods that can directly probe native RNA secondary structures in vivo, such as dimethyl sulfate
sequencing. During the K99 career development stage, Dr. Shaw will conduct research with the mentorship
and support of Dr. Carlos Bustamante (single-molecule studies of molecular motors), Dr. Susan Marqusee
(single-molecule biophysics), Dr. Kathleen Collins (reverse transcriptase engineering), and Dr. Jens
Gundlach (nanopore sequencing). This multidisciplinary group of experienced advisors and the outstanding
scientific milieu of UC Berkeley will provide Dr. Shaw with the comprehensive training needed to achieve all
aims of the proposal and to establish his independent research career as a principal investigator.