Project Summary
The central dogma of molecular biology is that information is transferred from DNA to RNA to protein through
exquisitely fine-tuned enzymology (i.e. enzyme-mediated reaction steps). The first two repositories of information
(i.e. DNA and RNA) have been well characterized thanks to recent advances in DNA sequencing technology. However,
protein sequencing is intrinsically more difficult because there are 20 standard amino acids rather than four
nucleotides and because proteins cannot be replicated via complementarity. Despite these challenges, a great deal
of healthcare-relevant information is the likely reward.
The translation machinery (i.e. RNA to protein machinery) is incredibly complex and is controlled not only by the
ribosome but also by a host of other associated proteins. Assigning amino acids based on the cognate genetic code is
thought to be even more complex and possibly more error-prone than the transcription process within the cell.
For example, the aminoacyl-tRNA synthetases are particularly important for pairing amino acids with an anticodon,
and Ohuchi et al. (2007) showed that this tRNA-charging step can be readily reprogrammed using flexizymes. In brief,
whether protein sequences are modulated intentionally by the cell or altered by unintentional errors, the translation
machinery is likely producing sub-populations of proteins from a single RNA transcript, and these proteins have
unknown effects on health. Furthermore, if the protein synthesis machinery can be hijacked for biomedical
applications (e.g., generating therapeutic protein modifications), protein sequencing is needed in order to monitor
the production of such therapeutic proteins.
The goal of this project is to develop technology (or technologies) that (1) can sequence proteins at the
single-molecule level and (2) is scalable to the high-throughput needs of a commercial instrument. Nanopores are
a convenient single-molecule tool that can achieve these goals. Nanopores will be used here to linearize the
flexible polypeptide chains as well as co-localize polypeptide segments in a confined sensing volume (i.e. a
plasmonic hotspot) that will be used for deep-ultraviolet (UV) Raman spectroscopy. Raman spectroscopy, in
general, is useful as a label-free tool to obtain the vibrational spectra of a molecule or sub-section of a molecule.
In recent years, a great deal of work has focused on engineering the excitation volume (i.e. hotspot). The main
challenge associated with this project is slowing down the polypeptide so that its residence time inside the hotspot
is long enough to obtain a Raman signal. Recently, our lab has developed a number of protocols for slowing
down molecular translocations within nanopores; these protocols will be utilized here for protein sequencing applications.
Using nanopores and deep-UV Raman spectroscopy, proteins will be sequenced by comparing the measured
Raman spectra with reference spectra and by developing a spectral parsing algorithm that matches amino acid
fingerprints to the residues residing within the sensing zone of the pore.
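
As a rough illustration of the reference-matching idea described above, the sketch below scores a measured spectrum against a small library of reference spectra using cosine similarity. It is not the project's spectral parsing algorithm: cosine similarity is a simple stand-in chosen for illustration, the function names (gaussian_peaks, match_residue) are hypothetical, and the peak positions, peak width, and noise level are synthetic placeholders rather than real amino acid Raman data.

```python
import numpy as np


def gaussian_peaks(shifts, centers, width=15.0):
    """Build a synthetic spectrum as a sum of unit-height Gaussian peaks."""
    shifts = np.asarray(shifts, dtype=float)
    spectrum = np.zeros_like(shifts)
    for center in centers:
        spectrum += np.exp(-0.5 * ((shifts - center) / width) ** 2)
    return spectrum


def match_residue(measured, references):
    """Score a measured spectrum against each reference by cosine similarity.

    Returns the label of the best-scoring reference and all scores.
    """
    measured = np.asarray(measured, dtype=float)
    scores = {}
    for label, ref in references.items():
        ref = np.asarray(ref, dtype=float)
        denom = np.linalg.norm(measured) * np.linalg.norm(ref)
        scores[label] = float(measured @ ref / denom) if denom > 0 else 0.0
    best = max(scores, key=scores.get)
    return best, scores


# Assumed Raman-shift axis (cm^-1) and placeholder peak positions;
# these are NOT measured amino acid fingerprints.
shifts = np.linspace(600.0, 1800.0, 601)
references = {
    "residue_A": gaussian_peaks(shifts, [800.0, 1200.0]),
    "residue_B": gaussian_peaks(shifts, [1000.0, 1600.0]),
    "residue_C": gaussian_peaks(shifts, [700.0, 1400.0]),
}

# Simulate a noisy readout of residue_B with a fixed seed for repeatability.
rng = np.random.default_rng(0)
measured = references["residue_B"] + 0.05 * rng.standard_normal(shifts.size)

best, scores = match_residue(measured, references)
print(best, scores)
```

In practice, several residues may occupy the sensing zone at once, so a measured spectrum would likely be a mixture of fingerprints; a single best-match score like this one would then need to be replaced by some form of spectral unmixing, which this sketch does not attempt.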