Among vaccine platforms, messenger RNA (mRNA)-based vaccines have emerged as a versatile candidate for
responding quickly to viral pandemics, including coronavirus disease 2019 (COVID-19). But mRNA
vaccines face key potential limitations. Researchers have observed that RNA molecules tend to degrade
spontaneously, a serious problem because a single cut in the mRNA backbone can nullify the vaccine.
Currently, little is known about where in the backbone of a given RNA degradation is most likely to occur,
and the design of highly stable mRNA molecules is an urgent challenge. Without this knowledge, mRNA
vaccines against COVID-19 will require stringent conditions for preparation, storage, and transport. A promising
solution is deep learning, a general class of data-driven modeling approaches that has proved dominant
in many fields including computer vision, natural language processing, protein folding, and nucleic acid feature
prediction tasks. In this proposal, Dr. Qing Sun aims to combine deep learning and experiments to predict mRNA
vaccines that are stable at room temperature. By adapting two deep learning techniques, self-attention
and convolution, she will create interpretable end-to-end models that predict the secondary structures of
COVID-19 mRNA vaccines directly from sequence information. She will then use a synthetic approach that rapidly
generates mRNA vaccines to validate and further improve her deep learning model. Specifically, the research
objectives of this proposal are: 1) to develop a deep learning model that uses self-attention and convolution,
which capture long-range dependencies, to predict RNA secondary structures, and to train the model on existing
RNA secondary structure datasets with high accuracy and efficiency; 2) to employ transfer learning for mRNA
vaccine stability predictions; and 3) to validate and further improve model performance using an experimental
demand-based mRNA production system. She will produce hundreds of mRNA vaccine sequences and test
their stability in the lab; the results will serve as a dataset to validate and retrain her model. This project will serve as a
framework for processing other mRNA vaccines in rapid response to pandemics. The knowledge of secondary
structure prediction gained from this proposal will also help characterize natural and synthetic mRNA for natural
science and engineering purposes.
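
As an illustration only, the combination of convolution and self-attention named in objective 1 could be sketched as below. This is a minimal NumPy sketch with random, untrained weights, not the proposal's actual model: the function names (`one_hot`, `conv1d_same`, `self_attention`, `predict_pair_probabilities`, `init_params`), the layer sizes, the single attention head, and the bilinear pairing score are all assumptions made for the example. The convolution summarizes local sequence context, the attention layer lets every position draw on every other position (the long-range dependencies), and the output is a symmetric matrix of base-pairing scores.

```python
import numpy as np

NUCLEOTIDES = "ACGU"


def one_hot(seq):
    """Encode an RNA string as an (L, 4) one-hot matrix."""
    idx = np.array([NUCLEOTIDES.index(ch) for ch in seq])
    return np.eye(len(NUCLEOTIDES))[idx]


def conv1d_same(x, kernel):
    """Zero-padded 1D convolution over sequence positions.

    x: (L, C_in); kernel: (K, C_in, C_out) with odd K. Returns (L, C_out).
    """
    k = kernel.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    # Stack the K shifted views of the padded input: shape (K, L, C_in).
    windows = np.stack([xp[i:i + x.shape[0]] for i in range(k)], axis=0)
    return np.einsum("klc,kcd->ld", windows, kernel)


def self_attention(h, wq, wk, wv):
    """Single-head scaled dot-product self-attention over positions."""
    q, k, v = h @ wq, h @ wk, h @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights


def predict_pair_probabilities(seq, params):
    """Map a sequence to an (L, L) matrix of base-pairing scores in (0, 1)."""
    x = one_hot(seq)
    h = np.maximum(conv1d_same(x, params["conv"]), 0.0)  # local context + ReLU
    attended, weights = self_attention(h, params["wq"], params["wk"], params["wv"])
    h = h + attended  # residual connection
    logits = h @ params["pair"] @ h.T  # hypothetical bilinear pairing score
    logits = 0.5 * (logits + logits.T)  # pairing is symmetric: (i, j) == (j, i)
    return 1.0 / (1.0 + np.exp(-logits)), weights


def init_params(rng, hidden=16, kernel_size=5, scale=0.1):
    """Random (untrained) weights; sizes are illustrative assumptions."""
    return {
        "conv": rng.normal(0.0, scale, (kernel_size, len(NUCLEOTIDES), hidden)),
        "wq": rng.normal(0.0, scale, (hidden, hidden)),
        "wk": rng.normal(0.0, scale, (hidden, hidden)),
        "wv": rng.normal(0.0, scale, (hidden, hidden)),
        "pair": rng.normal(0.0, scale, (hidden, hidden)),
    }


rng = np.random.default_rng(0)
params = init_params(rng)
probs, attn = predict_pair_probabilities("GGGAAAUCCC", params)
print(probs.shape, attn.shape)
```

Because the weights here are random, the scores carry no biological meaning; in practice such a model would be trained on known secondary structures, and the returned attention weights are one place to look when asking which positions influenced a prediction.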