Computational and experimental approaches for decoding the function and regulation of unconventional RNA translation - ABSTRACT

The advent of ribosome profiling (Ribo-seq) has enabled high-resolution, genome-wide measurement of translation, collectively termed the translatome. Studies based on Ribo-seq have revealed a complex translational landscape in eukaryotic cells, uncovering translation events beyond those conventionally annotated that may yield thousands of new proteins and significantly expand the known proteome. These unconventional translation events involve alternative translation initiation sites (aTISs) within annotated protein-coding regions and also occur in traditionally non-coding regions, such as 5' UTRs, 3' UTRs and long noncoding RNAs (lncRNAs). Certain lncRNA-encoded ORFs and novel translational isoforms of known protein-coding genes resulting from alternative translation initiation (aTI) have been shown to play significant developmental or physiological roles in evolutionarily diverse species. Despite an increasing appreciation of the importance of the “dark” proteome produced by hidden ORF translation in development, physiology and disease, the functions of the vast majority of hidden ORFs remain unknown. In addition, translation of hidden ORFs is often initiated at non-AUG start codons, and the cis-regulatory elements and trans-acting factors that regulate hidden ORF translation are yet to be elucidated. To address these gaps in knowledge, my overarching goal is to develop and apply advanced computational and high-throughput experimental approaches, in combination with in-depth mechanistic investigations, to systematically identify cryptic translation of hidden ORFs and to unravel their functions and mechanisms, as well as the cis-regulatory elements and trans-acting factors controlling their translation in development, physiology and disease.
To achieve this long-term goal, I propose the following three research programs over the next five years: (i) Developing and employing machine learning-based approaches for integrative modeling of translation initiation; (ii) Deciphering the trans-acting function of upstream ORFs (uORFs) in regulating estrogen-dependent cell proliferation; (iii) Decoding protein-protein interaction (PPI) networks between annotated proteins and the hidden proteins generated by cryptic translation. Collectively, these research programs aim not only to contribute novel computational tools for modeling translation initiation and for identifying new PPIs, but also to offer biological insights into the function and regulation of hidden ORF translation. My group has developed a computational toolkit, named Ribo-TISH, that enables de novo prediction of hidden ORFs as well as identification and quantitative comparison of translation initiation sites (TISs) from different types of Ribo-seq data (Nat Commun 2017). More recently, we have leveraged this tool and integrative approaches to decode the function and mechanism of lncRNA-encoded hidden ORFs (Nat Struct Mol Biol 2023; J Clin Invest 2023). Given our expertise and track record in computational biology, functional genomics, and RNA biology, as well as a diverse network of collaborators with complementary expertise, we are ideally positioned to tackle the proposed research.