Long-read strategies for elucidating transcriptome complexity and advancing genomic medicine - PROJECT SUMMARY The central objective of this MIRA project is to harness the power of long-read transcriptomics to elucidate transcriptome complexity and advance genomic medicine. Mammalian cells generate remarkable regulatory diversity and complex phenotypes from a finite set of genes. Pre-mRNA alternative splicing (AS) is an essential mechanism for generating this regulatory diversity. Widespread changes in AS occur in both normal and pathological processes, resulting in transcript and protein isoforms that vary in their sequences and functions. However, determining the full-length transcripts that arise from AS on a transcriptome-wide scale has been a long-standing challenge in transcriptomics. Today, transcriptomics is on the cusp of a major technological transformation. While short-read RNA-seq has been the standard approach for transcriptome analysis over the last 15 years, the advent of long-read RNA-seq platforms holds the potential to revolutionize transcriptome research. Long-read RNA-seq enables end-to-end sequencing of full-length transcripts, offering unprecedented insights into transcriptome complexity and its impact on gene products. Furthermore, long-read RNA-seq greatly enhances our ability to identify and interpret splice-altering variants, by providing a complete view of splicing in full-length transcripts and linking mis-spliced transcripts to disease-associated alleles. Despite its potential, long-read RNA-seq remains substantially underutilized compared to short-read RNA-seq, primarily due to its higher base error rate, lower throughput, and the associated experimental and computational challenges. My lab has a long-standing interest in developing and applying genomic technologies to study RNA processing and regulation.
In recent years, we have spearheaded multiple projects to overcome the primary technological hurdles and demonstrate the innovative biomedical applications of long-read RNA-seq. Specifically, we developed experimental and computational approaches to enable robust transcript analysis using long-read RNA-seq. We also developed TEQUILA-seq, a versatile, easy-to-implement, and low-cost method for targeted long-read RNA-seq, achieving ultra-high sequencing coverage for any gene panel of interest. Building on these technological advances, over the next 5 years we plan to: 1) develop technologies to delineate haplotype-resolved, full-length transcriptomes in both bulk tissues and single cells; 2) characterize transcript isoforms of haploinsufficient genes to discover novel therapeutic targets; and 3) establish a program for RNA-guided genetic diagnosis of rare diseases. Overall, my MIRA program combines technology development with innovative applications in genomic medicine. The novel technologies and resources developed from this project will empower researchers to study AS and transcript isoform variation across a wide range of biomedical contexts. The proposed research will also propel us towards our long-term goal of using RNA-based tools to improve the diagnosis and treatment of patients with rare and complex diseases.