Integrative transcriptomics to uncover functional elements and disease-associated variants in RNA - PROJECT SUMMARY: There is a need for integrative data analyses that anchor transcriptomic research in contexts predictive of human health, as illustrated by growing awareness of disease-associated synonymous transcript variants and RNA biotechnologies such as mRNA vaccines. To help uncover sequence features that are important for RNA regulation, we present context-dependent models of translational efficiency, a key metric of transcript function. We show that position-dependent codon usage bias (PDCUB) identifies start codons among AUGs more consistently than the Kozak sequence, while high-PDCUB transcripts are enriched for medically important genes tied to human development and neural function. Attention-based transformer networks, paired with interpretation techniques, will independently predict translational efficiency in human transcripts, with comparison to ribosome profiling and RNA abundance data in multiple human cell lines, to characterize how PDCUB and other sequence features guide translational efficiency across health-critical contexts. Transfection assays will validate the roles of predicted sequence features. Beyond sequence, higher-order structures also drive RNA function and stability, including translational regulation and interactions with microRNAs and RNA-binding proteins (RBPs). A new RNA structural alignment method and associated clustering will uncover structural domains and group them by mutual similarity to find common structural motifs that impact RNA structure-function relationships, improving our understanding of the role of transcript structure in pathogenesis. Evaluation will consist of clustering RNA families in our previously built RNA structure meta-database, bpRNA-1m, with identified structural domains analyzed in the context of ribosome profiling data to characterize the role of these domains in regulating translation.
Meanwhile, clustering structures according to RNA-protein crosslinking data will let us identify motifs involved in the binding of RBPs. Finally, a comprehensive transcriptome browser and meta-database will integrate transcriptomic data for known and new transcript-level features, including those described above. Easy to access and use, this resource will enable scientific and medical researchers to find and define RNA sequence features and structural motifs. By cohesively cataloging the complex facets of transcript-level interactions, along with sequence and structural features relevant for transcript regulation, our transcriptome browser will help researchers visualize ribosomal occupancy, examine RNA structures and microRNA and RBP binding, catalog splice variants, and understand the sequence features that drive transcript interactions. Allelic variants mapped to RNA transcript positions will be combined with our annotations, along with feature-based machine learning predictions incorporated into the browser, to assist researchers in generating first-pass predictions for transcript variants and interpreting their outcomes in the context of human health.