Deciphering functions of the LINE-1 ORF2p protein - PROJECT SUMMARY Approximately half of many mammalian genomes consists of repetitive sequences, which include Long INterspersed Element-1 (L1). L1 uses its ORF2p protein to increase its copy number through a process known as retrotransposition. Despite the abundance of L1 copies in host genomes, many steps in the retrotransposition cycle, including the exact functions of ORF2p and how ORF2p accesses chromatin for integration, are unknown. The annotated half of ORF2p includes the functional endonuclease (EN) and reverse transcriptase (RT) domains, which are required for retrotransposition. The unannotated half of ORF2p, which includes the Cryptic region and the C-terminus, is evolutionarily conserved among L1s from diverse species but lacks homology to other cellular proteins. The C-terminus is required for retrotransposition, but it has no known function. Progress in understanding the mechanisms underlying ORF2p functions in retrotransposition is impeded by the lack of an ORF2p protein structure and by the unknown functions of the ORF2p Cryptic and C-terminus regions. Our lab also discovered that L1 sequences contain multiple splice sites, which can impact the host through the generation of splice variants with cellular genes. However, how often the thousands of L1 inserts form chimeric transcripts with host genes, and how their incorporation influences the cellular transcriptome, remain unknown. Thus, there is a need for a comprehensive analysis of the L1 contribution to splice variants in different species. Our lab has developed novel genetic approaches to study the function of the unannotated ORF2p regions in retrotransposition. We previously identified the Cryptic region of ORF2p and proposed that it serves as a structural anchor for EN and RT. Studies, including our own, have shown that both the Cryptic region and the C-terminus are essential for retrotransposition. 
Our preliminary data show that the C-terminus (1) interacts with the proposed structural anchor, the Cryptic region, and (2) interacts with histones. Our preliminary data also show that the L1 3′-end sequence is incorporated into cellular splice variants through alternative splicing events. Based on these data, we hypothesize that the C-terminus of ORF2p is essential for retrotransposition through its intra- and intermolecular interactions, and that the incorporation of the L1 3′-end sequence into splice variants impacts the cellular transcriptome. We propose two aims to evaluate these central hypotheses. Aim 1 investigates the sequence requirements for C-terminus interactions with the Cryptic region and core histones, and the retrotransposition steps that require these interactions. Aim 2 investigates L1 3′-end sequence incorporation into chimeric splice variants and its impact on gene expression. Our results will determine (1) the functional role(s) of the unannotated regions within ORF2p and (2) the impact of the L1 3′-end sequence on the cellular transcriptome. Our results will also generate a database of chimeric transcripts containing L1 3′-end sequences in different species, which will be a valuable resource for identifying chimeric proteins that contain the L1 ORF2p C-terminus. Overall, our findings will contribute to the knowledge of mammalian retrotransposition and describe a novel impact of retrotransposons on the cellular transcriptome.