Revealing Cellular Determinants of LINE-1 Retrotransposition Outcomes - Project Summary
Transposable elements (TEs) are endogenous mutagens that contribute to germline and somatic genetic mosaicism and to disease. Long interspersed element 1 (LINE-1, L1), a TE family that occupies nearly one-fifth of the human genome, is an active, protein-coding retrotransposon that can ‘copy-and-paste’ itself to generate de novo genomic insertions through an RNA intermediate, a process called retrotransposition. Although most of the ~500,000 genomic copies of L1 are immobile, ~100 loci in human cells remain retrotransposition competent. L1 activity is normally suppressed by epigenetic and DNA repair pathways. Conversely, loss of these suppressive mechanisms can lead to L1-mediated insertional mutagenesis that disrupts the functions of disease-associated genes. Recent studies suggest that L1 retrotransposition may also be a source of DNA damage, promoting chromosome breaks and translocations. Key studies have shown that L1 retrotransposition can produce complex insertion outcomes (e.g., full-length L1 insertions (~6 kb, retrotransposition competent), truncated L1 insertions (variable length, retrotransposition incompetent), and chromosomal translocations), yet the frequency of these outcomes and their underlying mechanisms remain poorly understood. To date, only a few hundred de novo L1 insertions have been published, limiting our understanding of these mechanistically complex events. This scarcity reflects three technical challenges: 1) Traditional retrotransposition reporters can only detect insertions >2 kb, missing the more common shorter insertions (<0.5 kb); 2) Sequencing L1 insertions is difficult because of their variable length (up to 6 kb) and high sequence similarity; 3) Short-read sequencing often cannot resolve the entire length of an insertion or support the assembly of an L1 inserted at a chromosomal breakpoint. To address these challenges, I am developing innovative long-read sequencing approaches to capture entire L1 insertions.
These approaches provide an unprecedented opportunity to study the factors that determine insertion outcomes and enable me to address the following specific aims. Aim 1 (K99) - Delineate the impact of ORF2p in determining L1 insertion outcomes. Aim 2 (K99/R00) - Investigate the roles of DNA replication and DNA repair in shaping L1 insertion outcomes. Aim 3 (R00) - Map L1 insertions in samples from patients with genetic disease. These studies will be the first to leverage state-of-the-art long-read sequencing technologies to identify critical factors that affect L1 insertion outcomes. Research in my R00 phase will provide insight into the contribution of L1-mediated genetic mosaicism to genetic diseases.