Determinants of susceptibility to pediatric acute myeloid leukemia - PROJECT SUMMARY/ABSTRACT
The role of germline genetic variation and viral infection in disease development and progression has been studied extensively in adult tumors and autoimmune disease. Less attention has been paid to the interaction of these factors with birth defects and pediatric malignancies, particularly acute myeloid leukemia (AML), which in the youngest patients is driven almost exclusively by structural variants (SVs) of poorly understood etiology. At live birth, the prevalence of leukemia-associated gene fusion transcripts is 10- to 100-fold greater than the incidence of childhood leukemia, which suggests that other risk factors must interact with SVs. One candidate is germline mutation: preliminary analysis has shown that patients harboring SVs are enriched for germline mutations in genes responsible for DNA double-strand break repair. Another candidate is the timing of viral infection in genetically predisposed individuals. Clinical trials of gene therapy with viral vectors failed in part because viral integrations activated oncogenes such as MECOM. Recent work has shown a direct mechanism for derivative chromosome formation at the most common breakpoints in leukemia, and human herpesviruses, including cytomegalovirus (CMV), are among the greatest risk factors for chromosomal birth defects. Additionally, germline and somatic copy number and short sequence variants have been documented that affect E26 transformation-specific (ETS) factors, which participate in high-risk gene fusions seen in both solid and liquid tumors. These factors and their binding sites determine developmental fates across tissues, yet their motifs are short tandem repeats, the single most variable class of features in the human genome.
Small changes in dosage, such as those created by disruptions in binding site motifs or variation in ramp sequences, may be sufficient to predispose individuals to disease. The primary obstacle to studying these mechanisms has long been the small sample sizes and biased coverage of cohorts assembled for rare and childhood diseases. The vast quantity of whole-genome, whole-transcriptome, and long-read sequencing data provided by the Gabriella Miller Kids First! (GMKF) Consortium, Therapeutically Applicable Research to Generate Effective Treatments (TARGET), the X01 Long Read Pilot Project for omics-cold pediatric leukemia patients, and others removes this obstacle. I posit that both the computational infrastructure and the sample sizes required to address this urgent need are now in place, allowing us to determine whether predisposition risk can be mitigated by screening or prophylaxis. The overall goal of my F99 training phase is to characterize germline variants and to functionally validate them in a zebrafish model of high-risk pediatric AML. During my K00 phase, I propose to characterize germline regulatory, splicing, and structural variants, as well as viral genomic integrations, as catalysts of leukemia risk. The training and data resulting from this fellowship award will establish the foundation of scientific and professional skills for my career as an independent researcher.