Investigating the Function of Highly Similar Intrachromosomal Repeats to Genomic Instability and Perturbed Gene Expression in Genetic Disorder

1. Project Summary

De novo and ultra-rare copy-number variants (CNVs) often underlie the genetic etiology of pediatric and neurodevelopmental diseases. As such, CNVs provide opportunities to study critical dosage-sensitive genes, as well as the timing and origin of structural variation (SV) formation. SV results from distinct mutational mechanisms, including DNA recombination, replication, and repair-associated processes, each leaving specific genomic scars and identifiable signatures that can be detected with appropriate sequencing methodologies. We and others have shown that DNA repair mechanisms, such as break-induced replication (BIR) and microhomology-mediated break-induced replication (MMBIR), contribute substantially to germline SV formation in genomic disorders as well as to somatic events in cancer. The error-prone nature of BIR/MMBIR may lead to SVs characterized as complex genomic rearrangements (CGRs), owing to insertions of templated segments at the junctions and to amplification or deletion of genomic segments concomitant with inversion formation. Our preliminary data indicate that BIR and MMBIR are prone to occur in genomic regions laden with large repeats, here called highly similar intrachromosomal repeats (HSIRs), often leading to nonrecurrent CNVs that perturb nearby dosage-sensitive genes. At least 70 genetic syndromes are known to be caused by nonrecurrent CNVs, but the contribution of HSIRs to the underlying molecular mechanism has not been established.
We hypothesize that i) a relevant fraction of de novo nonrecurrent CNVs are generated by BIR, in which HSIRs provide substrates for ectopic recombination and template switching; ii) inverted and direct HSIRs have distinct roles in the formation of such CNVs; and iii) genetic diseases caused by nonrecurrent CNVs present highly diverse genomic structures that contribute to variability in gene and disease expression. These hypotheses will be tested through the following aims: (1) to identify nonrecurrent CNVs in disease cohorts and to investigate the features of repeats at the breakpoint junctions (Aim 1); (2) to investigate whether the genomic structure of pathogenic CNVs at the Xq28 locus contributes to allele-specific phenotypic differences (Aim 2); and (3) to define the impact of the genomic structure of pathogenic CNVs on an individual's transcriptome, with implications for disease expression (Aim 3). In all, we will combine extensive genomic and transcriptomic analysis with robust phenotypic characterization to investigate the molecular properties of pathogenic HSIR-mediated CNVs. This work will fill an important gap in knowledge concerning the role of genomic repeats in the formation of SVs. Of particular interest is establishing the relative impact of HSIRs on the generation of CGRs. Moreover, we will establish the clinical and biological relevance of HSIR-mediated nonrecurrent CNVs for disease expression. In summary, this application will strongly impact our understanding of human biological processes and disease mechanisms, with broad implications for the diagnosis of birth defects, neurodevelopmental disorders, and cancer.