varCUT&Tag: A Method for Simultaneous Identification and Characterization of Sequence Variants in Regulatory Elements and Genes

PROJECT SUMMARY

Most methods for identifying sequence variants focus either on broad genomic coverage using whole genome sequencing (WGS), typically at depths of ~50-fold, or on higher-depth sequencing of coding regions (the exome) after enrichment using oligonucleotide hybridization strategies. However, typical WGS strategies miss most rare (< 5%) sequence variants, while WGS at the depths required to identify rare variants can be cost-prohibitive. Exome sequencing approaches can provide high coverage of coding regions at reasonable cost but miss the vast majority of mutations, which occur outside of coding regions. Besides coding regions, genetic variation in cis-regulatory regions (CREs) can have a major impact on cell and tissue function by altering expression of critical genes. No general enrichment strategies for high-depth CRE sequencing have been described because different regions of the genome act as CREs in different cell types, so a different set of sequences would need to be enriched in each cell lineage. Drawing on the fact that CREs carry the same set of epigenetic marks in different cell types, we will leverage epigenetic profiling technologies we recently created to develop a new method, called varCUT&Tag, for high-coverage sequencing and identification of rare variants within CREs and gene bodies (UG3 phase). A critical feature of our approach will be the simultaneous identification of the epigenetic marks surrounding each variant, enabling functional characterization of the effects of somatic mutations on CRE function. We will also develop a single-cell version of this approach (scVarCUT&Tag) that includes a single-cell RNA sequencing component, allowing the effects of sequence variants on gene expression to be identified.
As a result, we will be able not only to identify rare CRE variants in diverse cell and tissue types but also to prioritize variants that alter CRE function for future studies. Using this approach, we will work within the SMaHT Network to identify and characterize CRE variants from numerous human tissues, uncover the effects of variants on tissue heterogeneity and cell lineage (using scVarCUT&Tag), and engage in collaborative studies within the Network to increase variant discovery (all UH3 phase). In sum, varCUT&Tag will fill a major gap in our understanding of somatic sequence variants and provide mechanistic insight into their effects in diverse human tissues.