Establishing and benchmarking advanced methods to comprehensively characterize somatic genome variation in single human cells

Abstract

Understanding somatic genomic variation presents unique challenges, primarily stemming from the individual rarity of most somatic mutations across cells in a multicellular organism. Hence, both sensitivity and accuracy become crucially important, the latter because somatic variation must be distinguished from noise. The analysis of bulk DNA, even with ultraprecise approaches, ascertains only a portion of the human genome. The analysis of single cells, either by cloning or by in vitro whole-genome amplification (WGA), in theory enables the discovery of all mutations in a cell, independent of their frequency in bulk. However, accurately amplifying single-cell genomes in vitro remains a significant challenge. The novel primary template-directed amplification (PTA) technique offers substantially improved quality of amplified DNA. However, PTA produces a relatively small amount of DNA, as fragments of moderate length, which limits the application of long-read sequencing, the approach expected to be the most comprehensive for somatic mutation detection. In the proposed project, we will, first, perform long-read sequencing in single cells cloned via the production of iPSC lines to study somatic mutations of all types using non-enzymatically amplified genomic DNA, from telomere to telomere, and generate a gold-standard benchmarking resource for methods development. Second, we will address a significant shortcoming of the analysis of single-cell genomes, namely the lack of direct information about the exact type of cell being analyzed or about the potential functional consequences of mutations in that cell. To that end, we will benchmark the new ResolveOme method, an extension of PTA that can analyze the genome and transcriptome of a single cell in parallel.
Third, we will address the challenge of high-throughput analysis of single cells to detect somatic structural variants. Specifically, we will establish and benchmark, for SMaHT, the Strand-seq method, which allows high-throughput detection and characterization of structural variants (SVs) in single cells. Together, this work will address three critical needs in the analysis of somatic mutations in normal tissues: comprehensive mosaic mutation discovery; phenotyping of the cell harboring mutations and direct assessment of the functional consequences of those mutations; and accurate, high-throughput detection of SVs. Another important aspect of the project will be comprehensive comparative analyses of detected somatic variants across all Aims.