Data Analysis Center for Somatic Mosaicism Across Human Tissues Network - PROJECT SUMMARY

The goal of the Somatic Mosaicism across Human Tissues (SMaHT) Network is to better characterize the occurrence of mosaic variants in human tissues and to understand their role in the regulation of biological processes that impact health and disease. The SMaHT Data Analysis Center (DAC) aims to collect and curate all data generated in the Network; assess, develop, and apply state-of-the-art analytical pipelines; and produce a variant catalog and a data portal for the scientific community. We have assembled a team of investigators, bioinformatics scientists, data curators, and software developers with a strong track record in mosaic variant analysis, long-read data analysis, data portal development, visualization, large-scale data management and computing, and development of secure and flexible cloud technologies. In Aim 1, we will work with the Network members to define data and metadata standards and ensure that high-quality data are generated, processed, and annotated uniformly and efficiently. In Aim 2, we will benchmark current tools and technologies for identifying mosaic variants and lead an effort to define and implement analytical pipelines on a cloud platform. We will also develop new approaches as needed. We will ensure that a comprehensive set of mosaic variants of all types (single nucleotide variants, indels, copy number variants, translocations, complex rearrangements, transposable element insertions, microsatellite mutations, repeat expansions, etc.) is identified, using short- and long-read platforms, genome-wide and targeted assays, and bulk and single-cell technologies. In Aim 3, we will build a user-friendly and interactive data portal containing a variant catalog and featuring a read-level variant browser to enable the scientific community to fully utilize the Network data.
We will ensure that all methods and processes are documented for full reproducibility and that all tools and data are freely available to the community.