Long-read genome sequencing for the discovery of highly penetrant variation in rare diseases - Project Summary/Abstract The goal of this proposal is to improve upon current methods for identifying genetic contributors to rare diseases, especially neurodevelopmental disorders (NDDs) in children. Finding such variants is of fundamental biological value and has potential clinical relevance to affected individuals and their families. To improve upon current approaches, the “HiFi” long-read DNA sequencing platform from Pacific Biosciences will be used. Preliminary data suggest that HiFi can reveal disease-relevant genetic variants that are missed by standard genome sequencing approaches. Genomes of 500 affected individuals will be sequenced using HiFi, and for 200 of them, both parents will also be sequenced (900 genomes in total). This process will be optimized to balance data quality, accuracy, and cost. Comprehensive maps of genetic variation and de novo genome assemblies will be generated for each individual using a variety of methods, which will be systematically benchmarked and improved. These results will be analyzed, using commonly accepted standards for clinical interpretation of genomic data, to identify variants in each proband that may be causally related to that proband’s symptoms. Variants found to be relevant to a particular proband’s symptoms will be validated with orthogonal testing and returned to families by a genetic counselor, if such return is deemed appropriate by the referring clinician. This study will address a major challenge in human genetics and could lead to a fundamental change in the way human genome sequencing is performed in both research and clinical settings.