Comprehensive Pediatric Phenotyping for Evidence-Based Diagnosis in Genetic Disease - To facilitate diagnosis among 7000 rare genetic diseases, clinicians have developed diagnostic criteria that enumerate the elements that define each disease. These include medical problems, physical exam findings, laboratory test results, and imaging findings. However, most clinical diagnostic criteria have unknown predictive value. Despite being critical for diagnosis and provision of genetic testing, they are typically proposed without rigorous evidence or estimates of performance such as sensitivity or specificity. Suboptimal criteria may cause faulty interpretation of genetic testing results that include variants of uncertain clinical significance, or lead clinicians to overlook a diagnosis, depriving patients of prognostication, reproductive planning, or targeted molecular therapies. Our previous work has delineated an approach to developing more evidence-based rare disease criteria. We developed novel clinical criteria for nevoid basal cell carcinoma syndrome using survey data and statistical optimization, and we estimate that the novel criteria have improved sensitivity compared to the existing expert consensus criteria, particularly at early ages (53% versus 13% at 7 years). My central hypothesis is that diagnosis of rare pediatric genetic disease can be improved by utilizing evidence-based diagnostic approaches. Moreover, such approaches may be one avenue to address inequities in the provision of genetic referral and testing among individuals belonging to historically marginalized groups. Therefore, I will scale our previous work across the spectrum of rare genetic diseases using comprehensive, clinician-validated phenotype information to establish and test diagnostic methodologies.
To test this hypothesis and progress towards my long-term career goal of becoming an independent physician-scientist who advances accurate and timely diagnosis for all children with a rare genetic disease, I have developed a comprehensive five-year career development plan. This plan delineates a strategy to gain knowledge and experience in natural language processing and machine learning, human-centered design and human factors, and electronic health record interventions. Using these new skills, I will create comprehensive, chronological phenotype histories for over 37,000 children with suspected or confirmed genetic disease. I will embed a tool in the clinical workflow that elicits clinician validation of these phenotypes. From these data, I will implement a framework to develop and validate diagnostic criteria in genetic disease, initially focusing on 10 specific diseases. I will also develop computationally tractable machine learning algorithms to aid in diagnosis at scale. Next, I will develop a web-based user interface to empower other clinicians to develop and test their own diagnostic criteria. Finally, I will apply the same phenotyping and machine learning approaches at the health system level to predict which children are more likely to be diagnosed with a rare genetic disease. These endeavors will lay the foundation for my long-term research program, which will implement clinical decision support for genetic diagnosis, and will prepare me to become an independent R01-funded investigator.