Machine Learning Methods to Develop and Deploy Real-Time Risk Surveillance for Autism Spectrum Disorder and Attention Deficit Hyperactivity Disorder from the Electronic Health Record

Early treatment of autism spectrum disorder (ASD) and attention-deficit hyperactivity disorder (ADHD) can improve long-term outcomes, but only when at-risk children are identified promptly and accurately. Early risk factors common to both conditions (e.g., premature birth, perinatal complications) are collected during routine care and are accessible in the electronic health record (EHR), but they have not been used at scale to stratify risk or improve early screening. The central premise of the proposed research is that the EHR contains information that can help identify children at risk for ASD and ADHD early in development, but that translating this information into clinically actionable measures requires new predictive modeling methods. Our recent NIMH-supported research provides initial support for this premise. We have discovered that children later diagnosed with ASD and/or ADHD have distinctive patterns of early health system utilization, and that EHR data acquired by age 1 already contains information predictive of later ASD risk. Building on this work, the proposed research will develop and deploy ASD and ADHD risk prediction models that can directly inform clinical decision-making. First, we propose to develop methods and prediction models that will optimize prediction performance, ensure that predictions are not affected by disparities in health system utilization, and surveil model-predicted risk over time to identify children as early as possible without loss of accuracy.
Second, we propose to take critical steps toward clinical use by prospectively evaluating these models, deploying them in the Duke University Health System (DUHS) EHR, and working with providers to develop and evaluate a prototype EHR dashboard that presents ASD and ADHD risk and projected follow-up length to inform clinical decision-making. Career development goals will allow the candidate to develop the expertise needed to establish an independent, transdisciplinary research program in machine learning for pediatric mental health. Career training will be tightly integrated with the proposed research and will emphasize deployment, evaluation, and implementation of risk prediction models for ASD and ADHD. In the proposed study, DUHS EHR data will be accessed through a high-quality, empirically validated data pipeline for all children born after 10/1/2006. Data elements will include diagnosis and procedure codes, lab measures and vitals, clinical notes, and encounter details. Diagnoses will be identified using ICD-10-based criteria previously validated at DUHS. The study will begin with the creation of a secure, regularly updated diagnosis risk prediction database to facilitate a smooth transition to EHR deployment. Activities will focus on methods and model development in years 1-2, followed by development, deployment, and evaluation of the risk prediction dashboard beginning in year 3. This research will dovetail with concurrent and subsequent work exploring novel ASD screening methods and model-guided interventions at DUHS, including patient- and provider-centered early intervention strategies. The proposed career development will support the PI’s transition to independence and advance health data science initiatives at Duke.