Early treatment of autism spectrum disorder (ASD) and attention-deficit/hyperactivity disorder (ADHD) can
improve long-term outcomes, but only when at-risk children are identified promptly and accurately. Early risk
factors common to both conditions (e.g., premature birth, perinatal complications) are collected during routine
care and are accessible in the electronic health record (EHR), but have not been used at scale to stratify risk or
improve early screening. The central premise of the proposed research is that the EHR contains
information that can help identify children at risk for ASD and ADHD early in development, but
translating this information into clinically actionable measures requires new predictive modeling methods.
Our recent NIMH-supported research provides initial support for this premise. We have discovered that
children later diagnosed with ASD and/or ADHD have distinctive patterns of early health system utilization, and
that EHR data acquired by age 1 already contains information predictive of later ASD risk. Building on this
work, the proposed research will develop and deploy ASD and ADHD risk prediction models that can
directly inform clinical decision-making. First, we propose to develop methods and prediction models that
will optimize prediction performance, mitigate known sources of bias, and monitor model-predicted risk over
time, balancing the benefits of earlier identification against the value of waiting for more data. Second, we
propose to take critical steps toward clinical use by prospectively evaluating these models, deploying them in
the Duke University Health System (DUHS) EHR, and working with providers to develop and evaluate a
prototype EHR dashboard that presents ASD and ADHD risk and projected follow-up length to inform clinical
decision-making. Career development activities will give the candidate the expertise needed to establish
an independent, transdisciplinary research program in machine learning for pediatric mental health. Career
training will be tightly integrated with the proposed research and emphasize deployment, evaluation, and
implementation of risk prediction models for ASD and ADHD. In the proposed study, DUHS EHR data for all
children born after 10/1/2006 will be accessed through a high-quality, empirically validated data pipeline. Data
elements will include diagnosis and procedure codes, lab measures and vitals, clinical notes, and encounter
details. Diagnoses will be identified using ICD-10-based criteria previously validated at DUHS. The study will
begin with creation of a secure, regularly updated diagnosis risk prediction database to facilitate a smooth
transition to EHR deployment. Activities will focus on methods and model development in years 1-2, followed by
development, deployment, and evaluation of the risk prediction dashboard beginning in year 3. This research
will dovetail with concurrent and subsequent work exploring novel ASD screening methods and model-guided
interventions at DUHS, including patient- and provider-centered early intervention strategies. The proposed
career development will support both the PI’s transition to independence and health data science initiatives at Duke.