Project Summary / Abstract
Autism spectrum disorder (ASD) is a developmental disorder that affects 1 in 54 children in the US (1). The
economic cost of ASD is estimated to be $66 billion per year in the US, from medical care and lost parental
productivity (2). Early diagnosis is crucial since it allows for early treatment and the best long-term outcome.
However, identifying children at high risk for ASD at an early age is challenging due to a lack of specialists. To
address this problem, the project's objective is to create health information technology (HIT) using information
in electronic health records (EHR) to support non-expert clinicians in identifying children at high risk for ASD.
The HIT will integrate two components that provide complementary information. The first component will
leverage machine learning algorithms to label EHR of children at high risk for ASD. Both traditional machine learning
and deep learning, potentially in combination, will be evaluated while systematically tracking how the quality and
quantity of information in EHR affect performance. The second component will focus on the EHR free text
and identify phenotypic behavioral expressions of diagnostic criteria as defined in the Diagnostic and Statistical
Manual of Mental Disorders (DSM). Rule-based natural language processing will be combined with machine
learning algorithms. For both components, potential algorithm bias will be investigated and corrected or
documented when this is not possible. The HIT will combine results from both components through an intuitive
user interface. Since it is intended to be used as a human-in-the-loop decision tool, it will also provide
descriptive data on performance for both components. The final HIT will be developed using rapid prototyping
in interaction with domain experts. It will be evaluated in a user study with representative non-expert clinicians.
The evaluation will compare the accuracy, confidence, and efficiency with which clinicians who are not ASD experts
identify children at risk for ASD with and without the HIT. It will also systematically focus on the type, amount, quality and
transparency of information provided, and how this interacts with user beliefs about their own expertise as well
as their bias toward machine decisions. Different types of EHR as well as different levels of clinical expertise
will be compared for effects of HIT use.
Preliminary work has been conducted for all components with good results. However, this prior work focused
on DSM-IV and used only free text from data-rich EHR. The proposed project will expand the
prior work to use DSM-5 criteria, train and develop the algorithms to use structured and unstructured fields in
representative clinical EHR, and work with EHR from different hospitals to evaluate potential obstacles and
advantages of variability in data.
Using information in EHR, this HIT will provide support especially for non-expert clinicians in their evaluation of
children who may be at risk of ASD. The HIT will support early referrals leading to early diagnosis and therapy.
It will be useful in a variety of different settings where domain expertise is missing.