PROJECT ABSTRACT
Diagnostic errors affect 12 million patients in the U.S. and contribute to 80,000 deaths per year. The main causes
of diagnostic errors include cognitive biases introduced by healthcare providers, miscommunication between
healthcare teams, lack of access to key data, and not recognizing time-sensitive data in the electronic health
record (EHR). The cognitive burden from information overload in the EHR causes clinicians to take decisional
shortcuts with biased heuristics and miss critical data in the EHR, leading to missed opportunities for timely and
accurate diagnoses. Artificial Intelligence (AI) and clinical Natural Language Processing (cNLP) provide
an opportunity to understand medical text and automate EHR analysis, pointing toward models that can
invoke medical knowledge and clinical experience as humans do. However, the majority of cNLP tasks
are not designed for bedside application to generate diagnoses and augment bedside decision-making. We have
gathered preliminary data and designed cNLP benchmark tasks for clinical diagnostic reasoning. Our tasks
address key cognitive processes to build models in this proposal that can synthesize EHR data to generate
diagnoses that align with evidence-based medicine and medical knowledge representation. This proposal aims
to develop novel cNLP models that understand and integrate multi-modal EHR data and reason over a
large-scale medical knowledge base, with the goal of higher accuracy than current neural network
models. I will first develop a multi-modal generative model that reads in both structured and unstructured EHR
data to output diagnoses using a two-stage training process (Aim 1). In a separate aim, I will construct a
knowledge base using a neural symbolic approach from medical concepts and relations sourced from the
National Library of Medicine's Unified Medical Language System (UMLS). The knowledge base will be part of
the model to generate diagnoses given the information from a daily care note collected in the EHR (Aim 2). The
third aim will design and pilot a clinical diagnostic decision support system using human-centered design
principles. The best models from Aims 1 and 2 will be evaluated for diagnostic accuracy by clinicians in the
system using previously validated instruments for patient safety and diagnostic error (Aim 3). Completion of the
aims will inform future clinical studies on developing NLP-driven clinical decision support tools for reducing
diagnostic error. I will complete this project under the direct supervision of my co-mentors and advisors who have
expertise in developing clinical neural language models, implementation of AI-driven tools in health systems,
and clinical decision support systems with augmented intelligence. Together, this multidisciplinary team brings
nationally renowned expertise in clinical informatics with a track record of successful mentorship. My 4-year
proposal with intensive mentorship, clinical research training, formal coursework in health systems engineering
and informatics, and computing resources at the University of Wisconsin-Madison will ensure my success as I
grow into an independent scientist.