A Scalable, Open-Source Generative LLM Tool for Automated Classification of Diagnostic Errors

Medical errors are the third leading cause of death in the United States, yet their total burden and epidemiology remain largely unknown, with few comprehensive assessments available. To address this gap, we propose leveraging the Retract-and-Reorder (RAR) method, an existing health information technology (IT) tool that detects near-miss, self-caught order errors, to better understand the underlying causes of medical errors. The RAR method has been reliably used to detect wrong-patient errors and certain types of medication prescribing order errors. We expanded its application to diagnostic imaging, using logic-based natural language processing (NLP) to identify additional error types such as wrong-site, wrong-contrast, wrong-side, and wrong-modality. However, over 42% of detected errors remained unclassified, requiring labor-intensive manual review for further categorization. In this proposal, we aim to develop a scalable pipeline that uses generative large language models (LLMs) to classify order errors automatically and to address unknown error types. To accomplish this, we will first (AIM 1) develop and validate a generative LLM-based classification model for categorizing RAR events into predefined error types, focusing on imaging order errors. We will compare its performance against the current logic-based NLP approach, hypothesizing that the LLM will achieve equal or better accuracy by correctly classifying known errors and identifying previously missed error types, thereby improving overall classification. Then, we will (AIM 2) demonstrate the scalability of the LLM pipeline by applying it to medication order errors and developing a dissemination plan. We hypothesize that LLMs can be readily adapted to large, diverse sets of order types across various domains without requiring fine-tuning.
This study will establish the feasibility of developing an advanced, automated, and scalable open-source tool for classifying and characterizing RAR events across different medical orders. By identifying and understanding various order error types across domains, this research will support the development of measures and targeted interventions to improve patient safety. Furthermore, our privacy-preserving approach, achieved by deploying an open-source LLM along with comprehensive documentation and structured dissemination, will enable adoption across institutions and diverse healthcare settings. Beyond imaging and medication orders, this framework could support cross-institutional implementation, facilitating its expansion into other order domains.
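
To make the logic-based baseline concrete, the following is a minimal illustrative sketch of how a rule-based classifier might compare a retracted imaging order with its reorder. It is not the authors' implementation: the keyword vocabularies, the free-text order format, the function names (`extract_features`, `classify_rar_pair`), and the order in which features are checked are all assumptions made for illustration. Any pair whose difference falls outside the hand-written rules is returned as `unclassified`, which is one plausible way such a system could leave a large share of events for manual review.

```python
import re

# Hypothetical keyword vocabularies (assumptions for illustration only);
# a production system would need far richer clinical lexicons.
SIDES = {"left", "right", "bilateral"}
MODALITIES = {"ct", "mri", "xray", "ultrasound"}
SITES = {"knee", "ankle", "chest", "abdomen", "head", "shoulder", "wrist"}

# Longest alternative first so "with and without contrast" is captured whole.
CONTRAST_PATTERN = re.compile(r"\b(with and without|without|with)\s+contrast\b")

# Assumed priority order: the first feature that differs decides the label.
RULES = [
    ("side", "wrong-side"),
    ("contrast", "wrong-contrast"),
    ("modality", "wrong-modality"),
    ("site", "wrong-site"),
]


def extract_features(order_text: str) -> dict:
    """Pull laterality, contrast, modality, and body site out of free text."""
    text = order_text.lower()
    tokens = set(re.findall(r"[a-z]+", text))
    contrast = CONTRAST_PATTERN.search(text)
    return {
        "side": frozenset(tokens & SIDES),
        "contrast": contrast.group(1) if contrast else None,
        "modality": frozenset(tokens & MODALITIES),
        "site": frozenset(tokens & SITES),
    }


def classify_rar_pair(retracted: str, reordered: str) -> str:
    """Label a retracted/reordered pair by the first feature that changed."""
    before = extract_features(retracted)
    after = extract_features(reordered)
    for feature, label in RULES:
        if before[feature] != after[feature]:
            return label
    # No rule fired: the difference is outside the known vocabularies.
    return "unclassified"
```

Under these assumptions, `classify_rar_pair("MRI left knee without contrast", "MRI right knee without contrast")` yields `wrong-side`, while a pair that differs only in wording the rules do not cover falls through to `unclassified`. A generative LLM classifier would replace the fixed vocabularies and rule table with a prompt listing the predefined error types.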