Abstract. Delirium, or acute confusional state, affects 30-40% of hospitalized older adults, with the added cost
of care estimated to be up to $7 billion. Although originally conceptualized as a transient disorder, delirium is now
recognized to have significant consequences, including increased risk of death, functional decline, and long-term
cognitive impairment. As up to 75% of cases are not recognized by providers, there is an urgent need for additional
methods to identify delirium for clinical and research purposes, and to stratify patients based on delirium risk. In
this proposal, we present a novel approach to the identification of delirium based on large-scale data mining (i.e.,
pattern recognition) algorithms using machine learning and natural language processing applied to electronic
health record (EHR) data, which will automate chart-based determination of delirium status and risk prediction.
We will combine these algorithms with data collected through our recently implemented Virtual Acute Care for
Elders (ACE) quality improvement project, which institutes delirium screening once per shift by nursing staff for
all individuals over age 65 admitted to the University of Alabama at Birmingham (UAB) Hospital. This unprecedented
volume of data will allow us to achieve the necessary sample sizes for effective training and validation of
our data mining algorithms. Because they discover patterns of association in data rather than testing
predetermined hypotheses, data mining algorithms are well suited to large-scale identification of
delirium. Using our Virtual ACE and hospital EHR data, we will be able to evaluate more than 10,000 individual
features (e.g., text words and phrases, laboratory and other diagnostic tests, concurrent medical conditions)
associated with delirium, which will be classified as risk factors for delirium, as signs, symptoms, and descriptors
of delirium itself, and as complications and consequences of delirium, based on expert consensus. We will then
use these features to develop rules for identification of delirium in the EHR, as well as risk prediction models that
can be integrated into the EHR to provide individualized assessments of delirium risk. This study will lay the
foundation for methods of automated delirium identification and risk prediction in healthcare settings that are
unable to implement the screening by providers done in our Virtual ACE, as well as for large-scale epidemiological
investigations of delirium using EHR data, expanding the current armamentarium for studying this common and
debilitating disorder.
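
As a purely illustrative sketch of the feature-based approach described above, the following Python fragment counts phrase mentions in a free-text note, groups them into the three feature categories named in the abstract (risk factors; signs, symptoms, and descriptors; complications and consequences), and applies a simple threshold rule. The phrase lists, the function names, and the threshold are hypothetical assumptions chosen for illustration; they are not the features, rules, or models that the proposed study will develop.

```python
import re
from collections import Counter

# Hypothetical phrase lists for illustration only. In the proposal, features
# are assigned to categories by expert consensus, not hard-coded like this.
FEATURE_CATEGORIES = {
    "risk_factor": ["dementia", "hip fracture", "benzodiazepine"],
    "sign_or_symptom": ["confused", "disoriented", "inattentive", "agitated"],
    "consequence": ["fall", "restraints"],
}


def extract_features(note):
    """Count whole-word phrase mentions in a note, grouped by category."""
    text = note.lower()
    counts = Counter()
    for category, phrases in FEATURE_CATEGORIES.items():
        for phrase in phrases:
            pattern = r"\b" + re.escape(phrase) + r"\b"
            hits = len(re.findall(pattern, text))
            if hits:
                counts[category] += hits
    return counts


def flag_possible_delirium(note, min_sign_mentions=2):
    """Toy rule: flag a note with enough sign/symptom mentions.

    The threshold is an arbitrary assumption, not a validated cutoff.
    """
    return extract_features(note)["sign_or_symptom"] >= min_sign_mentions


if __name__ == "__main__":
    example = ("Patient confused and disoriented overnight; "
               "history of dementia. Fall at 0300.")
    print(dict(extract_features(example)))
    print(flag_possible_delirium(example))
```

This sketch ignores negation (for example, "not confused"), misspellings, and structured data such as laboratory results, all of which a practical natural language processing and machine learning pipeline would need to handle.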