PROJECT SUMMARY
Most health systems globally were designed to be reactive. The Centers for Disease Control and Prevention
(CDC) of the United States reported that 90% of the nation's $3.3 trillion annual healthcare expenditures are for
people with chronic and mental health conditions. Therefore, preventing diseases is key to improving people's
health and keeping rising health costs under control. The criteria in the preventive care clinical decision support
(CDS) modules of most electronic health record (EHR) systems are limited to age, gender, and screening intervals. This "one size
fits all" preventive care CDS does not provide personalized recommendations that account for the risk factors
related to a patient's family history, social behavior history, ethnicity, and chronic disease history.
Social history factors, including behavioral and environmental determinants, are increasingly recognized as critical risk
factors for many causes of disease, disability, and mortality in the United States. Very little research has been
conducted on applying natural language processing (NLP) and artificial intelligence techniques to
extract information from the preventive care guidelines and EHR data to generate personalized preventive care
recommendations that consider these risk factors. Because most risk factors, such as social behaviors, are
rarely systematically extracted from the clinical notes, linking this information to preventive care is still very
uncommon. The main objective of this research proposal is to develop a system that generates personalized
preventive care recommendations by combining information extracted from preventive care guidelines with
information, including risk factors, extracted from EHR data. Each personalized recommendation
will be accompanied by a rationale based on the EHR data and the preventive care guidelines.
Our long-term goal is to automate the integration of various preventive care guidelines with EHR data to
generate personalized preventive care recommendations, engage more patients in preventive care,
reduce healthcare costs, and improve population health. The innovative NLP methods and deep learning-based
algorithms can also be used to extract information from other narrative guidelines so that they can be analyzed
with the EHR data. We will (1) use a proposed EHR component-based data interchange structure to analyze the
extracted information consistently; (2) extract information from the clinical guidelines automatically; (3) extract
risk factors, such as social behaviors and symptoms, from the structured and unstructured
EHR data using innovative NLP methods; and (4) evaluate the efficiency, accuracy, and usability of the
personalized preventive care system with both healthcare providers and patients. We will utilize the
Indiana Network for Patient Care (INPC), a statewide clinical data warehouse. Our rigorous methods and the
availability of the EHR data make it possible in the future to explore (1) personalized healthcare by considering
risk factors extracted from the EHR, and (2) improved patient engagement in disease prevention and
management by utilizing EHR data.