Project Summary / Abstract
Despite the high prevalence of diabetic retinopathy (DR), the recommended annual ophthalmic exam for diabetic
patients has a very low compliance rate, only around 43%. Many patients do not seek proper medical attention
because DR is asymptomatic in the early stage, and thus they miss the most effective period to halt DR
progression and prevent vision loss. Moreover, ophthalmic equipment for DR exams is predominantly limited to
urban areas, restricting access by patients in rural communities with limited incomes. All of these issues create
an urgent need for cost-effective, widely available approaches that enable early detection of DR.
Our long-term goal is to develop a non-image-based, artificial intelligence (AI) tool for primary care physicians to
assess patients' risk for DR using comorbidity data and routine lab results, which are widely available. It will help
physicians recommend ophthalmic exams and individual screening frequency for at-risk patients confidently.
The accuracy of our approach is close to that of fundus-image-based DR detection tools, and it is much easier to
use and more cost-effective. Preliminary studies demonstrated the feasibility of detecting DR with 90% accuracy.
Our approach promises to increase the compliance rate of the recommended ophthalmic exams among
asymptomatic patients, break the barrier to ubiquitous diabetic eye care in rural communities, and save thousands
of people from blindness. If successful, our approach has the potential to transform future DR care from reactive
to proactive. It will identify the causative and clinically modifiable factors of DR. This will lead to a proactive DR
prevention and management tool to reduce avoidable DR and defray healthcare costs.
As the next step in pursuing our long-term goal, we will develop predictive models for DR and extract training
data from Cerner Health Facts, a comprehensive, relational database of real-world, de-identified, HIPAA-
compliant patient data. However, similar to other electronic-health-record (EHR) databases, its quality suffers
from missing values as well as imbalanced and unlabeled data. In addition, although EHR data are multi-dimensional,
technical challenges mean they are often examined only as two-view features (either longitudinal or cross-sectional).
Thus, the higher-order statistics (correlation information) are not well utilized in healthcare analytics.
Tensor information is important for optimizing medical decision making and provides a unique angle to address the
problems of missing, imbalanced, or unlabeled data. The progression of a disease or the outcome of treatment
depends not only on the patient's current health conditions but also on his or her medical history. To realize the full
potential of EHR data, this project will study novel imputation, augmentation, classification, and machine learning
techniques by simultaneously handling the longitudinal information. The methodology developed from this study
will help improve the quality of EHR data and the accuracy of the predictive models for a wide range of diseases.
Contact PD/PI: Liu, Tieming
Narratives
Although diabetic retinopathy (DR) is the leading cause of blindness among American adults,
many diabetic patients do not comply with the recommended ophthalmic exams because DR is
asymptomatic in the early stages, and thus patients miss the most effective period to halt DR
progression and prevent vision loss. To improve the compliance rate of the recommended
ophthalmic exams and detect DR early, our long-term goal is to develop a cost-effective,
non-image-based, artificial intelligence (AI) tool for primary care physicians to assess patients’ risk for
DR using routine lab results and to confidently recommend ophthalmic exams and personalized screening
frequency for at-risk patients. As the next step in pursuing this goal, this project aims
to develop advanced machine learning algorithms to realize the full potential of electronic-health-
record (EHR) data by harnessing tensor information to improve the quality of EHR data and
prediction accuracy.