PROJECT SUMMARY
Over 1.5 million US adults are diagnosed with type 2 diabetes (T2D) each year. These newly diagnosed
individuals are at increased risk of developing debilitating complications, including renal disease, strokes, and
myocardial infarctions. However, individuals differ widely in their likelihood of experiencing these adverse
outcomes. Individual risk varies based on a complex interplay among pathophysiology, responsiveness to
treatment, and patient capacity for self-management and sustained lifestyle change. Early glycemic
control in T2D provides a lasting benefit ("metabolic memory"); therefore, strategies that enable the
effective targeting and tailoring of T2D care starting in the initial period after diagnosis may result in
better long-term health outcomes. Implementing individually tailored care strategies requires substantially
more effective risk prediction tools than are currently available. This R21 proposal seeks to apply advanced
predictive modeling methods to a rich source of electronic health record (EHR)-derived clinical data
(assessed at the time of initial diagnosis and after a year of standard management) to define individual patient
risk profiles. These patient risk profiles will incorporate differences in disease physiology (reflected in
factors such as age, BMI, and hemoglobin A1c at diagnosis), treatment responsiveness, and early
self-management results (e.g., medication adherence, weekly exercise levels, weight loss). This R21 leverages an
established, well-characterized cohort of adults with newly diagnosed T2D (n=67,575) within Kaiser
Permanente Northern California. We will apply advanced machine learning-based modeling methods (e.g.,
random forests, LASSO, extreme gradient boosting) to complete the following Aims: 1) Develop and validate a
predictive model using EHR-derived patient data available at T2D diagnosis to identify patients at increased
risk of suboptimal glycemic control over the five years following diagnosis; and 2) Modify the Aim 1 model by
incorporating clinical predictors captured during the first year following T2D diagnosis. We hypothesize that the
unique information available at these two time points (i.e., initial diagnosis and after one year of standard
treatment) can be used to individualize both initial and subsequent early care for patients with newly diagnosed
T2D. If successful, this project's results can be applied to support targeted T2D care strategies tailored to each
individual's risk of suboptimal five-year glycemic control (and later micro- and macrovascular complications)
based on differences in disease physiology, treatment response, and early self-care. This work will form the
foundation for innovative, pragmatic clinical trials that advance our ultimate goal of providing proactive and
effectively tailored early care that results in better long-term health outcomes for adults with T2D.
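
The two-stage design in Aims 1 and 2 can be illustrated with a minimal sketch. It is not the project's actual pipeline: the data are fully synthetic, the predictor names and effect sizes are invented for illustration, and a small pure-Python L1-penalized (LASSO-style) logistic regression stands in for the random forest, LASSO, and gradient boosting methods named above. The sketch fits one model on predictors available at diagnosis, a second model that adds hypothetical year-one predictors, and compares their discrimination (AUC) on held-out data.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def simulate_cohort(n, seed):
    """Synthetic, standardized predictors; all coefficients are made up."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        # Hypothetical predictors available at diagnosis
        age, bmi, a1c = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        # Hypothetical predictors captured during the first year
        adherence, weight_change = rng.gauss(0, 1), rng.gauss(0, 1)
        logit = (-0.5 + 0.8 * a1c + 0.3 * bmi - 0.2 * age
                 - 0.9 * adherence + 0.5 * weight_change)
        # Outcome: 1 = suboptimal glycemic control (simulated)
        y.append(1 if rng.random() < sigmoid(logit) else 0)
        X.append([age, bmi, a1c, adherence, weight_change])
    return X, y


def soft_threshold(w, t):
    """Proximal operator of the L1 penalty: shrink w toward zero by t."""
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0


def fit_l1_logistic(X, y, lam=0.01, lr=0.5, epochs=150):
    """Proximal gradient descent for L1-penalized logistic regression."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(p):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n  # intercept is not penalized
        w = [soft_threshold(wj - lr * gj / n, lr * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


def predict(X, w, b):
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]


def auc(scores, labels):
    """Probability that a random positive outranks a random negative."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def run_demo():
    X_train, y_train = simulate_cohort(500, seed=1)
    X_test, y_test = simulate_cohort(400, seed=2)
    at_diagnosis = lambda X: [row[:3] for row in X]  # age, BMI, A1c only

    # Aim 1 analogue: predictors available at diagnosis
    w1, b1 = fit_l1_logistic(at_diagnosis(X_train), y_train)
    auc_diagnosis = auc(predict(at_diagnosis(X_test), w1, b1), y_test)

    # Aim 2 analogue: add predictors from the first year after diagnosis
    w2, b2 = fit_l1_logistic(X_train, y_train)
    auc_year_one = auc(predict(X_test, w2, b2), y_test)
    return auc_diagnosis, auc_year_one


auc_diagnosis, auc_year_one = run_demo()
print(f"AUC, diagnosis-only model: {auc_diagnosis:.3f}")
print(f"AUC, diagnosis + year-one model: {auc_year_one:.3f}")
```

Because the simulated outcome depends on the year-one predictors by construction, any gain in AUC here only demonstrates the comparison itself; it says nothing about the effect sizes expected in the actual cohort.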