Machine learning approaches to personalized cigarette smoking risk prediction in adults trying to quit

PROJECT SUMMARY

Cigarette smoking is the leading preventable cause of death and claims more than 480,000 lives in the US each year. Roughly 20 million adults in the US want to quit smoking tobacco. At least 7 million people access digital quitting resources (e.g., Smokefree.gov) each year. Although evidence supports many tobacco treatment interventions, absolute quit rates remain low, and relapse or continued smoking is the most common outcome of quit attempts, even with treatment. Personalized treatment is a promising strategy that may enhance the effectiveness of smoking therapies and deliver the right support for a particular person at a particular time in the quitting process. Sophisticated, high-dimensional data analytic approaches are needed to account for the complexity and intersectionality of influences on smoking lapses and relapses during quit attempts, and to develop robust and effective personalized risk predictions and interventions. Identifying the dynamic patterns and key features of individuals, events, and contexts that predict smoking during quit attempts could inform treatment delivery in multiple ways (e.g., clinical decision support tools for treatment teams, smartphone quitting apps). This K23 career development proposal aims to enhance Dr. Kaye’s capacity to pursue the important public health goal of improving smoking cessation treatment personalization by training him to apply powerful new machine learning methods to this effort. Specifically, he will apply machine learning methods to rich existing datasets from diverse samples of adults trying to quit smoking.
These trials combine extensive assessment of individual differences (e.g., demographics, smoking history, comorbidities) with ecological momentary assessment of time-varying risk factors (e.g., craving, stress, anhedonia) and protective factors (e.g., quitting self-efficacy, medication adherence). The use of data from multiple clinical trials offering varying treatments (e.g., counseling, medications, smartphone apps) creates opportunities to assess the robustness of smoking risk prediction across treatment contexts. This project will conduct secondary data analysis of 4 smoking cessation clinical trials (total N=3023), using machine learning methods to develop, train, and validate: 1) a long-term smoking relapse prediction model based on baseline (pre-treatment) data, and 2) a smoking lapse risk prediction model based on baseline data and dynamic time-varying risk signals. Dr. Kaye will receive training and mentoring in tobacco treatment (Training Goal 1: Drs. McCarthy, Baker), feature engineering to generate predictors (Goal 2: Drs. McCarthy, Curtin, Bolt, Loh), multiple machine learning methods (Goal 3: Drs. Curtin, Bolt, Loh), and digital just-in-time adaptive interventions for substance use (Goal 4: Drs. Businelle, Curtin). The K23 will generate new knowledge about how to predict near- and long-term abstinence outcomes and inform hypotheses about when, how, and for whom to deliver personalized interventions to people trying to quit smoking. The K23 will also launch Dr. Kaye’s program of research to develop open-source, publicly available risk prediction models that can be incorporated in future treatments.