Race/Ethnicity-Specific Algorithms of Chronic Stress Exposures for Preterm Birth Risk: Machine Learning Approach - Racial/ethnic disparities in preterm birth (PTB) are persistent in the U.S., with a higher prevalence of PTB in non-Hispanic (N-H) Black women than in their N-H White counterparts. However, the mechanisms underlying these Black-White differences are not well understood. Even extensive biomedical, behavioral, and sociodemographic risk factors can explain only about half of PTB incidence. Chronic stress has received significant attention as a robust predictor of PTB, particularly among racial/ethnic minority groups. Nevertheless, the literature shows inconsistent evidence on the relationships among race/ethnicity, chronic stress, and PTB, mainly because of the complexities involved in assessing women’s chronic stress exposures. Accurate chronic stress measures should capture the nature of stressors, which are cumulative, interactive, and population-specific. In this regard, conventional statistical models (e.g., linear regression) have limited ability to model chronic stress exposures with high precision. Thus, this study will adopt machine learning (ML), a state-of-the-art modeling technique, to compute non-linear and synergistic relationships among chronic stressors, detect unknown patterns, and reflect subtle differences in chronic stressors between N-H White and N-H Black women for more accurate prediction of their PTB risk. I will develop simple, accurate, and explainable ML algorithms of chronic stress exposures by building a hybrid algorithm specific to N-H White and N-H Black women and computing SHAP (SHapley Additive exPlanations) values. Specifically, the hybrid algorithm will combine Multivariate Adaptive Regression Splines (MARS) and Deep Neural Network (DNN) algorithms: MARS will select only the “important” chronic stressor variables for each race/ethnicity, and these variables will serve as the DNN’s input features for PTB risk prediction.
Additionally, a SHAP value for each chronic stressor in the final algorithm will quantify its degree of contribution to the predicted PTB risk. The ML algorithms will be trained and tested on a large national database, the Pregnancy Risk Assessment Monitoring System (2012-2017), collected by 37 U.S. states. The study’s specific aims are to 1) compare the accuracy of logistic regression (LR) and two ML algorithms (DNN and hybrid) of chronic stress exposures in predicting PTB risk, using the area under the receiver operating characteristic curve (AUC); 2) compare the accuracy of race/ethnicity-combined and race/ethnicity-specific models within the LR, DNN, and hybrid algorithms; and 3) determine the importance of each chronic stressor to the predicted PTB risk in the best-performing algorithm, using regression coefficients (for LR) or SHAP values (for an ML algorithm). Career development goals are to 1) develop expertise in stress measurement in the context of maternal and child health, 2) acquire knowledge and skills in ML and the analysis of large-scale data, and 3) cultivate health informatics-focused manuscript and grant preparation skills for independence. Results from this study will contribute to preventing PTB among vulnerable pregnant women via early screening with more accurate, data-informed tools to assess these patients’ chronic stress.
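
The two evaluation quantities named in the aims, AUC and SHAP values, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the risk function `toy_risk`, its three binary stressors, their weights, and the five example cases are hypothetical and do not come from PRAMS or from any fitted model. The sketch computes exact Shapley values by enumerating feature coalitions against a baseline input, which is feasible only for a handful of features; it is not the approximation used by the SHAP library, and `toy_risk` stands in for a trained LR, DNN, or hybrid model.

```python
from itertools import combinations
from math import factorial


def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def shapley_values(model, x, baseline):
    """Exact Shapley values for input x; features outside a coalition take baseline values."""
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_risk(z):
    """Hypothetical risk score with an interaction between two stressors (illustrative only)."""
    financial, partner, trauma = z
    return 0.05 + 0.10 * financial + 0.05 * partner + 0.20 * financial * partner


# Hypothetical cases: (financial, partner, trauma) stressor indicators and PTB labels.
cases = [[1, 1, 1], [0, 0, 0], [1, 0, 0], [1, 0, 1], [0, 1, 0]]
labels = [1, 0, 1, 0, 0]
scores = [toy_risk(c) for c in cases]

print("AUC:", auc(labels, scores))
print("Shapley values:", shapley_values(toy_risk, [1, 1, 1], [0, 0, 0]))
```

In this toy setting the Shapley values sum to the difference between the score for the explained case and the score for the baseline case, and the interaction term is split evenly between the two interacting stressors. A stressor the model ignores receives a value of zero.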