PROJECT SUMMARY: Disparities in the health care system are substantial, leading to worse health outcomes
and quality of care for marginalized groups. These disparities reflect that our current health system has an
inequitable equilibrium. Embedded within health care data are societal biases, including racism and barriers in
access to care for individuals from low socioeconomic backgrounds and rural areas. However, many
algorithmic approaches are inadequate for addressing health disparities because the algorithms do not
evaluate or optimize performance in these groups. Existing tools to ameliorate differential performance for
multiple marginalized groups in realistic health care settings are extremely limited. Our innovative approach to
the data and algorithmic bias problems in health disparities is to create a first-of-its-kind overarching
algorithmic fairness framework for multiple marginalized groups. In the first stage, we will focus on data
transformations: intervening on the data to ‘de-bias’ it so that it represents a desired equilibrium rather than
reinforcing the unfair equilibrium. In the second stage, we will build novel fair regression estimators that enforce fairness
constraints for prediction. Our goal is to create reusable tools that advance the equitable provision of health
care. We will accomplish this by developing generalizable methodology that follows an ethical pipeline for
algorithms guided by a social determinants of health framework. Our specific aims are to: (1) develop and test
novel data transformation methods that rely on microsimulations for de-biasing health care data, (2) develop
and test new fair penalized regression approaches optimized for multiple groups, (3) test the performance of
the new algorithmic framework for a high-impact primary care application in chronic kidney disease, prioritizing
fairness for multiple racial and ethnic groups facing health disparities, and (4) create open-source
computational tools, tutorial vignettes, and a synthetic data resource for reproducible research and
dissemination. The proposed research will yield a statistically innovative, reusable algorithmic fairness
framework that unifies data transformations and fair regression to reduce health disparities, with robust testing in a
chronic kidney disease study of quality of care. This primary care application will leverage rich registry data,
including measurements of social determinants of health, collected in usual care settings from a
geographically, racially, and ethnically diverse population across multiple payers. Our approach centers on
robustness through rigorous methodological design, including comparisons with alternative existing estimators and
standard practice in comprehensive simulation studies and national, real-world registry data. Addressing health
disparities in primary care, which serves as a hub of continuous and coordinated care, has the potential to substantially
improve public health via the health care system. The broad applicability of our framework and the creation of
reusable computational tools will facilitate deployment in many practical settings.