Clarifying the relationship between air pollution and lung and breast cancer with a spatial data science approach - There is growing evidence linking certain components of air pollution (AP), such as particulate matter less than 2.5 microns in diameter and nitrogen dioxide, to breast and lung cancers in the United States (US). Because everyone breathes ambient air, this evidence ties a largely unavoidable environmental exposure to cancer risk. The scale of the problem is also considerable: AP is responsible for approximately 6.5 million deaths per year globally. However, in contrast to the design of most previous research, people do not experience one component of AP in isolation; they are exposed to AP from multiple sources at once and over their life course. To date, there has been no systematic assessment of the effect of the totality of AP exposures on cancer risk in the US. Further development of advanced data science and analytics methods is required to understand how fine-scale AP exposures affect cancer outcomes in a potentially spatially varying manner. The proposed Pathway to Independence Award for Early-Stage Postdoctoral Researchers will provide valuable interdisciplinary training to Dr. Boyle and will position him as a leader in spatial data science for cancer risk, with special emphasis on Big Data sources and environmental exposures. The mentored phase of this proposed research will consist of training in (1) toxicological skills to harmonize AP measurements, (2) the biological and epidemiological background of cancer through coursework, (3) Big Data management for AP database querying and processing, (4) causal inference techniques for estimating AP mitigation policy efficacy, and (5) grant writing and professional development.
The research component will apply the skills gained in training to answer the public health questions motivated above: (Aim 1) Design comprehensive, spatially driven pollutant exposure metrics for census geographies across the US, incorporating information from the Risk-Screening Environmental Indicators and Criteria Air Pollutants databases and aggregating by race/ethnicity/socio-demographic characteristics to identify differences in comprehensive exposure for top candidate locations for mitigation; (Aim 2) Develop novel spatial statistical models to test the associations of these metrics with county-aggregated incidence/mortality rates of breast/lung cancers, in order to identify where AP exposure is most strongly linked with cancer risk; (Aim 3) Create new spatial causal inference models to estimate the effects on cancer rates nationwide of strategies such as reducing certain releases or facility clustering, or increasing targeted cancer screening uptake. Combined, the training received and research performed in this proposal will provide Dr. Boyle with a strong basis for an independent line of research and subsequent R01 proposals focused on environmental exposures, spatial analysis, data science techniques, and mitigation of cancer risk. Importantly, this proposal is backed by an experienced and well-funded team of mentors and scientific advisors, who will support Dr. Boyle in his proposed training and research activities, and by the collaborative research environment at Massey Comprehensive Cancer Center and Virginia Commonwealth University.