Creating an initial ethics framework for biomedical data modeling by mapping and exploring key decision points - Project Summary
Data modeling in biomedical data science is relevant to a wide range of informatics research activities, such as
natural language processing, machine learning, artificial intelligence, and predictive analytics. As Electronic
Health Record systems become more advanced and more mature, with the potential to incorporate a wide and
diverse array of data from genomics to mobile health (mHealth) applications, the scope and nature of the
biomedical data science questions researchers ask become broader. Concomitantly, the answers to their
questions have the potential to affect the care of millions of patients, so getting the answers right, proactively, is
a high-stakes endeavor. However, data modeling currently has no bioethics framework to guide the process of
mapping key decision points and recording the rationale for choices made. Making data modeling decision
points, as well as the reasoning behind them, explicit would improve biomedical data science in two ways:
(1) enhancing transparency and reproducibility and maximizing the value of data science research, and
(2) supporting the ability to assess decision points and rationales in terms of their most crucial
ethical ramifications. Research in this area is particularly timely amid the interest in, and enthusiasm for,
leveraging Big Data sources in the service of improving patient population health and the health of the general
public. The National Institutes of Health (NIH) recently released a strategic plan for data science; there is no
better time than now to create an initial bioethical framework to inform common data modeling decision points.
The improvements in data quality that will derive from decision point mapping and bioethical review will
enhance efforts to apply data models across a range of high-impact areas, from predictive analytics that support
clinical decision-making to robust trending models in population health that better inform local, regional, and
national health policies and resource allocation. To develop this initial bioethics framework, we will use well-
established qualitative research methods (interviews, focus groups, and in-person deliberation) to map the
decision points in biomedical data modeling research and document the rationales invoked to support those
decisions (Aim 1: key informant interviews); assess those data science decision points and decision-making
rationales for their bioethical ramifications (Aim 2: focus groups); and create an initial bioethics data modeling
framework (Aim 3: deliberative meeting). This study would be the first to provide a bioethics framework to fill
a critical gap in biomedical data modeling activities, where the downstream consequences of developing data
models without careful and comprehensive review of ethical issues can be severe. This approach directly
supports core scientific values of inclusivity, transparency, accountability, and reproducibility that, in turn, foster
trust in biomedical data modeling output and potential applications, whether local, national, or global.