Sunday, May 11, 2025

Improving missing data analysis in distributed research networks

Award Number: R01HS026214
ORGANIZATION: AGENCY FOR HEALTH CARE RESEARCH AND QUALITY
OPDIV: AHRQ
AWARD CLASS: DISCRETIONARY
AWARD ACTIVITY TYPE: SCIENTIFIC/HEALTH RESEARCH (INCLUDES SURVEYS)

ABSTRACT
Electronic health record (EHR) databases collect data that reflect routine clinical care. These databases are increasingly used in comparative effectiveness research, patient-centered outcomes research, quality improvement assessment, and public health surveillance to generate actionable evidence that improves patient care. It is often necessary to analyze multiple databases that cover large and diverse populations to improve the statistical power of the study or the generalizability of the findings. A common approach to analyzing multiple databases is the use of a distributed research network (DRN) architecture, in which data remain under the physical control of data partners.
Although EHRs are generally thought to contain rich clinical information, the information is not uniformly collected. Certain information is available only for some patients, and only at some time points for a given patient. There are generally two types of missing information in EHRs. The first is the conventionally understood and obvious missing data, in which some data fields (e.g., body mass index) are incomplete for various reasons, e.g., the clinician does not collect the information or the patient chooses not to provide it. The second is less obvious because the data field is not empty, but the recorded value may be incorrect due to missing data. For example, EHRs generally do not have complete data for care that occurs in a different delivery system. A medical condition (e.g., asthma) may be coded as “no” when the true value would have been “yes” had more complete data been available, for example from claims data, because the other delivery system would submit a claim to the patient’s health plan for the care provided. In other words, one may incorrectly treat “absence of evidence” as “evidence of absence”.
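The two types of missing information described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the patient identifiers, the `bmi` and `asthma` fields, the `claims_asthma` set, and the `classify` helper are hypothetical and are not taken from the award or from any DRN data model. It only shows how an empty field differs from a non-empty field whose value is wrong because care was delivered elsewhere.

```python
# Hypothetical EHR extract; field names and values are invented for illustration.
ehr = {
    "p1": {"bmi": 27.4, "asthma": "yes"},
    "p2": {"bmi": None, "asthma": "no"},  # type 1: the BMI field is empty
    "p3": {"bmi": 31.0, "asthma": "no"},  # type 2: "no" recorded, but care occurred elsewhere
}

# Hypothetical set of patients with an asthma claim from another delivery system.
claims_asthma = {"p3"}


def classify(patient_id, record, claims_with_condition):
    """Return a list of missing-data issues found for one patient record."""
    issues = []
    # Obvious missingness: the field itself has no value.
    if record["bmi"] is None:
        issues.append("obvious: bmi not recorded")
    # Less obvious missingness: the field is filled in, but an outside
    # source contradicts it, so "no" reflects absence of evidence only.
    if record["asthma"] == "no" and patient_id in claims_with_condition:
        issues.append("hidden: asthma coded 'no' but claims show 'yes'")
    return issues


report = {pid: classify(pid, rec, claims_asthma) for pid, rec in ehr.items()}

for pid, issues in report.items():
    print(pid, issues or ["no issue detected"])
```

In this toy setting the second type is detectable only because linked claims happen to be available. When no such linkage exists, the recorded “no” cannot be distinguished from a true negative by inspecting the field alone.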
EHRs hold great promise, but we must address several outstanding methodological challenges inherent in the databases, specifically missing data. Addressing missing data is more challenging in DRNs because missing data mechanisms differ across databases. The specific aims of the study are: (1) Apply and assess missing data methods developed in single-database settings to handle obvious and well-recognized missing data in DRNs; (2) Apply and assess machine learning and predictive modeling techniques to address less obvious and under-recognized missing data for select variables in DRNs; and (3) Apply and assess a comprehensive analytic approach that combines conventional missing data methods and machine learning techniques to address missing data in DRNs. The analytic methods developed in this project, including the extension of existing missing data methods to DRNs, the innovative use of machine learning techniques to address missing data, and their integration with privacy-protecting analytic methods, will have a direct impact on the design and analysis of future comparative effectiveness and safety studies, and patient-centered outcomes research conducted in DRNs.


Issue Date FY	Funding FY	Legal Entity Name	Legal Entity Address	Legal Entity City	Legal Entity State	Legal Entity Zip Code	Legal Entity County	Legal Entity Country	Assistance Listing	Award Code	Budget Year	Action Date	Action Type	Action Amount

Issue Date FY: 2023 ( Subtotal = -$15,974 )
2023	2020	HARVARD PILGRIM HEALTH CARE INC	1 WELLNESS WAY	CANTON	MA	02021	NORFOLK	USA	Research on Healthcare Costs, Quality and Outcomes	000	3	1/26/2023	NON-COMPETING CONTINUATION	-$15,974
														Subtotal = -$15,974

Issue Date FY: 2020 ( Subtotal = $398,881 )
2020	2020	HARVARD PILGRIM HEALTH CARE INC	93 WORCESTER ST	WELLESLEY	MA	02481	NORFOLK	USA	Research on Healthcare Costs, Quality and Outcomes	000	3	9/9/2020	NON-COMPETING CONTINUATION	$398,881
														Subtotal = $398,881

Issue Date FY: 2019 ( Subtotal = $399,886 )
2019	2019	HARVARD PILGRIM HEALTH CARE INC	93 WORCESTER ST	WELLESLEY	MA	02481	NORFOLK	USA	Research on Healthcare Costs, Quality and Outcomes	001	2	9/4/2019	NON-COMPETING CONTINUATION	$399,886
2019	2018	HARVARD PILGRIM HEALTH CARE INC	93 WORCESTER ST	WELLESLEY	MA	02481	NORFOLK	USA	Research on Healthcare Costs, Quality and Outcomes	000	1	10/26/2018	NEW	$0
														Subtotal = $399,886

Issue Date FY: 2018 ( Subtotal = $400,000 )
2018	2018	HARVARD PILGRIM HEALTH CARE INC	93 WORCESTER ST	WELLESLEY	MA	02481	NORFOLK	USA	Research on Healthcare Costs, Quality and Outcomes	000	1	9/12/2018	NEW	$400,000
														Subtotal = $400,000

Grand Total All Awards = $1,182,793
