HeartShare DeCODE-HF: Data translation center to Combine Omics, Deep phenotyping, and Electronic health records for Heart Failure subtypes and treatment targets
Title: Extraction of SDOH elements from multisite EHR for endorsed CDE mapping and analysis: HeartShare
Project Summary:
Heart failure with preserved ejection fraction (HFpEF) is a highly prevalent and complex disorder that
confers a substantial burden of morbidity and mortality. In contrast with the many evidence-based
therapeutic options available for heart failure with reduced ejection fraction (HFrEF), progress for
disease-modifying therapies for HFpEF has been limited and five-year survival rates following
hospitalization have remained stagnant at approximately 50%. A major barrier in identifying
effective treatments for HFpEF (identified by the NHLBI HFpEF Working Group convened in 2020) is
the “one size fits all” approach to what is a heterogeneous syndrome that comprises many
different subtypes.1 Therefore, the primary goals of HeartShare are to classify HFpEF into
distinct phenotypes, characterize disease mechanisms,
and identify therapeutic targets for each HFpEF subtype. The study includes three overlapping
components: a prospective, observational study of patients with HFpEF and controls that begins
with an intensive in-person assessment; a low-touch longitudinal registry; and a HeartShare EHR
Study. In the EHR study, a multi-center retrospective cohort of patients with heart failure is being
created to better understand the epidemiology and health care patterns of a large, diverse
population of patients with HFpEF across seven health systems.
A supplement focusing on extraction of social determinants of health (SDOH) information, with
mapping to established common data elements (CDE) in HeartShare, accomplishes three objectives:
1) raising awareness of CDE in the Heart Failure advocacy and research communities (which are
highly integrated in the program) to promote collection of high-quality, interoperable data in
future trials and studies; 2) generating high-quality individual-level SDOH data from a large EHR
dataset, which would be expected to contain more diversity in SDOH than many trials and studies,
which often disproportionately enroll individuals of higher socioeconomic status (SES); and
3) integrating individual-level SDOH into numerous HeartShare datasets, facilitating analyses of
the critical role of SDOH as exposures, covariates, and mediators for key Heart Failure outcomes.
Over the past several years, many health systems have begun to systematically collect
individual-level, health-related social risk data; the collection of this data was motivated both by
internal interest in implementing programs to advance health equity and by evolving external
incentives and requirements focused on this type of data collection. Although several groups, such
as the HL7 Gravity Project and PhenX, have sought to develop consensus-driven standards for
health-related social risk data from research participants and patients, these standards have not
been widely adopted by individual health systems or EHR vendors. The result is a wide variety of
data collection methods: many different structured instruments and, sometimes, SDOH data
captured in EHR notes. To generate high-quality research data from these heterogeneous sources,
we will perform a landscape analysis of the individual-level health-related social risk factor data
being collected in any structured form at each of the 7 HeartShare Clinical Centers. We will also
inventory SDOH elements often captured in cardiology, primary care, and social work notes at
Northwestern Medicine using open-source, natural language processing tools.2 Once the
inventories are completed, we will create publicly available algorithms to extract and clean this
data from EHR databases (taking into account the ever-present issues of repeated and often
incomplete data collection inherent in EHR platforms). Then, in partnership with the NCI CDE
Repository (caDSR) and using tools it is already developing, we will map these extracted elements
to endorsed NIH CDE, which may include a subset of the ScHARe CDE. As part of this process, we
will also map SDOH information collected in the HeartShare registry and deep phenotyping studies
(via Eureka) to endorsed CDE. This will give us a unique opportunity to examine the correlation
between SDOH collected via research survey and via EHR in a subset of HeartShare participants.
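
As a minimal sketch of the kind of consolidation step such extraction algorithms might include, the example below reduces repeated and incomplete screening records to one answer per patient and question. The record layout, the question names, and the "most recent non-missing answer" rule are illustrative assumptions only, not the HeartShare method.

```python
from datetime import date

# Hypothetical raw rows: (patient_id, question, answer, date asked).
# None marks a question that was presented but left unanswered.
RAW = [
    ("p1", "food_insecurity", "yes", date(2022, 1, 5)),
    ("p1", "food_insecurity", None, date(2022, 6, 1)),
    ("p1", "housing_instability", "no", date(2022, 1, 5)),
    ("p2", "food_insecurity", "no", date(2021, 3, 9)),
    ("p2", "food_insecurity", "yes", date(2023, 2, 14)),
]


def consolidate(rows):
    """Keep the most recent non-missing answer per (patient, question)."""
    latest = {}
    for patient, question, answer, asked in rows:
        if answer is None:
            continue  # incomplete collection: nothing to keep
        key = (patient, question)
        # Replace the stored answer only if this one is more recent.
        if key not in latest or asked > latest[key][1]:
            latest[key] = (answer, asked)
    return latest


clean = consolidate(RAW)
```

Other consolidation rules (for example, "ever positive" within a look-back window) may suit some analyses better; the rule is isolated in one function so that it could be swapped without changing the rest of a pipeline.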
As part of the parent HeartShare study, each site has been mapping its electronic health record
data to the Observational Medical Outcomes Partnership (OMOP) common data model,3,4
a data model used widely around the world, including by other NIH-funded studies, such as All of Us and
eMERGE.5 The OMOP data model includes key information about all patients with HFpEF, such as
demographics, blood pressure measurements, medications, co-morbidities, diagnostic testing,
procedures, ambulatory encounters, hospitalizations, and deaths within each health system. As
part of continuing HeartShare work, sites are already adding detailed echocardiographic data and
zip codes for linkage to social determinants of health (SDOH), such as area deprivation index.6,7
Once extracted elements of SDOH are mapped to endorsed CDE, Northwestern Medicine and at
least one additional HeartShare site will load them into OMOP, facilitating analyses of the
critical role of SDOH as exposures, covariates, and mediators for key Heart Failure outcomes.
Data from HeartShare will be deposited into BioData Catalyst, and we will work with that team to
ensure these data, including the SDOH, are deposited in an interoperable format, such as FHIR.
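
To illustrate what placing a mapped SDOH element in OMOP could look like, the sketch below builds a dictionary shaped like one row of the OMOP OBSERVATION table. Storing SDOH in OBSERVATION, the helper name to_observation_row, and the identifier "CDE-EXAMPLE-1" are assumptions made for illustration; the concept IDs are left at 0 (OMOP's "no matching concept" value) because the actual standard concepts would have to come from the OMOP vocabularies.

```python
from datetime import date


def to_observation_row(person_id, cde_id, answer, asked_on, concept_id=0):
    """Build a dict shaped like one row of the OMOP OBSERVATION table.

    concept_id defaults to 0 ("no matching concept") as a placeholder;
    a real load would look up a standard concept for the mapped CDE.
    """
    return {
        "person_id": person_id,
        "observation_concept_id": concept_id,
        "observation_date": asked_on.isoformat(),
        "observation_type_concept_id": 0,  # placeholder for record provenance
        "value_as_string": answer,
        "observation_source_value": cde_id,  # hypothetical CDE identifier
    }


row = to_observation_row(1001, "CDE-EXAMPLE-1", "yes", date(2022, 1, 5))
```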
Expected Outcomes:
1) Mapping of SDOH instruments used in HeartShare Registry and Deep Phenotyping (via
Eureka) to existing NIH endorsed CDE in partnership with caDSR
2) Outreach about the importance of endorsed CDE use in trials and observational studies to
the broad Heart Failure advocacy community
3) Inventory of structured SDOH questions posed at clinics throughout 7 major health
systems
4) Inventory of SDOH elements often captured in cardiology and primary care notes at a major
integrated health system using open-source, natural language processing tools
5) Publicly available algorithms to extract, consolidate, and clean SDOH questions and
elements from raw EHR databases across multiple clinical centers (in preparation for CDE
mapping) and load into OMOP
6) Mapping of SDOH elements captured via structured/unstructured EHR data to endorsed
CDE in partnership with caDSR
7) Loading of SDOH (mapped to established CDE) into the existing HeartShare OMOP instances
at 2 HeartShare sites
8) Analysis of the distribution of health-related social risk factors across the different HFpEF
subphenotypes identified by the HeartShare team
9) Deposition of SDOH data from HeartShare into BioData Catalyst in FHIR format