The CFDE Workbench - Abstract
The NIH Common Fund (CF) programs have produced transformative datasets, databases,
methods, bioinformatics tools and workflows that are significantly advancing biomedical research
in the United States and worldwide. Currently, CF programs are mostly isolated. However,
integrating data across CF programs has the potential to enable synergistic discoveries. In addition,
since CF programs have a time limit of 10 years, sustainability of the widely used CF digital
resources after the programs expire is critical. To address these challenges, the NIH established
the Common Fund Data Ecosystem (CFDE) program, which was recently approved to continue
into its second phase. For this second phase, this project will establish the Data Resource
Center (DRC) and the Knowledge Center (KC). Our efforts will culminate in the CFDE Workbench,
which will be composed of three main products: the CFDE
information portal, the CFDE data resource portal, and the CFDE knowledge portal. These three
web portals will be full-stack web-based applications with a backend database and will be
integrated into one public site.
The CFDE information portal will be the entry point to the other two portals. It will contain
information about the CFDE on a dedicated About page; information about each participating and
non-participating CF program; information about each data coordination center (DCC); a link to a
catalog of CF datasets; a link to a catalog of CF tools and workflows; and news, events, funding
opportunities, standards and protocols, educational programs and opportunities, social media
feeds, and publications.
The CFDE data resource portal will contain the metadata, data, workflows, and tools that are the
products of the CF programs and their DCCs. We will adopt the C2M2 data model for storing the
metadata that describes DCC datasets. We will also archive
relatively small omics datasets that do not have a home in widely established repositories and do
not require PHI protection. In addition, we will expand the cataloging to CF tools, APIs, and
workflows. Importantly, we will develop a search engine that will index and present results from
all these assembled digital assets. In addition, continuing the work established in the CFDE pilot
phase, users of the data portal will be able to fetch identified datasets through links provided by
the DCCs via the GA4GH Data Repository Service (DRS) protocol. This will include links to raw and processed data.
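To illustrate the kind of access this enables, below is a minimal sketch of how a client of the data
portal might resolve a dataset record through a DRS endpoint; the server URL and object identifier
are hypothetical placeholders rather than actual CFDE or DCC endpoints.

```python
import requests

# Hypothetical DRS server and object ID -- placeholders, not real CFDE/DCC endpoints.
DRS_BASE = "https://drs.example-dcc.org/ga4gh/drs/v1"
OBJECT_ID = "example-dataset-id"

# Fetch the DRS object record, which describes the dataset and how to access it.
response = requests.get(f"{DRS_BASE}/objects/{OBJECT_ID}", timeout=30)
response.raise_for_status()
record = response.json()

print(record.get("name"), record.get("size"), "bytes")

# Each access method points at a concrete location (e.g., an HTTPS or S3 URL)
# from which the raw or processed files can be downloaded.
for method in record.get("access_methods", []):
    url = method.get("access_url", {}).get("url")
    print(method.get("type"), url)
```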
The CFDE knowledge portal will provide access to processed data from CF programs in various
formats including: 1) knowledge graph assertions; 2) gene, drug, metabolite, and other set
libraries; 3) data matrices ready for machine learning and other AI applications; 4) signatures; and
5) bipartite graphs. In addition, the extract, transform, and load (ETL) scripts to process the data
into these formats will be provided. Since such processed data is relatively small, we will archive
it, mint unique IDs for it, and serve it via APIs. In addition, we will
develop workflows that will demonstrate how the processed data can be harmonized. At the same
time, we will document APIs from all CF DCCs and provide example Jupyter Notebooks that
demonstrate how these datasets can be accessed, processed, and combined for integrative
omics analysis. For the knowledge portal we will also develop a library of tools that utilize these
processed datasets. These tools will conform to a set of uniform requirements, enabling a
plug-and-play architecture.
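As one illustration of how such processed data could be consumed, the following is a minimal
sketch that parses a set library and converts it into a binary membership matrix ready for machine
learning; it assumes the library is distributed in the common GMT format, and the file name is a
hypothetical placeholder.

```python
import numpy as np
import pandas as pd

def load_gmt(path):
    """Parse a GMT-style set library: one set per line, tab-separated as
    <set name> <description> <member 1> <member 2> ..."""
    library = {}
    with open(path) as f:
        for line in f:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 3:
                continue
            library[fields[0]] = {g for g in fields[2:] if g}
    return library

def library_to_matrix(library):
    """Convert a set library into a binary membership matrix (members x sets),
    one of the processed-data forms suitable for downstream ML applications."""
    members = sorted(set().union(*library.values()))
    matrix = pd.DataFrame(0, index=members, columns=sorted(library), dtype=np.int8)
    for name, genes in library.items():
        matrix.loc[sorted(genes), name] = 1
    return matrix

# Hypothetical file name used only for illustration.
gene_sets = load_gmt("example_gene_set_library.gmt")
membership = library_to_matrix(gene_sets)
print(membership.shape)
```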
To achieve these goals, we will work collaboratively with the other newly established CFDE
centers, the participating CFDE DCCs, the CFDE NIH team, and relevant external entities and
potential consumers of these three software products. These interactions will be achieved via
face-to-face meetings, virtual working group meetings, one-on-one meetings, Slack, GitHub,
project management software, and e-mail exchange. Via these interactions, we will establish
standards, workstreams, feedback mechanisms, and mini-projects toward accomplishing the goal of
developing a lively and productive Common Fund Data Ecosystem.