PROJECT SUMMARY/ABSTRACT
UniProt is one of the foundational resources for understanding the molecular details of the inner life of the cell.
As a hub for knowledge of protein sequence and function, UniProt organizes knowledge from biomedical
literature, integrates datasets with curated knowledge, and serves as an exemplar FAIR and TRUST resource
reused by hundreds of other data resources. The Common Fund (CF) program brings together diverse
collections of valuable biomedical data, yet these data still need to be connected to the wider ecosystem of
major data repositories to realize their full potential for discovery and innovation. In this application, we will strengthen the
protein-centric connectivity of UniProt to multiple CF Data Coordinating Centers (DCCs) and the Data Resource
Center (DRC) in the Common Fund Data Ecosystem (CFDE), and will collaborate with two CF DCCs, the
Knockout Mouse Phenotyping Program (KOMP2) and the Cell Maps for AI (CM4AI) Project of the Bridge to
Artificial Intelligence (Bridge2AI) Program. The collaboration will support CFDE data integration and reuse, while
building new capabilities to interrogate and understand complex biological systems and cell networks for a deeper
understanding of responses to perturbations at the systems level. To foster biomedical discovery through the
reuse of CF data and enable novel scientific research that was not previously possible, we will increase the connectivity
and data integration between UniProt and CF datasets. We will create a new form-based mechanism to link with
CF data resources, starting with several already identified for targeted integration. We will extend the UniProt
mapping service, allowing researchers to identify data of interest through protein-centric search and promoting
interoperability at the protein level across the CF ecosystem. We will develop new tools, APIs, and workflows
that are easy to access and navigate, enabling researchers to interrogate and understand complex biological systems and
cell networks. These will allow seamless bidirectional navigation of data, from whole subcellular systems to functional
pathways and interacting proteins and vice versa. We will hold annual workshops to broaden collaborations with
the CFDE and engagement with the research community. We will incorporate user feedback throughout the tool
development lifecycle and implement an integration plan with the CFDE for dissemination and long-term
sustainability. We aim to understand systems-level responses to perturbations through genotype-phenotype
mapping and metabolic and signaling network discovery. We will integrate relevant datasets, tools, and literature
and extend our knowledge graph learning algorithm for link prediction in a drug discovery use case. We will build
customized knowledge graphs for a range of use scenarios, incorporating scientific questions from the
demonstration project and from user engagement. This UniProt-CF partnership will support functional genomics
toward a fundamental understanding of biological systems and the development of new diagnostics and therapies.