The main goal of the Knowledge Management Center (KMC) for the Illuminating the Druggable Genome (IDG)
program is to aggregate, update and articulate protein-centric data, information and knowledge for the entire
human proteome, with emphasis on understudied proteins from the three families that are the focus of the IDG
(“IDG List”). The long-term objective of the KMC is to encourage and support biomedical research aimed at
understudied proteins by providing an extensive resource of data, information, knowledge, methods and
reagents for the entire human proteome, and to support the growing online community focused on
understudied proteins. While focusing on the IDG List and human proteins, the KMC will also expand
coverage to non-human proteins of therapeutic interest and to associated human health data, in
order to catalyze novel biomedical discoveries. To support the overall IDG objective, and to maintain, update
and improve these integrated resources, the KMC draws upon expertise from multiple knowledge domains,
specifically biology, chemistry and medicine, as well as computer science, graphic design and web
programming. For Phase 2 of the IDG KMC, we propose four Aims. Aim 1: Create an automated
workflow that captures relevant public data for the entire proteome and manual annotations for the IDG List.
The KMC knowledge management system will be built around knowledge graphs focused on five major
branches of the target knowledge tree (TKT): Genotype, Phenotype, Expression, Structure & Function, and
Interactions & Pathways. Aim 2: Design, develop and implement a protein knowledgebase with
data analytics support. Our protein-centric biomedical knowledgebase, TCKB (Target Central
Knowledgebase), will comprise the data, knowledge and information container, together with its
codebase and software pipelines. TCKB will be the repository for experimental, processed and computed data
and reagents originating from the IDG DRGCs (Data and Resource Generation Centers). We will provide
informatics and modeling support for DRGC activities. Aim 3: Expand, improve and maintain Pharos,
in particular its “knowledge packages”; support automated data summaries for Protein Dossiers; and actively seek
feedback from our community. Aim 4: Outreach to the scientific community. We will support a series of activities
that will leverage TCKB, Pharos and other IDG resources to increase adoption of IDG work, while observing
FAIR (findable, accessible, interoperable, reusable) principles for our knowledgebase, portal and pipelines.
The KMC will engage in community outreach by leading tutorials and feedback sessions and by disseminating
the Pharos system. To meet its goals, the KMC will carry out all core activities in close coordination with the
IDG Steering Committee and IDG Project Scientists (PS), and will include members of the IDG Consortium
(IDG-C), other NIH Common Fund programs, the NIH Commons, and other initiatives.