PROJECT SUMMARY/ABSTRACT
The availability of high-throughput, low-cost sequencing has transformed the landscape of biomedical research
by dramatically expanding our capacity to interrogate the sequence of the human genome. Consequently, there
has been an explosion of biomedical literature describing the role of specific genomic variants and their impact
on human diseases. These advances are bringing sequencing into the clinic, where a patient’s genomic content
can inform clinical practice, a paradigm colloquially referred to as genomic or precision medicine. There remain
many obstacles to fully realizing our potential in the era of precision medicine. Among them is a recognized
need for robust, well-engineered systems that provide knowledge about genomic variants and their role in
disease. Ideally, such systems would provide a comprehensive summary of all knowledge that is relevant to the
patient’s unique genomic content.
An early bottleneck to realizing precision medicine was that, despite the substantial literature and several
established knowledgebases that define interactions between drugs and genes, querying across them was
extremely challenging. In response to this need, the Drug-Gene Interaction database (DGIdb, dgidb.org) was
developed. Through a combination of automated processing and manual curation, drug-gene interaction
information was collected, structured, and connected (normalized) from these diverse sources of data and
entered into a database with a user-friendly search interface and an application programming interface (API).
However, linking drug and drug-gene interaction concepts across resources remains difficult, and aggregated
drug-gene interactions are hard to represent in a way that highlights the utility of the collected knowledge
for precision medicine efforts. This proposal seeks to improve our ability to normalize and interpret
drug-gene interactions corresponding to patient genomic variants.
We will achieve this goal through two specific aims. First, the DGIdb normalization routines will be improved
through incorporation of new content and features. Among these, the DGIdb will support collections of drugs,
including combination therapies and drug classes. Also, the DGIdb will have new community submission and
curation features, allowing users to incorporate new knowledge into the database. Second, the Variant
Interpretation Aggregator database (VIAdb) will be created to normalize knowledge across several disparate
sources focused on the clinical interpretations of genomic variants. The VIAdb will operate as a stand-alone web
tool and API and will serve as a source of relevant interpretations for the DGIdb. Finally, we will develop techniques
for automated identification of drug-gene interactions and variant interpretation consensus to assist community
curation efforts. If successful, this research will improve the breadth and consistency of variant interpretations and
drug-gene interactions for precision medicine efforts.