Project Summary/Abstract
We seek supplemental support to the core operating funding for the Reactome
Knowledgebase of human biological pathways and processes. Reactome is a curated
knowledgebase available online as an open access resource that can be freely used and
redistributed by all members of the biological and biomedical research communities. It is used
by clinicians, genomics researchers, and molecular biologists to interpret the results of high-
throughput experimental studies, by bioinformaticians seeking to develop novel algorithms for
mining knowledge from genomic studies, and by systems biologists building predictive models
of normal and disease variant pathways. Our curators, Ph.D.-level scientists, work closely with
independent investigators within the community to assemble machine-readable descriptions of
human biological pathways. Pathways are checked and peer-reviewed prior to publication to
ensure their factual accuracy and compliance with the data model. A system of evidence tracking
ensures that the primary literature supports all assertions. Reactome uses community-standard
controlled vocabularies and ontologies to increase interoperability across resources. Pathways
are reviewed and updated regularly. Reactome pathways are available on our website for
browsing and downloading, and are accessible to in-house and third-party analysis tools. The
project is highly cited in the literature, has been used repeatedly to make significant biological
and clinical discoveries, and is incorporated into many high-impact informatics tools and
resources.
Over the past two decades, Reactome has developed a sophisticated software
ecosystem. This ecosystem comprises various components for curation, quality
assurance/quality control, data analysis and visualization, release, and export. Our primary
focus in the parent Reactome grant is to upgrade the old GWT-based web application to a
modern Angular-based app, port the standalone Java-based curator tool to the web, and
migrate our internal curator database from MySQL to Neo4j. With the backing of this
supplemental grant, we aim to introduce modern Continuous Integration/Continuous
Deployment (CI/CD) technologies into our software development process. This will allow us to
update existing components more efficiently and integrate new ones seamlessly. We also plan
to adopt LinkML, a modern tool for editing knowledge graph schemas, to manage our data
model. LinkML will replace the outdated multi-step data model update protocol, which relies on Perl
and Protégé and requires tedious manual editing across multiple components. Together, these
changes will make our software development more streamlined and efficient.