PROJECT SUMMARY
Cancer sequencing projects have identified a very large number of DNA mutations whose importance in cancer
is not yet understood. To better understand the impact of these mutations, our team has produced a software
tool for computational analysis of cancer mutations that can process millions of mutations at once. This tool
works as a funnel, helping researchers find the few mutations most likely to be informative among the very large
number of mutations discovered in a sequencing project. The software allows users to design
ways to combine multiple mutation evaluation metrics and to generate a prioritized list of mutations that are more
likely to be biologically important. These evaluation metrics include the molecular consequence, bioinformatic
scores to identify pathogenic and driver mutations, frequency of the mutation in human populations, previous
occurrence in tumor tissue types, pointers to literature, and visualization of annotated protein structures and
networks. A web-based version of the pipeline, the Cancer Related Analysis of Variants Toolkit (CRAVAT), has been
widely adopted (more than 3,000 jobs submitted per month on average in 2020). We have attracted a user community that
spans both basic and clinical cancer researchers, all of whom rely on high-throughput tumor sequencing in their
work. In 2019, we introduced OpenCRAVAT, which is distinguished by an open-source codebase and an open
app store of tools and resources that can be used to better understand the importance and impact of mutations.
The app store is driven by the user community; new apps are prioritized based on user requests, and the app
store includes many apps that were contributed directly by outside tool developers. The app store currently
aggregates tools from over 70 organizations, and these tools can be combined to identify mutations whose
molecular impact contributes to tumorigenesis, prognosis, and treatment selection. Initial adoption of our
OpenCRAVAT tool is encouraging, with over 10,000 local package downloads in the first two years. We expect
that OpenCRAVAT will be adopted by a much larger community, given the increasing importance of DNA
sequencing data in cancer research. We will continue to ensure that our tools are interoperable with other
informatics tools and services and can be run in different computational environments, such as cloud computing
platforms or local installations that maintain data privacy.