PROJECT SUMMARY
Interpreting large transcriptomic and proteomic datasets is challenging, particularly for researchers
lacking computational training. Enrichment analysis links gene-level changes to biological pathways,
thereby helping to unravel the complexity of biological systems. Even though dozens of tools are available,
bench scientists still face a critical barrier to conducting enrichment analysis interactively, rigorously,
and reproducibly. To complement existing tools, we developed a prototype for ShinyGO, a web-based
tool for overrepresentation analysis using lists of genes or proteins.
ShinyGO supports gene ontology (GO) enrichment analysis for over 14,000 organisms, making it
versatile and widely applicable. Its database also encompasses KEGG, Reactome, protein-protein
interaction (PPI) networks, transcription factor target genes, and microRNA target genes, among others.
With its dynamic visualization capabilities, ShinyGO depicts enrichment results and gene-level
characteristics such as gene length and genomic distribution, enabling a multi-faceted investigation of gene
regulation. Due to its user-friendly design, robust functionalities, and regularly updated databases,
ShinyGO has become an indispensable resource for many researchers, as evidenced by more than a
thousand citations in the last three years.
As a side project, however, ShinyGO has not been properly developed, debugged, or documented. Its
extensive database was gathered and maintained with NIH funding for another bioinformatics tool, iDEP;
that funding expires in 2024. Our goal for this project is to fully develop ShinyGO, enhance its functionality,
improve reproducibility, and broaden the database. Aim 1 focuses on refactoring ShinyGO into a modular
framework and making it available as an R package. We will add a host of new features, including the
analysis of multiple gene sets, the generation of R code that can reproduce analyses, provenance
monitoring, knowledge graph-based visualization, additional analytical functions, and a ChatGPT-based
chatbot for user support. Aim 2 mainly involves testing, documentation, and user outreach. Aim 3 is
dedicated to updating and expanding the massive database, which will be made openly accessible.
This project will transform the ShinyGO prototype into a rigorous, robust, and widely applicable tool. It
will enhance our capacity to interpret 'omics' data accurately and effortlessly, further facilitating
discoveries in various biological and medical research fields. This work will have a substantial impact
on thousands of researchers, especially those with limited resources.