ABSTRACT
The UCSC Genome Browser and associated tools are used by hundreds of thousands of biomedical
researchers including clinical geneticists, bioinformaticians, researchers working with model organisms, and
wet lab scientists researching human physiology at the molecular level in both healthy and disease states. The
browser integrates the results of thousands of biomedical labs (including a wide range of biochemical assays,
genetic studies, curations, sequencing projects, and computer analyses) into a series of tracks aligned to the
underlying genomic sequence. The genome provides a natural integration framework for these diverse data
sources, which the browser showcases at a variety of display scales ranging from the single base to individual
genes, entire chromosomes, and ultimately to the genome as a whole.
The Genome Browser is implemented using robust, fast, high-quality software capable of handling over one
million hits per day. This web software provides a window into an exceptionally detailed and well-documented
database that can be queried computationally as well as browsed graphically. The database is loaded with a
suite of programs, developed both at UCSC and elsewhere, capable of distilling huge genomics data sets into
high-quality annotations of the genome. Significant engineering effort is invested to ensure the quality of the
software and data sets, including those developed by external contributors. The system is designed to make it
easy for users to view their own unpublished data sets alongside those that we have fully curated and
integrated. Consortia and other resources can make their data visible in our browser via “track hubs.”
We plan to extend our resource in significant ways. We will help make genomics more equitable for currently
underserved populations by moving to a more inclusive “pangenome” reference that includes sequences that
represent the greater genomic diversity of humanity, not just samples of convenience from largely European
populations. We will enable visualization of individual genomes, not just a single haploid reference genome.
We will address the opportunities and challenges of new technologies such as single-cell RNA sequencing and
single-molecule long-read DNA sequencing. We will collaborate with others in the increasingly complex
ecosystem of biomedical consortia and resources, integrating their results into the Genome Browser. Through
our APIs and our helpful staff, we will also ensure that others can make the best use of the available data in
their own efforts. We will provide tools and data for medical users to understand the significance of sequence
variants in the patients they care for and will help characterize regions of greater genomic complexity and
medical importance. We will extend our outreach effort to include more online content to help engage a new
generation of users.