PROJECT SUMMARY
Bioconductor is a project dedicated to the analysis and interpretation of high throughput genomic data, including
sequencing, microarray, flow cytometry, proteomics, and imaging data. Bioconductor is based on the R statistical
programming language. It consists of software, annotation, and data packages developed and contributed
by individuals funded by this grant, and by the national and international scientific community. Bioconductor is
highly respected, widely used in the global bioinformatics community, highly cited, and has formal collaborative
alignments with the Human Cell Atlas and NHGRI's Genomic Analysis, Visualization, and Informatics Lab Space
(AnVIL). Work proposed in this renewal application reflects the commitment of the project to open-source/open-development
creation and distribution of portable tools for genomic data science, high-quality documentation and
support for users and developers, adaptation of computational methods to new technologies for cloud-scale data
science, and effective training of the workforce for genome biology and personalized genomic medicine. The specific
aims are (1) maintenance and enhancement of the system at bioconductor.org for organizing and distributing
analytic software, reference data, and curated experimental data, (2) hardening of core infrastructural software
packages to increase reliability and throughput of analyses based on the system, (3) research and development
of best practices for taking advantage of scalable computing strategies for integrative cloud-scale genomic
analysis, and (4) enhancement of community engagement and education practices that have been intrinsic to the project
since its inception. By pursuing these aims, project investigators and contributors add to the usability, relevance,
and robustness of a system and community that is unique and is uniquely situated to accelerate progress in many
areas of genomic data science, ultimately contributing to biological knowledge and improvement of human health.