Flexible Hybrid Cloud Infrastructure for Seamless Integration and Use of Human Biomolecular Data and Reference Maps [1 of 5]

The Human BioMolecular Atlas Program (HuBMAP) is redefining our understanding of the human
body by recovering multi-scale tissue organization -- anatomical, histological, and molecular -- at
unprecedented resolution, through computational integration of diverse experimental
measurements. The HuBMAP Integration, Visualization & Engagement (HIVE) Collaboratory is
an effort among interdisciplinary components developing pipelines for data ingestion and
processing, enabling visualization of datasets spanning dozens of biomolecular assays on the
HuBMAP portal, leading the development of a human common coordinate framework (CCF),
constructing molecularly and spatially resolved reference maps of human tissues, developing
mapping frameworks for the interpretation of new datasets, and coordinating extensive
collaborative activities both within HuBMAP and with the broader community. In the production
phase of HuBMAP, the HIVE will construct a Human Reference Atlas (HRA), establishing the
HuBMAP Portal as the “go-to” resource for human tissue reference maps and multimodal single-cell data. The next iteration of the HIVE will coalesce the HuBMAP Consortium around a joint
vision, develop cutting-edge and scalable tools to achieve it, and ensure its open dissemination
to partners and users across the wider international community.
As the HIVE Infrastructure Component (IC), the Pittsburgh Supercomputing Center (PSC), the
University of Pittsburgh (Pitt), and Stanford University will provide infrastructure, based on our
flexible hybrid cloud microservices architecture, along with community engagement, that will
support delivery of this vision in the production phase. To accomplish this, we will focus our efforts
in the following key areas: 1) Curation and Ingestion: Increased automation of data ingestion from
HuBMAP data providers, community partners, and the general research community to maximize
efficiency and usefulness for building the HRA; 2) Integration: Automated integration and mapping
of ingested data to the HRA based on data standards; 3) Findability and Accessibility:
Exposure of backend resources through a modular architecture of APIs, containers, services,
and documentation that minimize user friction in integrated searching, querying,
analyzing/aligning, and viewing of tissue maps at multiple spatial scales and across multiple layers
of information; 4) Interoperability: Extension of the HuBMAP Knowledge Graph to translate
HuBMAP data, HRA assets, and community data among one another via ontologies; 5) Analysis:
Infrastructure support to maximally enable users with scalable analyses and workflows among
both HuBMAP and user-contributed data and tools, including integration and mapping against the
HRA; and 6) Sustainability: Sustainment of open tools, data, and infrastructure for reuse beyond
the production phase. We will grow and harden our model for collaboration, coordination, and
engagement led by the IC, with substantial leadership from all HIVE members and participation
from all HuBMAP members.