Summary
With the striking success of genome sequencing in the past four decades, biological and biomedical data are
accumulating exponentially. Efficiently curating and analyzing these data in a manner that maximizes their value
and accessibility is critical for realizing the scientific advances and social benefits genome sequencing will
enable. Yet databases of this scale have become increasingly difficult to process using on-hand database
management tools and traditional processing applications, creating a continuing demand for innovative
approaches. Investigators at the University of Michigan (UM) have recently developed a number of biomedical
methods and databases to support studies of protein folding and drug discovery, gene mutation and human disease
diagnosis, cardiovascular disease and surgical treatment, and complex human disease and health-driven
medicine. Some of these methods have been recognized as the world’s best and are widely used in the biological and
medical communities. However, limited computing resources critically constrain the scope and scale of
these studies, as well as their broad application. We propose to acquire a new
hybrid high-performance computing (HPC) cluster with multiple CPU and GPU nodes to serve the computational
needs of a group of 24 UM biomedical research laboratories. Because these biomedical studies
involve large-scale and dynamic genomic databases, the simulation work often has special requirements for
memory, storage and backup, input/output (I/O) configuration, network speed, and internet connectivity, which make it
difficult, and sometimes infeasible, to run on university-wide and public computing resources. To
address these issues, new database and library allocation strategies are being developed and integrated with
high-speed NVMe flash drives and EDR InfiniBand networks to improve performance for data-intensive
biomedical studies. The new HPC cluster will reside in the Michigan Academic Computing Center (MACC),
in association with the campus-wide Great Lakes cluster system, and will be managed by the professional team of
Advanced Research Computing Technology Services (ARC-TS), which has over 20 years of operational
experience supporting HPC environments. Substantial infrastructure and technical support will also be provided
by ARC-TS and the Department of Computational Medicine and Bioinformatics (DCM&B), which will
significantly reduce startup costs and improve the operational efficiency of the proposed HPC system. Overall,
the acquisition of this equipment will immediately benefit 39 independent NIH-funded research projects led by
the HPC User team, as well as 31 other NIH projects in which the Users participate. All bioinformatics and
biomedical software tools and databases developed on this infrastructure will be made publicly and freely
available to academic individuals and institutions. In this manner, the impact of the proposed computing resource
will be multiplied through dissemination of these products to the broader biomedical
communities outside the University of Michigan.