Graphical Processing Units and a Large-Memory Compute Node for Applications in Genomics, Neuroscience, and Structural Biology - Project Summary
Cold Spring Harbor Laboratory (CSHL) is a private, not-for-profit institution dedicated to research and
education in biology, with leading research programs in genomics, neuroscience, quantitative biology, plant
biology, and cancer. Many activities at CSHL depend critically on high-performance computing resources, but at
present, investigators have limited access to Graphics Processing Units (GPUs) and large-memory compute
nodes. This deficiency is beginning to hamper a wide variety of biomedical research activities, particularly in the
key areas of genomics, neuroscience, and structural biology, where such specialty hardware is becoming
essential for many important computational analyses. Here, we propose to acquire four state-of-the-art GPU
nodes, each equipped with eight Nvidia Tesla V100 SXM2 32 GB GPUs, two 20-core 2.5 GHz Intel Xeon Gold
6248 (Cascade Lake) processors, and 768 GB of RAM. Second-generation Nvidia NVLink will provide 300
GB/s of inter-GPU communication bandwidth. In addition, we propose to acquire one large-memory node with 3 TB of RAM
and four 20-core 2.5 GHz Intel Xeon Gold 6248 (Cascade Lake) processors, as well as a top-of-rack 10 Gb
Ethernet switch to interconnect the servers with each other and with our existing computer cluster. These new
resources will enable a wide variety of innovative research across fields, with direct implications for human
health. In genomics, applications will include RNA-seq read mapping; alignment, base-calling, and genome
assembly for long-read sequence data; clustering of single-cell RNA-seq data; analysis of transposable
elements; deep-learning methods for prediction of the fitness consequences of mutations; and deep-learning
methods for interpreting high-throughput mutagenesis experiments. In neuroscience, they will include analysis
of multi-neuron activity recordings; analysis of mouse brain images; and artificial neural network models of the
human olfactory system, of audio features, and of behavior as a function of changing motivations. In structural
biology, they will include image processing and 3D reconstruction from cryo-electron microscopy data. These
new compute nodes will primarily benefit the research programs of nine major users from the CSHL
faculty with substantial NIH funding, and will also benefit three minor users. The new GPU and large-memory
nodes will be fully integrated with a soon-to-be-upgraded high-performance computer cluster and managed by
the experienced Information Technology group at CSHL, with oversight from a committee of seven faculty
members and two IT staff members. Altogether, these new computational resources will substantially enhance
the overall computational infrastructure at CSHL.