PROJECT SUMMARY
This project will develop the Natural Products Magnetic Resonance Database (NP-MRD), the central repository
for all NMR data generated by the natural products community. The core of the NP-MRD will be an open-access,
web-enabled, community-focused, FAIR-compliant database containing NMR spectra and structures for all
known natural products (estimated to be ~350,000 structures). It will contain (i) legacy (curator-backfilled) NMR
data of NPs derived from the literature, existing public databases, and “private” data archives, (ii) new NMR data
submitted by depositors for novel NPs, and (iii) heuristically calculated NMR chemical shifts for all NPs and,
eventually, density functional theory (DFT) calculated chemical shifts for all NPs. Data deposition will be both rapid
(<5 minutes) and simple. The NP-MRD will be closely integrated (through data exchange agreements) with
“sister” databases containing MS, biosynthetic gene cluster, and bioactivity data. It will provide rigorous validation
and data checking (QA/QC) to ensure that submitted assignment data is of the highest quality. Validation and
analytical summary reports will be provided following data deposition. The NP-MRD will also offer powerful
database search, filtering, and querying tools to facilitate spectral, structural, and taxonomic searches or
selections. In addition to data storage, retrieval, and curation, the NP-MRD will host an extensive suite of software
tools for NP research. These will include tools for spectral dereplication, structure validation, and NMR-based
profiling of complex mixtures. The NP-MRD will also provide tools for spectral and structural visualization and
comparison, as well as chemical-space network visualization and chemo-taxonomic comparisons. Additionally,
the NP-MRD will offer tools for NMR spectral prediction and simulation. A core principle of the NP-MRD is that
the availability of high-quality, value-added reporting and interactive tools will encourage user engagement and data
deposition. Software produced and hosted by the NP-MRD will be open-source and open-access. To enhance
interoperability, portions of the NP-MRD and associated software will be “dockerized” on cloud computing
resources and, if needed, converted to web-based APIs. The NP-MRD’s deposition tools will be designed to
work with participating journals’ paper-submission software. The NP-MRD team will engage thought leaders,
journal editors, and database managers to develop consensus protocols regarding data deposition, data format
standards, and data exchange—all compliant with community-established policies and standards. Our proposed
work has already gained traction, as shown by many strong letters of support from key stakeholders. The NP-MRD
will offer online and on-site (at conferences) training on the use of its software for NP dereplication and
identification. Resources covering tips, tricks, and techniques in NMR spectroscopy and NMR software will be
made available on the NP-MRD website and through web-based “office hours” to help users, depositors, and
the general NP community. NP-MRD’s long-term goal is to advance and enable natural products research by
archiving and adding real value to natural product NMR data.