Overall: Our project combines the significant advantages of a genetic model organism, sophisticated pathway
mapping tools, high-throughput and accurate quantum mechanical (QM) calculations, and state-of-the-art
experimental measurements. The result will be an efficient and cost-effective approach for identifying unknown
compounds in metabolomics, a task whose difficulty is one of the major limitations facing this growing field of
medical science.
Caenorhabditis elegans has several advantages for this study, including over 10,000 available genetic
mutants, well-developed CRISPR/Cas9 technology, and a panel of over 500 wild C. elegans isolates with
complete genomes. Half of C. elegans genes have homologs to human disease genes, making this model
organism an outstanding choice to improve our understanding of metabolic pathways in human disease. We
will develop an automated pipeline for sample preparation to reproducibly measure tens of thousands of
unknown features by UHPLC-MS/MS. We will use the wild isolates to conduct metabolome-wide genetic
association studies (m-GWAS) and will use SEM-path to locate unknowns in pathways using partial correlations. The
relevance of the unknown metabolites to specific pathways will be tested by measuring UHPLC-MS/MS data
from genetic mutants of those pathways. Molecular formula and pathway information will be the inputs for
automated quantum mechanical calculations of all possible structures, which will be used to accurately
calculate NMR chemical shifts for matching against experimental data. The best-matching structures will be
validated by comparing them with 2D NMR data of the same compound. The validated computed structures
will then be used to improve QM-based MS/MS fragment prediction, using the experimental UHPLC-MS/MS data.
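
The role of partial correlations in placing unknowns within pathways can be illustrated with a minimal sketch. It is not the SEM-path implementation; it only shows a first-order partial correlation, and the function names (`pearson`, `partial_corr`) and the three intensity vectors are hypothetical, synthetic values chosen for illustration. Two features that both track a shared precursor look strongly correlated, yet their association vanishes once the precursor is conditioned on, which suggests they are not directly linked.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / sqrt(var_a * var_b)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy = pearson(x, y)
    r_xz = pearson(x, z)
    r_yz = pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Synthetic intensities across eight hypothetical isolates (not real data).
# Both unknown features follow the precursor plus independent variation.
precursor = [1, 2, 3, 4, 5, 6, 7, 8]
unknown_a = [2, 1, 2, 5, 6, 5, 6, 9]
unknown_b = [2, 1, 4, 3, 4, 7, 6, 9]

raw = pearson(unknown_a, unknown_b)
conditioned = partial_corr(unknown_a, unknown_b, precursor)
print("raw correlation:", raw)
print("partial correlation given precursor:", conditioned)
```

With these synthetic values the raw correlation is about 0.84, while the partial correlation given the precursor is numerically zero, so the two features would be placed downstream of the precursor rather than connected to each other.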
This project will enhance many areas of science beyond worms and model organisms. First, C. elegans is the
simplest animal model available with significant homology to other animals and humans. The discoveries we
make in metabolic pathways will have a direct impact on studies of several human diseases. Second, our
approach is highly transferable to other genetic systems and, with little modification, can be used in many
other settings. Perhaps most important is the relevance to large-scale human precision medicine studies.
The wild C. elegans isolates are “individuals” with diverse genomes that are a model for natural populations
such as humans. It is true that we are using mutant animals that would not be available in a human precision
medicine study, but the mutants are used primarily to validate pathways that are constructed entirely from wild
isolate data. Once the approaches are fully developed and validated, the mutants will not be necessary. C.
elegans and other genetic model organisms were instrumental in the development of modern genomics and
DNA sequencing technologies. Our premise is that the worm will have a comparable impact in metabolomics.