Machine Learning Based Differential Mobility Spectrometry Library Development - Project Summary

The goal of the proposed project is to develop a gas chromatography and differential mobility spectrometry (GC/DMS) molecular identification library for volatile organic compounds (VOCs) using a deep neural network approach. Vox Biomedical scientists will test the hypothesis that a novel multi-task neural network architecture can predict characteristics of a previously unseen analyte from its GC/DMS spectrum. Vox Biomedical is commercializing the GC/DMS-based microAnalyzer instrument, developed at DRAPER, for detecting psychoactive drugs and disease through exhaled breath analysis. While drug detection consists of measuring the exhaled-breath concentrations of compounds whose identity is well known (such as psychoactive opioids and cannabinoids), exhaled breath disease detection focuses on characterizing a particular disease’s exhaled VOC signature. VOCs are byproducts of cellular metabolism that travel from cells throughout the body to the lungs, where they are efficiently exhaled in the breath. VOCs have become of interest as biomarkers of metabolic diseases such as cancer, kidney disease, and diabetes. The current generation of exhaled breath VOC-based disease detection methods relies on gas chromatography and mass spectrometry (GC/MS), which, while highly sensitive, is a complex analytical modality that is expensive, slow, and must be operated by skilled professionals. The microAnalyzer instrument’s inherent portability, ease of use, and ability to obtain results at the point of measurement make it an ideal instrument for breath-based disease detection. However, a GC/DMS peak can currently be identified only through characterization of a chemical standard or by performing confirmatory GC/MS analysis on a similar sample.
This makes biomarker discovery a resource- and time-intensive process. The creation of a VOC chemical identity library, as would result from successful completion of the proposed project, will allow the identity of compounds in samples introduced to the microAnalyzer instrument to be predicted without the need for confirmatory standard characterization or GC/MS work. This will make biomarker discovery for disease a less resource-intensive process, expediting the discovery and confirmation of biomarkers for early-stage disease detection and ultimately saving lives.