Development of a metabolomics- and machine learning-based high-throughput screening platform for data-driven drug discovery - Project Summary
High-throughput omics technologies enable comprehensive measurement of diverse biomolecules and have become
dramatically less expensive over the past decade. Coupling these emerging technologies with automation
approaches and the phenotypic-based drug discovery paradigm allows for data-driven drug discovery (D4). D4
focuses on a complete cellular readout, quantitatively measuring hundreds to hundreds of thousands of biomolecules
or cellular features, rather than on a single protein, pathway, or physiological trait. The complexity of this data
requires computational tools for proper analysis and interpretation. In Phase I of this project, we combined the
complementary strengths of experts in LC-MS/MS-based metabolomics (Omix Technologies) and leaders in metabolomics
data analysis (Sinopia Biosciences) to develop a metabolomics-based high-throughput screening platform. We
screened ~250 FDA-approved small molecules from a broad range of drug classes in two cell lines. This dataset
was compared to a matching dataset from the pioneering project for D4, the Connectivity Map, which is a
transcriptomics screening and query platform for drug characterization, discovery, and repositioning. In Phase I,
we observed that from both a technical and biological utility standpoint, the metabolomics data provided an
orthogonal dataset with signal fidelity, sensitivity, and relevance to compound properties comparable to or
exceeding those of the Connectivity Map. Further, we saw high concordance between plasma metabolite changes in
patients with type 2 diabetes or rheumatoid arthritis and the in vitro metabolite changes induced by drugs used
for those indications. These results suggest that a metabolomics-based high-throughput screening platform is not
only a viable complement to the Connectivity Map but can also play a primary role in drug discovery. In this
Phase II proposal, we will focus on profiling chemical and genetic
perturbations in vitro to further demonstrate the power of the platform and identify commercial opportunities for
treating genetically defined rare diseases. We will expand data generation to ~3300 bioactive compounds across
three cell lines. Further, we will profile 50 genetic knockouts in those three cell lines to model the
associated rare diseases in vitro. Using Sinopia’s platform, we will select compounds for follow-up evaluation to identify
candidates that correct the metabolic dysregulation seen in those rare diseases. Successful in vitro programs
will help seed an early-stage discovery pipeline that will be advanced through funding from private
investment, patient advocacy groups, and additional federal grants.
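
The compound-selection step described above can be illustrated with a minimal sketch. It is not Sinopia's actual method, which the summary does not specify. It assumes a Connectivity Map-style approach in which each knockout and each compound is summarized as a signature of per-metabolite log fold changes, and compounds are ranked by how strongly their signature anti-correlates with the knockout signature. The metabolite names, compound names, values, and the function names pearson and rank_reversing_compounds are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A flat signature carries no directional information.
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_reversing_compounds(disease_signature, compound_signatures):
    """Rank compounds from most reversing (most negative correlation)
    to most mimicking (most positive correlation).

    Assumption: a metabolite missing from a compound signature is
    treated as unchanged (log fold change of 0.0).
    """
    metabolites = sorted(disease_signature)
    disease_vec = [disease_signature[m] for m in metabolites]
    scores = []
    for name, signature in compound_signatures.items():
        compound_vec = [signature.get(m, 0.0) for m in metabolites]
        scores.append((name, pearson(disease_vec, compound_vec)))
    return sorted(scores, key=lambda item: item[1])


# Hypothetical log fold changes (knockout vs. control, compound vs. vehicle).
knockout = {"lactate": 1.5, "citrate": -1.0, "glutamine": 0.5}
compounds = {
    "compound_A": {"lactate": -1.5, "citrate": 1.0, "glutamine": -0.5},
    "compound_B": {"lactate": 1.5, "citrate": -1.0, "glutamine": 0.5},
    "compound_C": {"lactate": 0.0, "citrate": 0.0, "glutamine": 0.0},
}

ranked = rank_reversing_compounds(knockout, compounds)
for name, score in ranked:
    print(f"{name}: {score:+.2f}")
```

In this toy example, compound_A exactly opposes the knockout signature and ranks first, while compound_B mimics it and ranks last. A real analysis would also need replicate handling, normalization, and significance testing, none of which are shown here.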