Artificial Intelligence Toolkit for Predicting Mixture Toxicity

Chemical safety assessment is typically conducted for individual chemicals. However, industrial chemicals rarely
act in isolation to produce adverse effects, so mixture toxicity assessment represents a complex but more
realistic approach to alleviating environmental chemical safety concerns. There is an exciting and highly
impactful challenge to develop innovative approaches employing modern AI algorithms to provide accurate
toxicity prediction of mixtures from their chemical composition, including the assessment of synergistic effects.
We recently formed Predictive, LLC, to enable the development and distribution of commercial- and
regulatory-strength models to predict important toxicity endpoints. In this Phase I STTR application, we propose to
establish a novel web-based PreMixT (Predictor of Mixture Toxicity) toolkit built on best practices for (i) data collection,
cleaning, harmonization, and integration, (ii) model development using current and emerging AI approaches and
thoughtful strategies of prospective validation of mixture models, and (iii) prediction of specific endpoint toxicities
for both pure chemicals and mixtures. We will achieve this objective by focusing on the following Specific Aims.
Specific Aim 1: Collect, curate, and integrate the largest publicly available mixture toxicity datasets. We
will survey all publicly accessible data on mixture toxicity. Initial datasets will cover acute oral toxicity, acute
inhalation toxicity, acute dermal toxicity, skin sensitization, skin irritation and corrosion, and eye irritation and
corrosion endpoints (collectively known as the "6-pack"), as well as data on pesticides. We will also collect and curate
datasets of untested chemicals and mixtures of environmental concern with known composition, such as High
Production Volume (HPV) chemicals and substances registered in the REACH database. The data will be
(re)structured, harmonized, and prepared for cheminformatics analysis following custom procedures.
Specific Aim 2: Develop AI models of mixture toxicity. Using data prepared in Aim 1, we will develop rigorously
validated models of mixture toxicity for several selected endpoints relevant to environmental health risk
assessment. We will employ two types of mixture-specific descriptors: Simplex Representation of Molecular
Structure (SiRMS) and mixture graph convolution descriptors. Modeling approaches will include both common
(e.g., Random Forest) and innovative Graph Convolutional Network (GCN) approaches.
Specific Aim 3: Develop the PreMixT toolkit and portal supporting the toxicity prediction of chemicals and their mixtures.
We will integrate curated data and validated models into the PreMixT web application. This PreMixT server will
be able to predict mixture toxicity, including possible synergy of mixture components, based on the knowledge
of chemicals found and characterized in the mixture. Successful completion of our Phase I studies will result in
the development of the PreMixT web application as a centralized resource to evaluate mixture toxicity,
including the synergy between mixture components.
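
To make the notion of synergy between mixture components concrete, the following minimal Python sketch compares an observed mixture potency with the value expected if the components simply added up. It assumes concentration addition as the reference model and a two-fold cutoff for flagging synergy. Both are illustrative choices and are not stated to be the PreMixT method. All names (Component, effect_dose, classify_interaction, the threshold parameter) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """One mixture component (illustrative structure, not a PreMixT type)."""
    name: str
    fraction: float     # share of the component in the mixture
    effect_dose: float  # single-chemical potency (e.g., EC50 or LD50), same units for all


def concentration_addition(components):
    """Expected mixture potency under concentration addition:
    1 / D_mix = sum(fraction_i / D_i)."""
    total = sum(c.fraction for c in components)
    if total <= 0:
        raise ValueError("component fractions must sum to a positive value")
    # Normalize fractions so they sum to 1 even if the input is slightly off.
    return 1.0 / sum((c.fraction / total) / c.effect_dose for c in components)


def model_deviation_ratio(expected_dose, observed_dose):
    """Ratio above 1 means the mixture is more potent than the additive expectation."""
    return expected_dose / observed_dose


def classify_interaction(ratio, threshold=2.0):
    """Label a mixture using an assumed (hypothetical) two-fold cutoff."""
    if ratio >= threshold:
        return "synergistic"
    if ratio <= 1.0 / threshold:
        return "antagonistic"
    return "approximately additive"


if __name__ == "__main__":
    mixture = [Component("A", 0.5, 10.0), Component("B", 0.5, 40.0)]
    expected = concentration_addition(mixture)
    ratio = model_deviation_ratio(expected, observed_dose=4.0)
    print(f"expected={expected:.1f} ratio={ratio:.1f} -> {classify_interaction(ratio)}")
```

For a 50:50 mixture of two chemicals with single-chemical values of 10 and 40, the additive expectation is 16. An observed mixture value of 4 is four times more potent than expected and would be flagged as synergistic under the assumed cutoff. A trained model would replace the observed value with a prediction; the comparison against an additive baseline stays the same.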