The use of nanomaterials, especially engineered nanomaterials (ENMs), in consumer
products and medicine has grown rapidly over the past decade. Various in vitro and in vivo
studies evaluating the potential environmental and health effects of ENMs have generated vast
quantities of experimental data that urgently require curation to support information extraction,
analysis/modeling, and data/model sharing using artificial intelligence methods. Computational
modeling methods, especially machine learning and deep learning approaches, are widely
expected to yield predictive models for ENMs based on the available
property/activity/toxicity data. However, current ENM databases lack nanostructure
annotations that store diverse structural information in the machine-readable formats critical
for computational modeling studies. To address this challenge in the current big data era, we will
develop a large, publicly available ENM portal containing annotated nanostructures for more
than 3,000 ENMs, suitable for computational modeling research that can advance rational
nanomedicine design. The ongoing Nanotechnology Health Implication Research (NHIR)
consortium is providing high-quality ENM data for the portal's initial database and
will also support future data curation. The database will be built on the Virtual
Nanostructure Simulation (VINAS) technique, which converts complex nanostructures
into machine-readable annotations suitable for machine learning modeling. To
this end, we will develop new computational approaches to annotate nanostructures,
especially those of complex ENMs (e.g., graphene derivatives). We will then apply new machine
learning and deep learning algorithms, such as additive models and explainable-AI-guided
semi-supervised deep learning, to develop predictive models from the ENM data in the
curated database as a proof of concept. For example, a virtual nanomaterial projection
approach based on deep learning, specifically explainable-AI-guided semi-supervised
generative adversarial networks, is expected to be well suited to handling the annotated
nanostructures. As the final deliverable, the VINAS database web portal will share the curated
ENM bioactivity/property/toxicity data and annotated nanostructures as downloadable
files for public use, together with the resulting deep learning predictive models.
This study will provide a new public platform for future data-driven nanoinformatics
modeling studies, especially machine-learning-based approaches, and can greatly
advance rational nanomedicine design and other areas of modern nanoinformatics.