A Robust, Secure Framework to Effortlessly Bind Distributed Databases and Analysis Tools into Tightly Integrated Translational Drug Discovery Computational Platforms
PROJECT SUMMARY
Collaborative Drug Discovery, Inc. (CDD) proposes to develop Cloud Workspaces for Drug Discovery – a
novel informatics framework that will enable scientists engaged in drug discovery and translation to
effortlessly, robustly, and securely integrate disparate databases and computational tools distributed across multiple
systems and vendors into highly-efficient, custom-tailored computational platforms. Our innovative technology
will solve a critical problem that hinders drug discovery and translation efforts: scientists in this field typically
need to combine chemical and biological data from several sources, run them through multiple software
packages that specialize in different types of analyses and visualization, then ideally store the results of the analyses
together with their underlying experimental data. Today, this type of integration is difficult and expensive to
accomplish and typically fragile, creating a large barrier to (i) exploiting the rapidly increasing number of
high-quality public-access data repositories and (ii) evaluating promising new analytical tools and strategies.
Monolithic platforms offer to solve this problem by bringing everything together under one roof, but they are
extremely expensive and they limit flexibility: no package can offer every capability. The alternative approach –
stringing together discrete resources – evolved during the era of desktop computing and does not translate well
to modern cloud-based workflows and in particular to the challenges of performing computationally intensive
operations that require combining large datasets distributed across remote systems.
Cloud Workspaces (CW) aims to combine the strengths and avoid the weaknesses of these two extremes. CW
will in essence allow users to easily create their own individualized cloud-hosted solutions tailored to their
unique requirements and workflows. Our approach offers the performance, robustness, and ease of use of a
monolithic software solution, but without the associated inflexibility and vendor lock-in. It offers the flexibility
and openness of combining discrete resources, but without the associated integration challenges and fragility,
and it advances the pipelining approach to embrace cloud-based models and to encompass distributed data
resources without compromising performance or security. The Specific Aim for Phase 1 is to implement a
proof-of-concept CW container environment including bidirectional synchronization of biological and chemical data
(transferring only new or modified data) with correct association of chemical identifiers.
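As an illustration only, the bidirectional synchronization described above might be sketched as follows. This summary does not specify a mechanism, so every name here is hypothetical: the Record type, the use of a single canonical identifier string (such as an InChIKey) as the key that associates records across systems, the integer modification stamps, and the rule that the most recent change wins a conflict are all assumptions made for the sketch, not features of the proposed CW design.

```python
from dataclasses import dataclass


@dataclass
class Record:
    compound_id: str  # assumed canonical chemical identifier, e.g. an InChIKey
    data: dict        # biological or chemical data attached to the compound
    modified: int     # assumed logical timestamp of the last change


def changed_since(store, last_sync):
    """Select only records created or modified after the previous sync."""
    return {cid: rec for cid, rec in store.items() if rec.modified > last_sync}


def sync(local, remote, last_sync):
    """Exchange deltas in both directions, keyed by chemical identifier.

    Returns the identifiers pushed to the remote store and pulled from it.
    Conflicts are resolved by taking the newer record (local wins a tie);
    this policy is an assumption of the sketch.
    """
    outgoing = changed_since(local, last_sync)
    incoming = changed_since(remote, last_sync)
    pushed, pulled = [], []
    for cid, rec in outgoing.items():
        other = incoming.get(cid)
        if other is None or rec.modified >= other.modified:
            remote[cid] = rec
            pushed.append(cid)
    for cid, rec in incoming.items():
        other = outgoing.get(cid)
        if other is None or rec.modified > other.modified:
            local[cid] = rec
            pulled.append(cid)
    return sorted(pushed), sorted(pulled)


local = {
    "A": Record("A", {"ic50_nM": 12.0}, 1),
    "B": Record("B", {"ic50_nM": 40.0}, 5),
}
remote = {
    "A": Record("A", {"ic50_nM": 12.0}, 1),
    "C": Record("C", {"logP": 3.1}, 6),
}
pushed, pulled = sync(local, remote, last_sync=2)
```

In this toy run, record "A" is unchanged since the last sync and is never transferred; only "B" and "C" cross between the stores, and both sides end up holding the same three records under the same identifiers.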
CW will enable scientists to exploit the burgeoning array of public data repositories and sophisticated new
computational algorithms and apply them in practice to drug discovery and translation projects. By eliminating
the challenges of installing, configuring, and maintaining software, CW will allow scientists to quickly explore
and adapt new methodologies. Conversely, the technology will support new marketing channels and business
models for software vendors, enabling faster and broader adoption of advanced techniques. The market for the
technology ranges from academics to small and medium-sized companies to large pharmaceutical firms.