Title: Development of a new computational method for predicting drug-target interactions using a
TSR-based representation of 3-D structures
Project Summary:
Protein and drug 3-D structures play a pivotal role in drug design and discovery. At the same time, it is
very challenging to extract meaningful structural information and convert it to knowledge. In the last forty years,
since the development of the first automated structural method, approximately 200 papers have been
published using different representations of structures. Each has its own unique features and limitations. Our project
adds to the existing knowledge base with a new TSR (Triangular Spatial Relationship)-based representation of
protein 3-D structures using Cα atoms. Triangles are constructed with the Cα atoms of a protein as vertices.
Every triangle is represented by an integer, which we call a "key". A key is computed from the
length, angle, and vertex labels of a triangle using a rule-based formula, which ensures that the same key is assigned to
identical TSRs across proteins. Because each key is constructed from three residues, these keys are considered
inter-residue keys. Our results clearly demonstrate successful clustering of proteins that matches their
functional classifications in most cases and successful identification of known and new structural motifs.
Although we have been successful using Cα atoms, two observations motivated us to develop intra-residue keys
that represent the structures of side chains. First, when we studied the catalytic triad of serine proteases,
we found a single key that represents two different triads of chymotrypsin; however, only one of them is the
true triad when the interactions between the side chains are considered. Second, drugs often
interact closely with the side chains of proteins. Thus, the overall objective of this proposal is to develop
an effective method for representing 3-D structures of proteins and drugs that is customized for the study of
drug-protein interactions. The proposed approaches to representing protein and drug structures and to predicting
drug-protein interactions are innovative. We have made our computational tools available to the scientific
community and will continue to do so. Our central hypothesis is that complex 3-D structures can be divided into
a set of triangles, the simplest primitives that capture shape. Each triangle is converted to an integer that
uniquely captures its essential characteristics. Consequently, a 3-D structure can be represented by a
multiset of integers (a bag of keys). The rationale for this proposal derives from the results of our studies that
used inter-residue keys to obtain a TSR-based representation of protein structures. The method built on
this TSR idea has important advantages over existing methods. Five specific Aims will be pursued:
development of a TSR-based key representation of amino acids and a corresponding representation mechanism
for drugs, integration of inter- and intra-residue keys for identifying drug-binding sites, prediction of drug-target
interactions, and integration of computational results with experimental data. The proposed research will
have a significant impact on the comparison of protein 3-D structures and on accelerating drug
development in the pharmaceutical industry.
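
The idea of turning each triangle into an integer key and a whole structure into a bag of keys can be illustrated with a minimal sketch. The formula below is not the rule-based formula used in this project, which the summary does not specify; it is a hypothetical stand-in that combines sorted vertex labels, the largest interior angle, and the longest edge so that the key does not depend on vertex order. The bin counts, the 70 Å length cap, the function names, and the Jaccard comparison of two bags are all assumptions made for illustration, and the three points of a triangle are assumed to be distinct.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical labeling: each amino acid type gets an integer label 0..19.
AMINO_ACIDS = ["ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
               "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL"]
LABEL = {name: i for i, name in enumerate(AMINO_ACIDS)}

# Assumed discretization parameters (not taken from the source).
ANGLE_BINS = 30
LENGTH_BINS = 35
MAX_LENGTH = 70.0  # angstroms


def tsr_key(residues):
    """Return an integer key for three (name, (x, y, z)) residues.

    Illustrative formula only: it uses order-independent quantities so that
    the same triangle yields the same key however its vertices are listed.
    """
    l1, l2, l3 = sorted(LABEL[name] for name, _ in residues)
    points = [xyz for _, xyz in residues]
    # Edge lengths in ascending order; c is the longest edge.
    a, b, c = sorted(math.dist(p, q) for p, q in combinations(points, 2))
    # Largest interior angle (opposite the longest edge), by the law of cosines.
    cos_theta = (a * a + b * b - c * c) / (2 * a * b)
    theta = math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))
    angle_bin = min(int(theta / 180.0 * ANGLE_BINS), ANGLE_BINS - 1)
    length_bin = min(int(c / MAX_LENGTH * LENGTH_BINS), LENGTH_BINS - 1)
    # Mixed-radix packing: distinct (labels, angle bin, length bin) give distinct keys.
    label_code = (l1 * 20 + l2) * 20 + l3
    return (label_code * ANGLE_BINS + angle_bin) * LENGTH_BINS + length_bin


def bag_of_keys(structure):
    """Multiset of keys over all residue triples of a structure."""
    return Counter(tsr_key(triple) for triple in combinations(structure, 3))


def jaccard(bag_a, bag_b):
    """Generalized Jaccard similarity between two multisets of keys."""
    union = sum((bag_a | bag_b).values())
    return sum((bag_a & bag_b).values()) / union if union else 0.0


if __name__ == "__main__":
    # Toy coordinates, not taken from any real protein structure.
    toy = [("HIS", (0.0, 0.0, 0.0)), ("ASP", (3.8, 0.0, 0.0)),
           ("SER", (0.0, 5.0, 0.0)), ("GLY", (6.0, 6.0, 1.0))]
    bag = bag_of_keys(toy)
    print(len(bag), jaccard(bag, bag))
```

In this sketch, two structures that share many triangles with the same labels and similar geometry share many keys, so a multiset similarity such as the Jaccard coefficient gives one simple way to compare them. Enumerating all residue triples grows cubically with protein size, so a practical implementation would need a more careful design than shown here.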