The Protein Data Bank (PDB) contains more than 100,000 3D structures of proteins, many of which are directly
relevant to human health and disease. Up to 10% of these structures contain carbohydrates as ligands or as
post-translational modifications. While numerous tools exist to curate protein 3D structural data, no comparable
tools for carbohydrates have been adopted by the PDB as part of the validation checks performed upon coordinate deposition. This
oversight has resulted in a large number of errors and inconsistencies in annotation and structure in the
carbohydrate structural data. Here we will work with the Worldwide PDB (wwPDB) to develop and implement
tools to address these issues as part of a broader carbohydrate remediation initiative at the PDB.
At present, two serious problems hinder the use of carbohydrate data stored in the PDB:
1) There is an unacceptably high proportion of errors in the deposited coordinates.
2) No convenient interface exists for searching for carbohydrate structures in the PDB.
We will develop a software tool called “GlyProbity” for checking the accuracy and internal consistency of 3D
structures of carbohydrates, and then apply this tool to the data remediation. In addition, GlyProbity will
be provided as a stand-alone interface that may be used by crystallographers to validate carbohydrate
structures prior to deposition in the PDB and by other researchers to validate structures obtained in any
manner. Lastly, we will create a search interface, “GlyFinder”, to be implemented at GLYCAM-Web, that will
greatly simplify the task of locating relevant carbohydrate-containing structures. Taken together, these aims
should significantly impact the development of glycomimetic therapeutics, as well as the generation of
structure/function relationships in glycobiology, and will be essential for achieving interoperability with
additional databases or data mining services in the future.