Project Summary
There is an unmet need in medicine and basic sciences for accurate atomic structures of proteins. This need
surpasses the capabilities of traditional high-resolution experimental methods. With machine learning advances,
structure prediction algorithms are poised to provide atomic models for these areas in the near future. Yet, the
gaps in prediction algorithms limit accuracy and reliability, particularly for large multi-domain proteins, protein
complexes, and flexible proteins. Our proposal, Towards Accurate protein structure Predictions with SAXS
TechnologY (TAPESTRY), will create technology to increase reliability and improve accuracy of protein
structure predictions through experimental validation, particularly for difficult proteins.
TAPESTRY is innovative in combining our strengths in high-throughput synchrotron SAXS (Small Angle
X-ray Scattering) data collection and analysis with the Critical Assessment of protein Structure
Prediction (CASP), which every two years assesses structure predictions against “gold standard” crystal
structures that have not yet been released. Through CASP, we take advantage of the collective protein folding knowledge
of the global community of structure prediction scientists.
Our approach is strategic. We provide SAXS data for CASP, giving prediction scientists access to
experimental data. We develop analytical and experimental tools designed to help prediction scientists
overcome the current gaps that limit the use of SAXS data. We test these tools against our TAPESTRY databases
of standard proteins, with corresponding crystal structures, SAXS data, and predicted models. Finally, we
evaluate the robustness of our technology through CASP and obtain an unbiased assessment of our tools
and the state of the field. As a first step, we target well-folded proteins (Aim 1) and proteins with disordered tails
(Aim 2) in this proposal.
The feasibility of our technology proposal is supported by our current data and proofs of concept, our
beamline capabilities, and proven experience in SAXS analysis. We show that experimental SAXS data,
which contain distance information that can act as restraints in protein structure prediction algorithms, match
crystal structures of well-folded proteins and can score predictions based on topological accuracy. We show cases
in CASP13 (2018) in which SAXS data improved the fold of predicted models. SAXS data collection is rapid (10
seconds), does not require labeling or crystallization, and is available at no cost to the scientific community. We
have proven experience in developing informative and effective SAXS analytical tools.
Our long-term goal is to enable biomedical researchers to input an amino acid sequence and rapidly obtain
one or more experimentally validated, accurate atomic models that reflect the protein conformation(s) in solution. If
TAPESTRY is successful, the increased availability of such atomic models will have strong and broad potential
to advance biomedical research and impact all areas of biology in which proteins are involved.