Project abstract
X-ray crystallography and cryoEM single particle reconstruction (cryoEM SPR) generate uniquely detailed
structural information that is used to: (1) understand cellular processes at the molecular level, (2) explain and
validate results obtained by other biochemical, biophysical and cell biology methods, and (3) guide drug design
studies. All these applications are highly relevant to the NIH mission. The proposal aims to advance data analysis
methods for X-ray and cryoEM diffraction and cryoEM SPR so that reliable and informative structural models
can be obtained from micro- and nanocrystals with both techniques as well as from single molecules (particles)
with electron microscopy.
The PI aims to extend X-ray crystallography and cryoEM SPR methods to new areas through the highly hierarchical
application of data mining and dimensionality reduction methods. The data richness generated by recent
changes in hardware enables deep exploration of much more elaborate, non-random start algorithms that have
better convergence than the random start methods that are frequently used in computational approaches.
In diffraction methods, one frequently needs to combine data from multiple crystals for successful structure
solution. However, optimal averaging should only consider data that represent the same structural source of
diffraction patterns, so there is a fundamental need to segregate individual samples into distinct groups that are
internally isomorphous. In traditional approaches, complex non-isomorphism patterns result in combinatorial
complexity of data analysis in the presence of incompleteness and low signal-to-noise for individually contributing
datasets. The PI will develop methods addressing this long-standing unsolved problem; these methods also have the
potential to advance the analysis of biologically relevant structural variability that manifests as
non-isomorphism in experimental data. In cryoEM diffraction, data analysis does not yet produce reliable structural
results consistently, de novo structure solution is limited to a small number of projects where direct methods can
be used, and for small molecules, determination of absolute configuration remains a challenge. The PI will
develop and implement experimental and computational solutions to advance modeling of systematic effects
encountered in electron diffraction and to expand phasing approaches in electron crystallography to address
these outstanding problems. The PI will also work on developing estimators of bias magnitude and debiasing
procedures to expand cryoEM SPR so that much smaller particles can be modelled reliably. Finally, the PI will
develop approaches relying on comparative genomics so that structural models can be built and validated at
very low resolutions that are currently beyond the reach of molecular interpretation. All research will rely on
the strong expertise of the PI in selected areas.