Cctbx.xfel: Serial crystallography computational methods aimed at biomolecular function

Project Summary / Abstract

Under present R01 funding, the Sauter group has focused on using serial crystallography to discover the structure-function relationships of large biomolecules. Specializing in data analysis technology, we have enabled new science at X-ray free-electron laser (XFEL) light sources, where the X-ray pulse structure permits time resolution down to picoseconds, at normal physiological temperature where the full range of available molecular conformations can be revealed, and with essentially no radiation damage, which is ideal for studying metalloenzymes in their functional redox states. Our software CCTBX.XFEL provides a workflow to process datasets of up to 100 terabytes on supercomputing platforms, applying algorithms customized for the “still shots” of serial crystallography, as distinct from the rotating-crystal geometry used at synchrotron sources. Given the enormous cost of XFEL crystallography in labor and competitive beamtime allocation, the 10-minute turnaround time of our pipeline significantly mitigates the risks, allowing the experimental team to adjust data collection parameters and reprioritize samples. We have collaborated on the development of new instrumentation to observe enzymatic reaction progress triggered by laser pump, mix-and-inject, or gas incubation. We have also implemented new serial crystallography modalities, such as X-ray emission spectroscopy to monitor catalytic metal sites and chemical crystallography to determine small-molecule structures. Under the proposed R35, the goal is to obtain molecular models in greater detail than present results provide.
There is potential for improvement because today’s algorithms still inherit assumptions from traditional crystallography (such as a monochromatic beam); we plan to introduce a new Bayesian inference model that accounts for every detail of the diffraction pattern down to the pixel level. We will also develop the related capability of using the protein crystal as an X-ray spectrometer, thus revealing detail about the electronic environment of catalytic metals (using X-rays tuned to the metal absorption edge), a measurement that has not yet been achieved at ambient temperatures or in the time domain. Our goals also include observing diffuse scatter (diffraction intensity between the Bragg spots of the lattice), which reflects correlated motions within and between protein molecules, and probing macromolecular flexibility through infrared-beam temperature-jump experiments. Finally, we hope to explore new computational directions for high-throughput interpretation of cryo-electron tomography (cryoET) data. The unifying theme between serial crystallography and cryoET is the desire to learn the biological role of structural variability. The X-ray data processing improvements will allow us to model the small changes that contribute to a reaction mechanism, e.g., an amino acid sidechain rotation, a change in water occupancy, or even the displacement of a single electron. High-throughput cryoET will sample the variability and heterogeneity of cellular structures, giving a spatiotemporal understanding of living systems and how they respond to genes, regulation, and environment.