To unravel the complexity of biological systems, researchers have traditionally studied reductive model
systems like cultured cells or simple molecular simulations. While these reductive model systems can be
cheaper, easier, and/or more ethical to manipulate, findings in them may not translate to the biological systems
of primary interest. This is especially important for drug discovery, as late-stage failures result in enormous
costs and long development timelines. Excitingly, recent advances in biotechnology and computing have made
more complex model systems—including 3D organoids and large-scale virtual screening—more tractable.
However, an emerging challenge is that standard statistical methods developed to analyze simple model
systems are insufficient to analyze these more complex model systems. Complex model systems are
inherently heterogeneous. The key statistical challenge is to leverage the higher-dimensional readouts afforded
by the new technologies to identify the causal mechanisms relevant for translation. Done properly, improved
statistical analysis can unlock the potential of new technology to represent target biological systems with
more precision and less bias.
The overarching theme of my research program is to develop causal inference methods for complex
model systems for pharmacology. Complex systems analyzed in my group include morphological profiling,
where robotic confocal microscopes with multiplexed fluorescent dyes are used to rapidly characterize the rich
morphology of individual cells, and large-scale virtual screening, where molecular simulations are
used to prioritize compounds from make-on-demand libraries containing tens of billions of molecules. Drawing
parallels across these distinct screening platforms, we develop and apply causal inference methods to better
guide translatable discoveries.
Project one: Accounting for spatial cell-to-environment and cell-to-cell interactions in morphological
profiling of organoids in 3D culture. Depending on the downstream application, spatial factors can either define
or confound relevant biological responses. We will develop global and local models for cellular spatial factors
and use them as statistical controls, while avoiding selection bias, when modeling the effects of chemical
perturbations.
Project two: Mapping bioactive chemical space for adaptive large-scale virtual screening. AI-guided
synthesis prediction is rapidly opening new chemical spaces for virtual screening. However, it is not clear how to
take advantage of the increased chemical diversity to best improve target specificity or selectivity. We propose to
train high-capacity deep-learning models to represent compounds based on their compatibility with ligand binding
sites. This chemical-space map will enable characterizing how perturbations to virtual screening binding sites
and simulation methods affect the distribution of predicted ligands.