Biocatalytic synthesis of complex medicinal small molecules informed by A.I. planning

The structural and stereochemical complexity of natural product analogs and some synthetic small molecules can complicate the development of practical organic synthesis routes, leaving enzymatic, or ‘biocatalytic’, routes as the primary option for drug discovery and production. Proposing these enzymatic routes requires expert intuition and experience, but it is difficult for a single expert or a small group to know the full substrate scope of every class of enzymatic reaction. Further, the reliance on experts makes the approach hard to scale and hinders the development of automated, robotic platforms for enzymatic synthesis of small molecules to accelerate drug discovery and scale-up. Computer-Aided Synthesis Planning (CASP) tools in enzyme chemistry use rule-based methods and machine learning to generalize known reactions and predict enzymatic and chemo-enzymatic routes to a desired target molecule. Notwithstanding their success, three key gaps limit the practical use of these CASP tools for drug discovery and production. First, current CASP tools still require substantial expert intervention to propose enzymatic routes to complex molecules starting from simple building blocks. Second, enzymatic CASP tools have limited ability to combine organic and enzyme chemistry synergistically when proposing hybrid chemo-enzymatic routes. For example, a preliminary tool published by the PI and his co-authors is limited to proposing enzymatic drop-in replacements for organic steps. However, the full effectiveness of biocatalytic retrosynthesis is realized when introducing one or more enzymatic steps into a synthesis enables a major redesign of the route that drastically improves drug production (e.g., shorter routes).
Third, the selection of enzymes to catalyze CASP-proposed reactions relies heavily on enzymologists’ knowledge, and because CASP tools can generate thousands of plausible enzymatic reactions in a few minutes, manually evaluating and recommending enzymes for all of them is impractical. The PI’s research program will address these gaps by investigating three challenges: (i) development and experimental validation of a multi-step enzymatic synthesis planner that requires minimal expert intervention; (ii) computational planning of de novo chemo-enzymatic synthesis pathways for complex medicinal compounds and their analogs; and (iii) enzyme sequence-function annotation using machine learning to recommend enzymes for reactions proposed by CASP tools. The PI will curate high-quality databases and combine them with state-of-the-art machine learning algorithms to predict experimentally testable enzymatic routes towards medicinal small molecules. These algorithm-predicted routes will be reconstituted in vitro using purified enzymes and simple building blocks to produce analytically pure samples of product molecules for mass spectrometry-based characterization. The overall goal for the 5-year grant duration is to develop innovative computational methods, bioinformatics tools, and experimental chemical-biology platforms to support the synthesis of complex synthetic molecules and natural product analogs using enzymatic catalysis.