Project Summary
Modulating the activity of proteases is a central strategy for treating cancer, autoimmunity, and infection. However, the
discovery and design of selective and potent therapeutics targeting proteases (small molecules and antibodies) largely rely
on inefficient, iterative processes. As a result, it takes several years to develop a protease drug, and even then, most protease
drugs are active-site inhibitors that often discriminate poorly between related proteases. Due to the
complexity of proteolytic dysregulation, restoring homeostasis requires not only selective inhibitors but also ligands that
can reprogram protease selectivity. Unfortunately, no platform exists to engineer protease modulators based on systematic
and quantitative design principles.
To address these challenges, this proposal seeks to combine, for the first time, machine learning tools,
next-generation DNA sequencing, and a yeast-based high-throughput functional screen to accelerate the isolation and
design of nanobody-based protease modulators. The functional selection will perform two tasks: (i) select nanobodies
from synthetic libraries based on a desired function and (ii) correlate ligand:epitope interactions with a functional outcome.
These experiments will generate high-quality datasets that will train machine learning (ML) algorithms to predict the
potency, selectivity, and mechanisms of nanobody-based modulators based on their sequence features alone.
This machine learning-aided strategy will accelerate the discovery of rare and potent protease modulators and
bypass the limitations of structure-based methods. Moreover, curated datasets of protease-modulatory nanobody sequences
will provide a reference and design guidelines for future experimental and in silico campaigns. The work targets select
proteases of significant interest to biomedical research and public health, such as Hepatitis C virus protease, MMPs,
transmembrane serine protease 2 (COVID-19), β-secretase, and insulin-degrading enzyme. In addition, the proposed studies
provide a foundation for answering fundamental biochemical questions on how synthetic ligands can map and modulate the
functional landscape of proteases and other protein-modifying enzymes.