Project Summary
Despite extraordinary advances in genome engineering, tools for precise and efficient gene correction and
delivery across all cell types remain lacking. Current programmable DNA cleavage tools, such as
meganucleases, zinc finger nucleases, transcription activator-like effector nucleases (TALENs), and
CRISPR-Cas9, rely on recruiting the DNA repair machinery, using either error-prone non-homologous end joining
(NHEJ) for gene knockout or homology-directed repair (HDR) for precise correction. However, HDR is inactive in
post-mitotic cells, such as neurons, and is often inefficient, achieving 50% correction at most. Genome editing
still lacks efficient, robust tools that can insert, delete, and recombine large stretches of DNA sequence. Moreover,
delivery remains a significant barrier to deploying tissue-specific genome engineering technologies, as current
vehicles, including widely used viral vectors and liposomal approaches, have limited capacity, offer variable
efficiency, and lack precise tissue tropism. The proposed work involves computationally mining bacteria for new
classes of gene editing enzymes and delivery vectors. Natural recombinases and transposases are known to
mediate programmed DNA rearrangement and insertion, and these enzyme classes are found in bacterial phage
defense islands and mobile islands, which remain largely uncharacterized. Additionally, recent work has
demonstrated that retroviral/retrotransposon gag-like proteins can self-assemble and encapsulate nucleic acid
in extracellular vesicles (EVs) and virus-like particles (VLPs) for cell-to-cell communication. As with defense
islands, these proteins can also be systematically catalogued via a bioinformatic pipeline and experimentally
characterized. The proposed work will focus on three main goals: 1) signatures of phage defense, mobile genetic
activity, or VLP-forming activity will be mined to build a machine learning pipeline for comprehensively
discovering novel gene clusters from new metagenomic sequences, with a focus on proteins that can
manipulate nucleic acid or self-assemble; 2) candidate gene clusters will be cloned from metagenomic samples
and screened at high throughput using biochemical and bacterial assays for gene editing and capsid
formation, then engineered to hone activity; and 3) the most promising candidates will be evaluated for
activity in mammalian systems with assays for highly efficient gene insertion and VLP formation. The work will
elucidate novel bacterial phage defense and VLP biology, and result in the development of new technologies for
more efficient genetic manipulation and gene delivery. Moreover, this gene exploration and engineering
framework will serve as a model for discovering diverse bacterial gene clusters and defense systems, evaluating
biochemical activity across a range of assays, and converting these findings into high-impact biotechnologies.
The developed technologies will accelerate the pace of biomedical research and enable greater exploration of
basic biological processes and disease mechanisms.