Collaborative Research: DMS/NIGMS 2: Novel machine-learning framework for AFMscanner in DNA-protein interaction detection - Quantifying transcription factor (TF)-DNA binding, including binding locations, distributions, and mechanisms, is an important
first step toward understanding the gene regulatory machinery. In this proposal, we will develop an
atomic force microscope (AFM)-based single-molecule imaging method for the detection and
quantification of TF-DNA binding. The new technique brings the methods of mathematics and statistics to
bear on the technological breakthrough in an experimental system. This new technology is inherently
different from classical single-molecule imaging approaches, which rely solely on the technician’s
experimental skills. Combining mathematics, statistics, bioengineering, and chemical engineering, this
proposal creates a platform for multidisciplinary research that merges analytics, biology, and
engineering. We see this as an effort to translate a lab-bench discovery into a new
biotechnology tool: the proposed machine learning (ML) methods, combined with robotic hands, pave
the way to mass production of a fully automated system for precise TF-DNA imaging.
Analytically, we face three challenges: construction of high-throughput images, prediction of TF binding
regions, and force decomposition to recover the binding mechanism. To attack these problems, we will (1)
develop a smoothing-spline diffusion and annealing process for image super-resolution, (2) develop a novel
reinforcement learning algorithm for automatic searching of TF binding sites (TFBSs), and (3) develop a graph
ANOVA method to compare TF-DNA binding mechanisms. Our efforts in these areas should lead to (1) fundamental
advances in image super-resolution and reinforcement learning algorithms that enjoy both algorithmic
simplicity and theoretical rigor; (2) development and refinement of the technology for the rapid and
precise genome-wide identification and quantification of TF-DNA binding sites using AFM technology; (3)
visualization of not only TF-DNA binding sequence and location but also 3-D structures; (4) investigation
of TF-DNA interactions under nearly physiological conditions by controlling the reaction conditions
experimentally; and, most importantly, (5) prototyping of a fully automatic system for potential technology
translation. This system permits accurate detection of TF-DNA binding with a rapid response that requires
essentially no user intervention for field deployment and data capture.