Next-gen 3D Modeling of Endoscopy Videos - ABSTRACT Endoscopy plays a vital role in the diagnosis and treatment of various medical conditions and is routinely performed millions of times every year. However, endoscopy is challenging in the narrow and winding pathways within the body, resulting in incomplete surveys with ‘blind spots’ and in patient discomfort. Researchers have explored various techniques that can assist physicians in performing more efficient endoscopies by enabling better visualization, guidance back to unsurveyed regions, and semi-autonomous procedures. However, these techniques remain clinically infeasible because the required 3D organ reconstruction and endoscope localization are still far from solved. Post-procedure analysis of endoscopic videos to extract meaningful geometric properties, e.g., time-varying measurements of upper-airway cross-sectional area, is also challenging and time-consuming for physicians, and can be automated with 3D reconstruction. Existing approaches to 3D reconstruction from endoscopy videos are primarily based on Simultaneous Localization and Mapping (SLAM) techniques, which are unreliable when faced with the typical characteristics of internal organs, such as a lack of geometric features, mucus-layer reflections, and deformable surfaces. Existing approaches have success rates as low as 40-50% even for static shapes; these successes mostly apply only to easy axial frames, and 3D reconstruction may fail completely for deformable surfaces. The goal of the proposed project is to develop a novel 3D reconstruction and localization system that can effectively handle both axial and non-axial frames for static and dynamic organs. We propose to develop deep neural network-based computer vision algorithms that generate 3D meshes from endoscopy videos and determine the position and orientation of the endoscope in near-real time.
Our proposed methodology leverages the reflection of the endoscope's light to help recover an organ's shape along with the camera motion (Aim 1), and models time-varying organ deformations (Aim 2). Our focus will be on reconstructing the respiratory tract and the colon; however, our approach is general and will apply to other endoscopic reconstruction tasks. We will evaluate our approach on synthetic and real data, validating the results with clinical collaborators. In summary, our proposed next-generation 3D modeling system aims to revolutionize endoscopy by providing accurate 3D reconstructions and localization for both static and deformable organs. This advancement has the potential to enhance patient comfort, improve diagnostic accuracy, and enable a wide range of downstream applications in endoscopy, e.g., semi-autonomous maneuvering, guidance to unsurveyed regions, accurate geometric measurements, and better visualization.
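As a minimal illustration of one downstream measurement such a system would enable, the cross-sectional area mentioned above can be computed from a closed 2D contour (e.g., one planar slice of a reconstructed airway mesh) with the shoelace formula. This sketch is purely illustrative: the contour data and the function name `cross_sectional_area` are assumptions for this example, not part of the proposed system.

```python
# Illustrative sketch (not the proposed pipeline): area of a closed 2D
# contour -- e.g., one slice of a reconstructed upper-airway mesh --
# computed with the shoelace formula.
import math

def cross_sectional_area(contour):
    """Area of a simple closed polygon given as a list of (x, y) vertices."""
    area = 0.0
    n = len(contour)
    for i in range(n):
        x0, y0 = contour[i]
        x1, y1 = contour[(i + 1) % n]  # wrap around to close the loop
        area += x0 * y1 - x1 * y0
    return abs(area) / 2.0

# Example: a near-circular slice of radius 10 mm, sampled at 360 points.
# The polygon area should approach pi * r^2 (about 314.16 mm^2).
circle = [(10 * math.cos(math.radians(t)), 10 * math.sin(math.radians(t)))
          for t in range(360)]
print(round(cross_sectional_area(circle), 2))
```

Applying this per slice and per frame would yield the time-varying area curve described above; a production system would instead slice the reconstructed 3D mesh directly (e.g., with a mesh-processing library).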