Can one size fit all? High-Resolution 3D Genome Spatial Organization Inference with Generalizable Models

ABSTRACT

Chromosome conformation capture techniques, particularly Hi-C, have advanced the study of spatial proximity, chromatin interactions, and genome architecture in cells, leading to the development of several three-dimensional (3D) chromosome structure modeling methods. Many observations become more apparent in 3D because some relationships (for example, evolutionary constraints or cell-to-cell variability of mammalian chromosome structures) cannot be inferred from genomic sequence alone. Although members of the bioinformatics community, including the PI, have developed many algorithms for reconstructing 3D genome structures from population Hi-C data, we lack computationally efficient methods that model these structures precisely at high resolution (<=5 kilobases (kb)). One difficulty is the rapidly growing number of fragments at this resolution. The PI's work over the last five years provides the premise for the current proposal and uniquely positions the PI's interdisciplinary research program to carry out the proposed studies. The PI proposes to conduct leading research to overcome this challenge and to address important questions that remain about how (and why) 3D genome structures are organized across cells and about the relationship between 3D structure and the genetic and epigenetic mechanisms of gene expression. During the next five years, the PI's objective is to develop computational and machine learning-based models that further highlight the hierarchical organization of, and the refined structures within, the genome. The PI proposes to develop innovative models for 3D chromosome and genome reconstruction using a novel non-instance-based model, built on a graph convolutional neural network, that generalizes across resolutions, chromosomes, restriction enzymes, and cell populations.
Given the PI's background, track record, and productivity in the genomic research field, the computational objectives defined here are not only feasible but also computationally and biologically rewarding to the bioinformatics community at large. Computationally, the proposed methodology will act as a robust one-size-fits-all model that can be sufficiently trained at a lower computational cost on less complex data and then applied across multiple higher resolutions for 3D structural modeling. Biologically, the proposed reconstruction algorithms will aid disease diagnosis, prevention, or treatment by shedding light on the relationship between long-range interactions and gene expression in human cells and on how disruptions in physical interactions between genes and their enhancers or silencers could aberrantly alter gene expression. Thus, this research demonstrates the potential impact of knowing the architecture of the genome on the understanding of biological processes and human disease. Once the proposed objectives are completed, the PI will be well established as an independent investigator and will have contributed robust, high-performing, and efficient computational algorithms that provide a new vertical advancement in the chromatin genomics research field.