Integrative models of nuclear DNA organization - Project Summary/Abstract
The organization of the human genome influences gene regulation. Aberrant nuclear structures observed in cancer, together with the conservation across mammalian evolution of common features of genome organization such as A/B compartments, topologically associating domains, and loops, further hint at this role. Yet scientists are only beginning to understand the sequence determinants and regulatory implications of nuclear DNA organization. Next-generation sequencing assays such as Hi-C, ATAC-seq, and RNA-seq assist in this task by measuring genome-wide biochemical activities. However, each of these three assays provides only a partial snapshot of regulatory interactions, and the lack of successful integration across them has hindered our understanding of how nuclear organization affects critical biological functions.
The overarching goal of this proposal is to identify the functional consequences of variation in nuclear architecture for transcriptional and post-transcriptional regulation, and the role of this variation in human health and disease. Specifically, Aim 1 proposes to identify the sequence determinants of Hi-C contacts using novel deep learning models that predict Hi-C contacts from nucleotide sequences across 80 human tissues (K99 phase). Aim 2 proposes to learn the rules of chromatin organization shared across evolution using a deep learning model that translates between Hi-C and ATAC-seq across 100 mammalian species (K99 phase). Finally, Aim 3 proposes to model the impact of variation in nuclear organization on tissue-specific transcriptional and post-transcriptional regulation in humans using machine learning, long-read RNA-seq, and Hi-C (R00 phase).
Together, this work will provide novel, open-source, and interpretable machine learning models that enable the discovery and quantification of the regulatory causes and functional consequences of nuclear DNA organization in healthy human tissues, and of the misregulation of this architecture in disease. The models, resources, and skills developed during Aims 1 and 2 (K99 phase) will be used to accomplish Aim 3 during the R00 phase. The candidate aims to establish an independent research program that bridges the gap between experimental and computational research into genome architecture and gene regulation. She will receive the necessary interdisciplinary training from her mentor, Dr. William Noble, and her postdoctoral advisory committee, Drs. Erez Lieberman Aiden, William Greenleaf, Anshul Kundaje, and Sheng Wang. In addition, she will participate in career development activities offered through the University of Washington. Her research training, mentor, advisory committee, and academic environment will prepare her well for the transition to an independent position as an academic researcher.