A genome-wide genealogical framework for statistical and population genetic analysis
PROJECT SUMMARY
Genetic studies have improved our understanding of disease etiology and treatment. However, at least two shortcomings prevent current studies from reaching their potential in elucidating the genetic architecture of complex traits for all humans. First, current genetic studies largely ignore the genetic relationships among individuals in a study. Many of these relationships are distant, but the individuals are nonetheless connected by genealogical trees at every position of the genome through a coalescent process. The collection of such (unobserved) trees is encoded by the ancestral recombination graph (ARG). Second, genetic studies are generally biased towards relatively homogeneous continental populations, such as European or East Asian populations, in part due to a lack of methods tailored to admixed populations. In this proposal we aim to develop new methods to address both shortcomings. Our framework leverages recent breakthroughs that allow, for the first time, accurate and scalable estimation of ARGs. In Aim 1 we will use a new ARG-based estimator of relatedness that retains more information about relatedness from incomplete genetic data (e.g., array genotype data) than the current standard estimator. We will apply this estimator to estimate trait heritability and cross-population genetic correlation of complex traits and diseases in humans, as well as to correct for confounding due to population structure in genome-wide association studies. In Aim 2, we will develop an association-testing framework that uses the ARG to identify trait-associated genomic regions and prioritize trait-associated haplotypes.
This principled approach can naturally account for allelic heterogeneity and has the potential to improve the power of association studies by lowering the multiple-testing burden, which is particularly important for understudied populations where recruitment of participants is more challenging. Finally, in Aim 3 we will develop a population genetic framework that uses ARGs to model the admixture history of a population. Using this model, we will develop new ways to detect genes that have responded to recent selection and to identify complex traits that have evolved under different kinds of phenotypic selection. Importantly, our framework will address these evolutionary questions in each ancestral component of the admixed population. In each Aim we will benchmark our methods with extensive simulations. We will also evaluate our methods empirically using large-scale real-world human genetic data. Finally, we will apply our methods to genotyping and sequencing data from admixed populations to discover new loci that are associated with human diseases and/or have experienced natural selection in the past. In summary, we will mine the wealth of information from the ARG and address fundamental population- and human-genetic questions, particularly in understudied and admixed populations.