Contribution of germline copy number variations to the susceptibility of aggressive prostate cancer in men of African and European ancestry - PROJECT ABSTRACT
Inherited genetic variation is a key component in the etiology of prostate cancer (PCa). More than 450 common single nucleotide variants (SNVs) for PCa have been identified in large-scale multi-ancestry genome-wide association studies (GWAS), and rare pathogenic SNVs in >30 PCa candidate genes have been implicated across ancestry populations. Although somatic copy number alterations are commonly observed in prostate tumors and predict poor outcomes, few studies have evaluated the contribution of germline copy number variations (CNVs) to PCa risk, owing to the technical challenges of detecting germline CNVs from genotype and sequencing data. CNVs, the deletion and duplication of DNA segments ≥50 bp, are a prominent class of genetic variation with critical impacts on human health and disease. Studies with array intensity data have found evidence of both common (≥1%) and rare (<1%) CNVs associated with risk of total and/or aggressive PCa, but these analyses were limited to large CNVs (>1 kb), were mostly conducted in populations of European ancestry, and had little or no focus on aggressive disease. To thoroughly investigate the contribution of germline CNVs to PCa risk, we propose to combine imputation and sequencing-based approaches to maximize the ascertainment of CNVs across the frequency and size spectrum in the human genome. Leveraging existing GWAS, whole-exome sequencing (WES), and whole-genome sequencing (WGS) data from large-scale studies with well-defined disease aggressiveness, we are well-powered to examine the association of CNVs with risk of total and aggressive PCa and to evaluate the joint contribution of CNVs and SNVs to PCa risk.
In Aim 1, we will use the 1000 Genomes Project (1KGP) 30X reference panel to impute common CNVs and test their associations in >24,000 men of African ancestry and >113,000 men of European ancestry with GWAS array data. In Aim 2, we will apply GATK-gCNV to detect rare coding CNVs of all sizes in >12,000 men of African ancestry and >41,000 men of European ancestry with WES data. The aggregate association of rare CNVs will be evaluated in gene-based and gene-set analyses. In Aim 3, CNVs of all sizes and frequencies across the genome will be detected using DRAGEN in >10,000 men of African ancestry and >182,000 men of European ancestry with WGS data. Analyses in WGS studies allow for a comprehensive genome-wide assessment of CNVs, independent replication of risk-associated CNVs identified from GWAS or WES studies, and an integrated analysis of the polygenic risk score (PRS), rare CNVs, and rare pathogenic SNVs in candidate genes to understand their joint effects on PCa risk and disease aggressiveness across populations. We expect this study to provide the most comprehensive and well-powered investigation of germline CNVs in PCa across populations to date. This study has the potential to advance the field through discoveries of novel risk variants for PCa that could elucidate underlying biological mechanisms and improve risk stratification across diverse populations.