3/3 Sequencing and Trans-Diagnostic Phenotyping of Severe Mental Illness in Diverse Populations - Project Summary In this new and unfunded study, we will capitalize on the lessons from the past 15 years of psychiatric genomics. Based on these lessons, we propose an exceptionally novel and important set of aims to further knowledge of the genetic architecture of mental illness. We propose to perform whole-exome sequencing and SNP-array genotyping on >150,000 cases with severe psychiatric disorders along with a similar number of controls. The study will be large and transdiagnostic, will be based on patients seen in clinical psychiatry, and will comprehensively analyze ultra-rare exonic, rare copy number, and common variation. Because assay costs are prohibitive (on the order of US$80 million), we are partnering with the Regeneron Genomics Center (RGC), which will conduct all genomic assays. NIMH funding is within the $500K direct cost cap at each site. We will: 1) Acquire samples from cases with clinically severe psychiatric disorders. Cases will have lifetime diagnoses of schizophrenia (SCZ), schizoaffective disorder (SAD), bipolar I disorder (BD1), or severe major depressive disorder (sevMDD). Roles: UNC is responsible for data coordination; the sampling sites are ISMMS (the Americas and East Asia) and Cardiff (Europe, Africa, and South Asia), and each will collate samples (i.e., arrange MTAs, ethical approvals, and individual consent, harmonize phenotypes, and QC DNA). Phase 1 (Years 1-2) will focus on existing samples (N=100K cases). Phase 2 (Years 1-4) will focus on obtaining new samples (N=50K cases) and will enable colleagues from low-income countries to obtain genetic data that would otherwise be out of reach. This will help those investigators and greatly increase diversity in genomics research. 2) Genomic assays (Years 1-4). Samples will be sent to RGC in batches from ISMMS and Cardiff. RGC will generate whole-exome sequencing and SNP-array data.
UNC and RGC will jointly conduct alignment, QC, variant calling (SNVs, indels, SVs), and array processing (common SNPs, imputation, and CNVs). QC includes assessment of multiple biases and comparison to independent datasets. Deliverable: analysis-ready data frames for rare exonic, rare CNV, and common genetic variation. 3) Analysis for substantive scientific aims. Briefly, the main analytical themes are to identify genetic variation associated with: (a) severe mental illness, (b) specific disorders, and (c) cross-cutting clinical features (e.g., psychosis, treatment resistance, mania, intellectual disability). All analyses will use robust methods and bias control, will be formally compared to relevant prior studies, and will evaluate the impact of all types of measured genetic variation across diverse genetic ancestries. 4) Data sharing will align with NIMH policies via the NIMH Data Archive. Successful completion of the proposed work will markedly increase the number of genes pinpointed by burdens of rare coding variation and rare CNVs, as well as the number of less specific GWAS associations. It will thereby substantially advance knowledge of the genetic architectures of these critically important and burdensome disorders.