Abstract
Psychiatric geneticists have discovered hundreds of common single nucleotide polymorphisms (SNPs)
associated with schizophrenia (SCZ) through genome-wide association studies (GWAS). Brain expression
quantitative trait loci (eQTL) can successfully explain some of those genetic associations. Differences in genetic
association between disparate ancestral populations are often reported; however, it is not known whether such
population differences originate from different underlying risk genes or from different allele frequencies and linkage
disequilibrium of the same risk genes. Our central hypothesis is that genetic regulation of gene expression
within brains, as represented by eQTL, can explain the disease GWAS signals. Population structure
influences eQTL as it influences GWAS. The major assumption is that the biological foundation of GWAS and
eQTL is the S-E-D relationship, short for SNP-Gene Expression-Disorder. Functional interpretation of GWAS signals
relies on the discovery of S-E-D relationships. Due to a lack of brain transcriptome data from populations of non-
European descent, interpreting SCZ GWAS results for variants uncommon in other populations presents a
significant challenge. To discover the causes of these population differences, we will develop a transcriptome
dataset of a new brain collection from East Asians (EA, N = 578) combined with samples from the existing
PsychENCODE project (EA, N = 18). We will also use data from individuals of African ancestry (AFR, N = 411) in
the PsychENCODE projects. Along with data from individuals of European descent (EU), who dominate the
PsychENCODE projects (N = 1,321), we will have brain transcriptome data from three major populations in the world.
Our specific aims include: 1) to relate SNPs to gene expression (the S-E portion of the S-E-D networks), we will
develop and compare eQTL and coexpression networks of postmortem brains from three populations, EA, AFR and
EU; 2) to connect SNP-expression to SCZ GWAS signals (the S-E-D aspect), we will use brain eQTL data to
explain SCZ GWAS of EA, EU and AFR populations and to identify SCZ risk loci that also serve as regulators of
brain gene expression; 3) to develop a novel cross-population PrediXcan algorithm that can infer genetically
regulated gene expression and identify genes differentially expressed in patients. The algorithm will be used
to re-analyze existing PGC SCZ data, and Vanderbilt University data will be used to replicate the findings. This study will
improve the understanding of the genetic contribution of population diversity to SCZ risk. Such understanding is critical for
developing more precise diagnoses and treatments that benefit diverse populations and for addressing health disparities.