A data resource for studying DNA methylation in diverse human tissue types
ABSTRACT
Epigenetic modifications of DNA, including DNA methylation (DNAm), play a central role in determining cell identity and function, and epigenetic perturbations contribute to aging and disease. Characterizing variability in human epigenomes and the causes of such perturbations is critical for understanding disease mechanisms. “Epigenome-wide” studies of DNAm have demonstrated that effects of genetic variants on DNAm (i.e., methylation quantitative trait loci, mQTLs) are pervasive in the human genome, suggesting epigenetic mechanisms of disease susceptibility. These studies have also identified loci where DNAm is associated with age, exposures, and disease, enabling the development of “epigenetic clock” algorithms, exposure signatures, and potential biomarkers of disease. However, these prior studies have (1) focused on a limited number of tissue/cell types and (2) captured only a small fraction of the ~28 million CpGs in the human genome. Thus, we currently lack a truly whole-genome resource for studying DNAm in large numbers of non-blood human tissue samples. Large studies of DNAm in diverse tissue types are needed to improve our understanding of variability in human epigenomes and its determinants. In response to this need, we propose to create a multi-tissue DNAm data resource using samples obtained from tissue donors by the Genotype-Tissue Expression (GTEx) Project. The GTEx data resource currently provides genome-wide data on genetic variation and gene expression for >15,000 tissue samples from >900 donors (>50 tissue types); these data have provided an unprecedented view of tissue-specific gene expression and regulation in humans. Complementary data on DNAm would provide an integrated view of genetic effects on both transcription (eQTLs) and the epigenetic background on which transcription occurs (mQTLs).
A multi-tissue DNAm data resource would also support many research questions that rely on comparable DNAm data across tissue types, including research on aging, environmental epigenetics, and genome structure and function. Our first aim is to use whole-genome bisulfite sequencing to generate genome-wide DNAm data for ~2,100 GTEx tissue samples (~210 samples for each of 10 unique tissue types) and to provide these data to the scientific community. Our second aim is to create and disseminate a comprehensive catalog of genetic effects on DNAm in human tissues (mQTLs). We will present these results on the GTEx portal, leveraging existing data visualization tools. Our third aim is to promote the use of this DNAm data resource. We will provide online tutorials on data access and GTEx portal features, host training workshops for data users at scientific meetings, monitor use of the resource, and respond to the needs of data users. The unique multi-tissue DNAm data resource we propose to create will be of broad interest to the genomics community and relevant to many diseases of public health importance. We anticipate that this resource will be highly complementary to existing GTEx data and will greatly enhance the long-term impact of the GTEx project on biomedical science and human health.