Program Director/Principal Investigator (Da, Yang):
Project Description
MOTIVATION AND OBJECTIVES
The complex genetic mechanisms of quantitative traits may include gene interaction effects, commonly referred to as
epistasis, and multiple genetic factors with small effects. This is among the most difficult areas of genetics because such
effects are hard to discover and large samples are needed to detect many small effects. U.S. Holstein cattle have the largest
genomic evaluation program in the world, with 3,852,580 genotyped cattle as of March 2021, and the number of genotyped
cattle has increased at a pace of ~600,000 per year. Among the genotyped cows, phenotypic records were available for 43
traits covering production, reproduction, health, longevity, and body shape and structure. The majority of these traits have
been recorded and evaluated for decades, and new traits may continue to become available. The unprecedented
sample sizes of the genomic selection data of U.S. Holstein cattle provide an exceptional opportunity for understanding
and utilizing complex genetic mechanisms of quantitative traits. Preliminary results using 294,076 Holstein cows for 8
traits have already yielded discoveries that would previously have been unimaginable, including a single chromosome region
interacting with all chromosomes, intra-chromosome epistasis covering an entire chromosome, and nearly exclusively
inter-chromosome epistasis for one trait. Building on the methods and computing tools for studying complex genetics
developed by the PI’s group and on these encouraging preliminary results, the proposed research is an unprecedented
large-scale study of genomic discovery and prediction for 43 traits, mostly with one million cows, using complex multigenic
models that have never been attempted before. The study is expected to generate many new discoveries and has the
potential to advance multigenic knowledge to a new level. The long-term goal of this project is to identify multigenic
factors underlying quantitative traits, to understand how multigenic factors affect phenotypes, and to apply multigenic
mechanisms and factors to predict phenotypes. The specific aims are as follows.
Aim 1: Large-scale discovery of global pairwise epistasis effects for 43 traits covering production, reproduction,
health, and body shape and structure by testing four types of epistasis effects per SNP pair (additive × additive, additive ×
dominance, dominance × additive, and dominance × dominance) using million-cow genome-wide association studies
(GWAS) for most of the 43 traits. These tests will identify, with unprecedented statistical confidence, the most important
epistasis type underlying each trait and the chromosome regions and genes with the most significant epistasis effects for
building epistasis networks. All four types of epistasis effects will be further analyzed as intra- and inter-chromosome
epistasis effects to investigate their potential association with trait heritability and response to genetic selection. Selected
chromosome regions with important epistasis effects will be subjected to fine mapping using increased sample sizes and
higher SNP density obtained by imputation.
Aim 2: Evaluation of the contributions of complex genetic effects to the phenotypic variance and the accuracy of
genomic prediction. The genomic heritability of each type of genetic effect will be estimated as a measure of its contribution
to the phenotypic variance. Observed prediction accuracy from validation studies will be used as an objective measure of the
relevance of each type of genetic effect to the accuracy of genomic prediction, and any genetic effect that affects prediction
accuracy is considered relevant to the phenotype. Combining this genomic estimation and prediction under
complex models with the GWAS approach will yield uniquely high-confidence results on the multigenic mechanisms underlying
quantitative traits.
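The two measures named in Aim 2 can be illustrated with a minimal sketch. It assumes that the genomic heritability of an effect type is its variance component divided by the total phenotypic variance (the sum of all genetic components plus the residual), and that observed prediction accuracy is the correlation between predicted values and observed phenotypes in a validation set. The function names and the variance values are hypothetical and are not results from this project.

```python
import numpy as np

def genomic_heritabilities(var_components, var_residual):
    """Share of phenotypic variance for each effect type.

    var_components: dict mapping effect type -> estimated variance component.
    var_residual:   estimated residual variance.
    """
    total = sum(var_components.values()) + var_residual
    return {name: v / total for name, v in var_components.items()}

def observed_accuracy(predicted, observed):
    """Pearson correlation between predictions and phenotypes in a validation set."""
    return float(np.corrcoef(predicted, observed)[0, 1])

# Hypothetical variance component estimates for one trait.
h2 = genomic_heritabilities(
    {"additive": 0.30, "dominance": 0.05, "AxA": 0.05}, var_residual=0.60
)
print(h2)
```

Comparing observed_accuracy between a model with and a model without a given effect type, on the same validation animals, gives the kind of relevance check described above.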
Aim 3: Evaluation of the prediction accuracy of complex models for traits that benefit from any one or a combination of
dominance, global epistasis, and local high-order epistasis effects using large-sample validation studies. This process will
lead to recommendations for routine application of prediction models with complex genetic effects in genomic
evaluation.
BROADER IMPACTS
The novel discoveries in multigenic mechanisms of quantitative traits in Holstein cattle are expected to advance the
understanding of the complex genetic mechanisms of quantitative traits in diploid species and to benefit the scientific
community in research, teaching, and training. The research approach will help open new directions for studying and utilizing
multigenic mechanisms of quantitative traits. New methods for genomic prediction with complex genetic mechanisms may
increase the efficiency of genetic selection for some of the most difficult traits facing the dairy industry, such as fertility
and health. Solutions from this project will enhance collaboration between academia and the U.S. dairy industry, and
increased accuracy of genomic prediction using complex genetic effects may translate into substantial economic
benefits for the U.S. dairy industry.
CREATIVITY, ORIGINALITY, MECHANISM TO ASSESS SUCCESS
This is the first large-scale complex genetic analysis using the most complex models ever attempted for many traits.
Creative and original ideas include the integration of the large-sample GWAS for detecting epistasis effects with genomic
OMB No. 0925-0001/0002 (Rev. 03/2020 Approved Through 02/28/2023) Page Continuation Format Page