Collaborative Research: DMS/NIGMS 2: New statistical methods, theory, and software for microbiome data

Advances in high-throughput sequencing technology allow the characterization of the microbiome via
either marker-gene (e.g., 16S rRNA gene) amplicon sequencing or shotgun metagenomic sequencing.
Consequently, the scientific community increasingly appreciates the important role that the
microbiome plays in many human health and disease conditions. Despite its popularity, the
field of microbiome and metagenomics studies has not yet reached the maturity attained in
other established molecular epidemiology fields, such as cancer biomarker discovery and genome-wide
association studies, that is needed to make the leap from omics surveys to rational microbiome-based therapeutics.
One of the primary obstacles to leveraging this large body of microbiome and metagenomics data is
the set of computational and statistical challenges it poses. These stem from the technical nature of the data, including high
dimensionality, sparse count or compositional data structure, relatively small sample sizes, and complex
dependence/correlation structure such as phylogenetic relatedness. To address these challenges, this
project will develop statistical methods, theory, and computational tools to accurately characterize
microbial communities within and across large studies while maintaining both statistical rigor and
biological relevance. Specifically, motivated by biomedical
and biological problems encountered in microbiome studies of skin diseases, autism spectrum disorder,
and infant growth, the investigators will develop statistical methodology for (1) mapping microbial taxa that
influence clinical outcomes of interest in a powerful and robust manner; (2) learning the correlation
structure among microbial taxa to decode the complex networks and interactions within the microbial
community; and (3) conducting mediation analysis in microbiome studies with high-dimensional microbial profiles
and other omics profiles such as metabolomics. Successful completion of this project will fill the gap
between the burgeoning research interest in microbiome studies and the need for more analytical tools.
This project will improve the understanding of the microbiome mechanisms underlying many health
and disease conditions, which is critical to designing microbiome-based interventions for prognostic,
diagnostic, and treatment purposes.