Sunday, February 15, 2026

Continued Development and Maintenance of the MG-RAST Metagenomics Pipeline

Award Number: R01AI123037
ORGANIZATION: NATIONAL INSTITUTE OF ALLERGY & INFECTIOUS DISEASES
OPDIV: NIH
AWARD CLASS: DISCRETIONARY
AWARD ACTIVITY TYPE: SCIENTIFIC/HEALTH RESEARCH (INCLUDES SURVEYS)
PERIOD OF PERFORMANCE START DATE: 03/01/2016
PERIOD OF PERFORMANCE END DATE: 02/28/2022

Continued Development and Maintenance of the MG-RAST Metagenomics Pipeline - DESCRIPTION (provided by applicant): Metagenomics, the study of microbial populations sampled directly from the environment, affords avenues for discovering novel enzymes via microbial profiling; using microbial shifts as predictors for health; or gauging the sustainability of human operations like mineral mining. However, the volume of metagenomic data is large (e.g., the metagenome of a human's gut microbiota is about 1 gigabase pair in size), and the processing needed to extract meaning from such large datasets is significant: for example, identifying which organisms' genomes are in the sample (taxonomic annotation) and what they are doing (functional annotation) via comparisons with continually updated knowledge databases. These volumes are only growing as experimentalists demand more and more metagenomic analysis runs. Born out of this need, our MG-RAST (Metagenomics-Rapid Annotation) portal, an open-source, high-throughput metagenomics service, has been a major community resource since 2008, housing over 160K datasets and 40K users. However, since its original design, MG-RAST has witnessed the frenetic development of next-generation sequencing technologies, a drastically altered computing landscape (both in hardware and software), changed requirements in the number of users and in the volume and diversity of datasets, increasing complexity of pipeline components, and requirements for higher throughput. To adapt to this, MG-RAST has been continually modified. Modifications included upgrading the pipeline components with several algorithmic improvements; deploying a customized data and workflow management system - the SHOCK object store and AWE workflow manager; and porting MG-RAST to a cloud-based distributed architecture.
Notwithstanding our continual, albeit ad hoc, system improvements, our pilot studies have indicated the need for a comprehensive redesign of MG-RAST to keep pace with the needs of the rapidly advancing field of metagenomics. Our proposed enhancements are based on expressed user requirements, new usage patterns, and flexibility to incorporate new tools, especially for the compute-intensive similarity analysis for queried sequences. Through this project, we propose to accomplish MG-RAST's transformation via (i) improving its functionality and data reproducibility; (ii) improving its software quality and performance through automated monitoring and generation of test suites; and (iii) moving toward a federated infrastructure for metagenomics data. Overall, the successful accomplishment of our aims will support alternate metagenomics service models through federation of services and data and result in a robust state-of-the-art metagenomics resource. Federation in biomedical pipelines is, in general, a powerful way to leverage the expertise of diverse user bases and, reciprocally, to benefit those users. Thus, MG-RAST, as a state-of-the-art pipeline, will be capable of supporting an ever-increasing user base, handling larger and more varied datasets, and evolving in concert with new genomics technologies. The ultimate goal is to accelerate advances in end-user applications, e.g., personalized medicine tailored to the patient's microbiome.


Issue Date FY	Funding FY	Legal Entity Name	Legal Entity Address	Legal Entity City	Legal Entity State	Legal Entity Zip Code	Legal Entity COUNTY	Legal Entity COUNTRY	Assistance Listing	Award Code	Budget Year	Action Date	Action Type	Action Amount

Issue Date FY: 2023 ( Subtotal = -$1,429 )
2023	2020	PURDUE UNIVERSITY	2550 NORTHWESTERN AVE STE 1900	WEST LAFAYETTE	IN	47906	TIPPECANOE	USA	Allergy and Infectious Diseases Research	000	5	4/18/2023	NON-COMPETING CONTINUATION	-$1,429
														Subtotal = -$1,429

Issue Date FY: 2020 ( Subtotal = $704,873 )
2020	2020	PURDUE UNIVERSITY	155 S GRANT ST	WEST LAFAYETTE	IN	47907	TIPPECANOE	USA	Allergy and Infectious Diseases Research	000	5	2/14/2020	NON-COMPETING CONTINUATION	$704,873
														Subtotal = $704,873

Issue Date FY: 2019 ( Subtotal = $715,936 )
2019	2019	PURDUE UNIVERSITY	401 SOUTH GRANT ST	WEST LAFAYETTE	IN	47907	TIPPECANOE	USA	Allergy and Infectious Diseases Research	000	4	4/3/2019	NON-COMPETING CONTINUATION	$715,936
														Subtotal = $715,936

Issue Date FY: 2018 ( Subtotal = $715,933 )
2018	2018	PURDUE UNIVERSITY	401 SOUTH GRANT ST	WEST LAFAYETTE	IN	47907	TIPPECANOE	USA	Allergy and Infectious Diseases Research	000	3	2/15/2018	NON-COMPETING CONTINUATION	$715,933
														Subtotal = $715,933

Issue Date FY: 2017 ( Subtotal = $743,525 )
2017	2017	PURDUE UNIVERSITY	401 SOUTH GRANT ST	WEST LAFAYETTE	IN	47907	TIPPECANOE	USA	Allergy and Infectious Diseases Research	000	2	2/8/2017	NON-COMPETING CONTINUATION	$743,525
														Subtotal = $743,525

Issue Date FY: 2016 ( Subtotal = $775,775 )
2016	2016	PURDUE UNIVERSITY	610 PURDUE HALL	WEST LAFAYETTE	IN	47907	TIPPECANOE	USA	Allergy and Infectious Diseases Research	000	1	2/29/2016	NEW	$775,775
														Subtotal = $775,775

Grand Total All Awards = $3,654,613
