Genomic data hold the promise of revolutionizing our understanding and treatment of human disease. Multiple
barriers stand between the acquisition of the data and the realization of these and other benefits. The rapid accumulation
of genomic data far exceeds our capacity to reliably interpret genomic variation. New developments in artificial
intelligence and machine learning, combined with increased computing power and domain knowledge, provide
hope for the deployment of enhanced computational tools in both basic research and clinical practice. Use of
these methods critically depends upon reliable characterization of their performance.
The Center for Critical Assessment of Genome Interpretation (C-CAGI) will address these needs through
objective evaluation of the state of the art in relating human genetic variation to health. CAGI has held five
editions since 2010, with 50 challenges posed to the community and taken on by hundreds of predictors, leading to
scores of publications about prediction methods and their assessment. We propose that C-CAGI continue to
advance the field of variant interpretation through the following Specific Aims:
1. Develop community experiments to evaluate the quality of computational methods for interpreting
genomic variation data. C-CAGI will conduct community experiments in which participants make bona fide
blinded predictions of disease-related phenotypes on the basis of genomic data. We will engage a diverse
predictor community to spur innovation. The CAGI Ethics Forum will vet studies to ensure that privacy and
data sharing practices meet the highest standards and will educate the community.
2. Assess the quality of current computational methods for interpreting genomic variation data;
highlight innovations and progress at interactive conferences. Predictions will be evaluated by
independent assessors, who will be supported by new assessment approaches from C-CAGI. Results will be
presented at CAGI experiment conferences with deep technical engagement, which will be interleaved with
reflective CAGIⁿ meetings that create an environment for a comprehensive evaluation of the field, facilitating
identification of major bottlenecks and problems faced by current genome interpretation approaches.
3. Broadly disseminate the results and conclusions from the CAGI experiments and analysis. C-CAGI
will reach out to the broader scientific and clinical community through its publications and the creation of a
calibrated reference integrated into the most common workflows for ready adoption. CAGI will also be
represented at international meetings with presentations and workshops.
4. Operate effectively and responsively. C-CAGI will operate efficiently as it closely interacts with hundreds
of participants. CAGI will build upon a robust information infrastructure that securely facilitates data
dissemination, prediction submission, and assessment.