Carcinogens and evolutionary pressures generate unique patterns in the types of somatic mutations observed
in the DNA of cancer cells. Mutational signature analysis investigates these patterns. Following promising initial
successes, the scope and variety of questions scientists ask in mutational signature analysis now far exceed what
the available robust data-analytic tools can address.
The overarching goal of this project is to develop, test, and apply a class of statistical models able to comprehensively
support rigorous statistical inference on most of the important scientific questions arising in this novel
field. Specifically, we will generalize current approaches in fundamental ways to incorporate previously proposed
signatures, and account for multiple studies/conditions, covariates, paired/longitudinal data and batch effects.
We will develop a comprehensive free and open-source R package, conforming to Bioconductor standards, allowing
users to implement our analyses and their visualizations. Methods will leverage the investigative groups'
extensive experience in Bayesian modeling, multi-study modeling, multivariate analysis, and statistical genomics.
Development will proceed hand in hand with discovery efforts within the Dana-Farber multiple myeloma genomics
program, of which the PI is an integral part.
Given the fast growth in whole exome and whole genome sequencing of tumors, and the corresponding growth
in the use of mutational signature analysis, we expect our tools to have a substantial impact by enabling cancer
researchers to a) carry out more accurate analyses and b) more reliably evaluate the accuracy of their results.
Thus, we expect this work to substantially accelerate the rate of discovery and clinical translation of the biology of
mutational signatures in cancer.