THE CANCER EPITOPE DATABASE AND ANALYSIS RESOURCE - ABSTRACT
Recent years have witnessed a dramatic rise in interest in cancer epitopes in general, and in neoepitopes that encompass mutations arising in a given tumor in particular. Current lines of research examine how the epitope load in a given tumor relates to the success of checkpoint blockade treatments, and how to use epitope-based vaccines and adoptive transfer of epitope-specific T cells for personalized therapies. For these purposes, neoepitopes that are recurrently recognized in different individuals are of particular interest, which has also re-ignited interest in epitopes identified in classic tumor-associated antigens. Alongside the interest in cancer epitopes, there is also interest in the T cell receptors (TCRs) and B cell receptors (BCRs) that specifically recognize them, as these have the potential to be used in therapeutic approaches and can aid basic studies in inferring the specificity of T cells or B cells characterized in single-cell sequencing data. This resurgence of interest in epitopes has created a need to catalog all epitope data and make them accessible to the scientific community, linked to their biological, immunological, and clinical contexts. The ultimate goal is to come “full circle” and link epitope recognition and immunological readouts to clinical outcomes and treatment strategies. In parallel, there is an urgent need for resources that provide access to epitope prediction and analysis tools, together with objective evaluations of their performance in the relevant biological, immunological, and clinical contexts. Recent years have also witnessed the publication of multiple original methodologies that reported sometimes impressive gains in the prediction of cancer epitopes. However, several of these studies were difficult to evaluate, because the methodologies and/or datasets were not fully available in a readily executable format.
As a result, their performance could not be properly benchmarked on independent datasets. A further obstacle is that effective benchmarking requires the assembly of novel datasets of sufficient size and diversity. To overcome these information technology challenges, we propose to design and implement the Cancer Epitope Database and Analysis Resource (CEDAR), which will provide a freely accessible, comprehensive collection of cancer epitope and receptor data curated from the literature, together with easily accessible tools for epitope and TCR/BCR target prediction and analysis. As the cancer epitope data are curated, they will serve as a transparent benchmark of how well prediction tools perform, and will also be used to develop new prediction tools for the analysis resource component of CEDAR. CEDAR will leverage our expertise from developing the Immune Epitope Database and Analysis Resource (IEDB), which is fully operational and widely used by researchers globally. CEDAR will directly complement other projects currently funded through the NIH ITCR program that provide resources and tools related to cancer omics data. Finally, we will engage in outreach activities to improve functions, user interfaces, and interoperability with other ITCR tools, and to promote the use of CEDAR in cancer research.