Deep learning based integration of protein structure and sequence for class-I immunogenicity prediction and interpretation - Project Summary: In the dynamic landscape of cancer immunotherapy, neoantigen vaccines have emerged as a revolutionary tool for personalized treatment. However, neoantigen vaccine success hinges on the precise prioritization of class-I neoantigens that are immunogenic, an attribute observed in only a fraction (~10-20%) of the total neoantigen peptides administered to the patient. Current prediction methods are primarily sequence-based and do not account for the interplay between the structural and sequence characteristics of the peptide-MHC (pMHC) complex. The clinical implications of this shortcoming are significant: the selection of mutations for a neoantigen vaccine must be precisely optimized, because encapsulated neoantigen delivery is restricted to a limited number of mutations (approximately 20-35). Therefore, it is of high clinical interest to increase the percentage of immunogenic neoantigens administered to the patient. An innovative approach may be to integrate peptide sequence data with pMHC structural data to improve class-I epitope immunogenicity prediction. To this end, my project pioneers the large-scale integration of structural and sequence data to enhance the accuracy and interpretability of immunogenicity prediction. I have generated a dataset of ~24,000 pMHC complexes using AlphaFold2, with each complex paired to an experimentally determined immunogenicity measurement from IEDB. In Aim 1, I will optimize the graph construction used to encode the pMHC complex and combine it with peptide sequence encodings to improve model performance. I have generated these peptide sequence encodings as part of my preliminary work.
By developing a model that captures both sequence and structural information, I will also be able to identify potential relationships between different features of the peptide-MHC complex and create a comprehensive, structurally organized embedding for interpreting the physicochemical properties underpinning immunogenicity. In Aim 2, I will validate my model’s predictions on external epitope immunogenicity data across cancer neoantigens and infectious disease epitopes, as well as on cancer patient neoantigen survival data in the context of immune-checkpoint therapy. I will then use the model to identify substructural pMHC motifs for improved bio-structural interpretation. This work seeks to improve the prediction of immunogenic epitopes, accelerating the development of clinically effective neoantigen and epitope-based vaccine design strategies.