Explainable Analogical Learning for Oncology Diagnosis and Prediction based on Deep Pathophysiology - To improve cancer diagnosis, risk stratification, and therapeutics, the National Cancer Institute has called for machine learning (ML) methods that mine heterogeneous Big Data with efficient and interpretable models, which are currently unavailable. Among candidate approaches, cognitive analogy-based models typically offer superior interpretability because they can provide human-like, knowledge-based justifications, similar examples, and counterfactuals in support of each subject’s personalized recommendations. Interpretability by design has been shown to be crucial for the adoption of clinical decision support systems because it helps ensure the trustworthiness and safety of clinical decisions. While analogical learning based on deep pathophysiology is promising for generating highly accurate and explainable results in a glass-box manner, three gaps remain: (1) integration of heterogeneous molecular pan-cancer data and knowledge for more precise detection of diagnostic categories and biomarkers, (2) synergies with deep learning architectures, and (3) extension of the approach to therapeutic recommendations. Thus, there is a critical need to determine the extent to which these models accurately predict cancer progression and severity in a format that is simultaneously explainable, safe, fair, and clinically trustworthy. Leaving the major causal mechanisms of cancer etiology, progression, and therapy unidentified will likely prevent meaningful progress in cancer care. Our overall objective is to determine the predictive and explanatory capabilities of analogical diagnostic and survival analysis models based on deep cancer pathophysiology for accurately detecting cancer and its evolution.
Our central hypothesis is that analogical learning grounded in deep cancer pathophysiology and boosted by deep learning architectures accurately and efficiently predicts cancer severity and survival in a safe and trustworthy manner. Our rationale is that analogy-based models grounded in deep pathophysiology will facilitate the discovery of precise biomarkers and therapeutics. We propose three specific aims: 1) Demonstrate the effectiveness, efficiency, and interpretability of an analogy-based ML model for diagnosing cancer and its subtypes. 2) Determine the effectiveness, efficiency, and interpretability of an analogy-based ML model for cancer survival and risk analysis. 3) Enhance the actionability of the system’s recommendations through counterfactuals. We expect to build and explain analogical learning models for cancer diagnosis and survival prediction that reflect patient pathophysiological states and integrate oncology knowledge with multiple, highly complex data sources safely and fairly. This R15 will also enhance the research environment at SUNY Oswego through transformational biomedical informatics research experiences for undergraduate and graduate students. These results are expected to have a positive translational impact by opening new pathways for the development of trustworthy and safe personalized cancer treatment recommendations and plans.