PROSPECT: Premalignant Oral Lesions Pathology and Epigenetic Risk Prediction Tool - This application validates a minimally invasive, multi-stage, artificial intelligence (AI)-based cytologic, histologic, and epigenetic biomarker to identify oral premalignant lesions (OPLs) at high risk of progression to oral cavity squamous cell carcinoma (OSCC), using a large existing data set and a prospective study in OPL patients. OSCC patients suffer from a 5-year mortality rate of 40%, accounting for one death per hour. Up to 10% of the U.S. population has oral lesions, of which a small proportion are high-risk OPLs that transform to OSCC. Major challenges exist in monitoring and risk-stratifying these OPLs. While dysplasia grade is used to recommend treatment, its prognostic value is low. There is currently no reliable clinical, histologic, or molecular marker to determine individual risk in patients with the same dysplasia grade. The quality of OPL grading in hematoxylin and eosin (H&E)-stained slides depends on the availability of a surgeon and a pathologist, who are typically absent in resource-constrained locations. Noninvasive sampling methods that can be used in settings with restricted access to care have not yet been validated to replace tissue diagnosis. Herein we design a staged approach to diagnosing and monitoring OPLs, using cytology, histology, and epigenomics in a step-wise fashion in order to minimize diagnostic invasiveness. Our approach will automate and improve prognostication of OPL risk by using deep learning. Our central hypothesis is that histologic and molecular patterns within OPLs can be risk-stratified using deep learning to individualize prognosis in patients with the same apparent OPL grade. We test our hypothesis through a series of scientific aims, which, taken together, create a paradigm shift in the management of OPLs by establishing a layered strategy that escalates the complexity of the diagnostic test (from brush swabs to surgical biopsy) with escalating cancer risk.
Our study proceeds with the following 3 aims. 1) Train deep learning-based digital pathology models for oral premalignant lesion progression risk prediction. We will use a longitudinal cohort with known cancer outcomes to train separate deep learning models on cytology and on histology to predict risk of progression to OSCC. 2) Validate and merge cytology and histology with epigenomic signatures to create the multi-stage, multi-modal PROSPECT score using multiple cohorts. We will refine the digital cytology and histology biomarkers in a separate existing validation cohort with known cancer outcomes. Next, we will use brush swabs to predict biologically relevant epigenomic alterations in the transition from OPL to OSCC. We will create the PROSPECT score (Premalignant Oral Lesions Pathology and Epigenetic Risk Prediction Tool), a risk score that combines the cytologic, histologic, and epigenetic scores sequentially with clinical information to predict risk of cancer progression. 3) Test the PROSPECT score in a prospective, multi-institutional clinical study of OPL patients. We will refine our PROSPECT score to perform robustly in brush biopsies and tissues from a separate prospective cohort of OPL patients, who will be recruited during the course of this study from four clinical sites. Testing of the PROSPECT score in this prospective cohort will set the stage for a large-scale clinical study that uses non-invasive brush swabs to monitor OPLs with higher accuracy than current clinical standards.