Investigation of the landscape of immunosequencing and its clinical relevance through novel immunoinformatic approaches - PROJECT SUMMARY The adaptive immune system is responsible for the specific recognition and elimination of antigens originating from infection and disease. It recognizes antigens via an immense array of antigen-binding antibodies (B-cell receptors, BCRs) and T-cell receptors (TCRs), collectively termed the immune repertoire. Because they must recognize an enormous breadth of epitopes, immune repertoires are extremely diverse and dynamic. Advances in immune receptor sequencing (Rep-seq), such as next-generation sequencing, have driven the quantitative and molecular-level profiling of immune repertoires, thereby revealing the high-dimensional complexity of the immune receptor sequence landscape. However, current analysis tools lack the ability to track and examine the dynamic nature of the repertoire across serial time points or to correlate it with clinical outcomes. We propose to use network analysis and to formulate a novel ensemble feature selection approach, along with other advanced machine learning techniques and statistical approaches (e.g., a Bayesian nonparametric approach and a shrinkage estimation method), to interrogate and measure immune repertoire architecture longitudinally and in a clinical context. Network analysis is a powerful approach that can help us identify TCRs sharing antigen specificity and highly mutable BCRs, which in turn can help to develop new immunotherapeutics and immunodiagnostics or improve existing ones. To integrate gene expression data and single-cell Rep-seq (scRep-seq) data in a single-cell setting, we propose to apply a multitable mixed-membership approach to construct a network that increases the resolution of T and B cell clusters. In addition, we will assess the importance of shared clusters by introducing a Bayes factor that incorporates clonal generation probability and the clonal abundance observed in real data.
Because B and T cell responses develop in parallel and influence one another, we will further study how BCR and TCR network properties interact, in addition to assessing each response separately. We will apply the proposed methods to multiple studies to illustrate the diversity and richness of the data and to demonstrate the flexibility and power of the proposed tools. These studies are unique and generalizable because they include three cancer types spanning from immunogenic to non-immunogenic, in both metastatic and localized settings, with different immunotherapeutic modalities. In addition, the proposed methods can be used to study immune responses to diseases other than cancer, including infections with respiratory coronaviruses such as SARS-CoV-2. Therefore, first, we will investigate how the bulk Rep-seq landscape changes over serial time points for prostate cancer patients who received Sipuleucel-T and for COVID-19 patients. We will also develop prognostic/predictive models relating network properties to clinical outcomes and characteristics, both to elucidate the clinically prognostic features of the network for durvalumab-treated lung cancer patients and to classify SARS-CoV-2-infected patients versus healthy donors. Moreover, based on the unique features of single-cell RNA sequencing, we will classify the immune cells and study the T and B cell responses to immunotherapy (CD40 agonist antibody) in esophageal and gastroesophageal junction cancer patients. Furthermore, we will develop bioinformatics software that incorporates the proposed methods and techniques to tackle the complexity of immunosequencing data in a translational fashion and to provide a comprehensive platform with user-friendly visualization tools.