Advancing Digital Pathology through Novel Machine Learning Methodologies - PROJECT SUMMARY/ABSTRACT
Pathology is focused on providing medical diagnoses and prognoses based on laboratory methods to guide
patient treatment and management. Microscopy is fundamental for pathologists to examine tissues and cells.
Despite numerous advancements, the way microscopy images are used in pathology has changed little over the
last century. The current approach in anatomic pathology lacks standardization and relies on pathologists
manually evaluating millions of cells across hundreds of slides in a typical workday, which imposes a
substantial cognitive burden. Deep learning-based methods have recently shown encouraging results for
analyzing microscopy images. However, they rely on standard computer vision architectures and pipelines,
which are limited by the time and cost of slide digitization and by the computational demands of analyzing
very large, high-resolution images. Furthermore, developing accurate deep learning models requires access to
large databases of labeled microscopy images, which are difficult to obtain. In this application, new
methodologies are proposed to take advantage of the unique characteristics of histopathology datasets and the
range of features in histology microscopy images to address these limitations. This project presents a novel
approach based on generative adversarial networks for difficulty translation to generate augmented data with
realistic, rare, and hard-to-classify histopathological patterns. This approach will mitigate data imbalances in
annotated histology datasets and improve the performance of deep learning models for histological classification,
particularly for uncommon and difficult-to-classify cases. Furthermore, a novel curriculum learning approach for
histology image classification will be developed based on the range of classification difficulty among
histopathological patterns and multi-annotator labeled datasets. This approach trains on progressively harder-
to-classify images, as determined by annotator agreement, and significantly improves the performance of the
resulting deep learning models without requiring additional data or computational resources. In addition, a self-
supervised knowledge distillation method will be developed to enhance the efficiency of histology image
classification. Because large labeled datasets are scarce, this method leverages unlabeled datasets to distill,
in a self-supervised manner, the feature extraction capabilities of a teacher model operating at high
resolution into a student model operating at a lower resolution. The resulting distilled student models can
achieve high classification accuracy on low-resolution histology images while substantially reducing the time
and cost of slide digitization and the computational resources required. The proposed methods in this
application remove current bottlenecks in deep
learning applications for digital pathology. Therefore, the results from this project could have a major impact
by creating new opportunities to use deep learning technology in clinical workflows and to integrate
histopathological information with other clinical and molecular data to improve patients' diagnoses,
prognoses, and treatments.
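
The curriculum learning idea described above, training on progressively harder-to-classify images as determined by annotator agreement, can be illustrated with a minimal Python sketch. This is not the project's implementation. The agreement score (the fraction of annotators who chose the majority label), the cumulative three-stage schedule, the threshold values, and all names and toy labels below are assumptions made purely for illustration.

```python
from collections import Counter


def agreement(labels):
    """Fraction of annotators who chose the most common label.

    Used here as a hypothetical proxy for classification difficulty:
    higher agreement is treated as an easier image.
    """
    counts = Counter(labels)
    return counts.most_common(1)[0][1] / len(labels)


def curriculum_stages(annotations, thresholds=(1.0, 0.75, 0.0)):
    """Return cumulative training subsets, easiest first.

    annotations: dict mapping image ID -> list of annotator labels.
    thresholds: decreasing minimum agreement per stage (assumed values).
    """
    scores = {image_id: agreement(labels) for image_id, labels in annotations.items()}
    stages = []
    for threshold in thresholds:
        # Each stage keeps every image whose agreement meets the threshold,
        # so later stages add harder images to the earlier, easier ones.
        stage = sorted(i for i, s in scores.items() if s >= threshold)
        stages.append(stage)
    return stages


# Toy multi-annotator labels (invented for illustration only).
toy_annotations = {
    "img_a": ["tumor", "tumor", "tumor", "tumor"],    # full agreement
    "img_b": ["tumor", "tumor", "tumor", "benign"],   # 3 of 4 annotators agree
    "img_c": ["tumor", "benign", "benign", "tumor"],  # split vote
}

for stage_number, image_ids in enumerate(curriculum_stages(toy_annotations), start=1):
    # A real pipeline would train the classifier for some epochs on each stage.
    print(f"stage {stage_number}: {image_ids}")
```

In this sketch each stage is a superset of the previous one, so the total amount of labeled data is unchanged and only the order in which the model sees it differs, which is consistent with the claim that no additional data is required.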