Project Summary / Abstract:
Bladder cancer is the 7th most common malignancy worldwide and has the highest recurrence rate of any cancer
(70%).1–3 Patients with risk factors (smoking, arsenic or chemical dye exposure) and/or hematuria are routinely
screened for bladder cancer via analysis of voided urine. The cellular elements of the urine are deposited onto
glass slides, stained, and examined by a cytopathologist for features of bladder cancer using the gold-standard
Paris System for Urine Cytopathology.4 However, the Paris System is subjective and the morphology of urothelial
cells is highly varied, making the process difficult and prone to high interobserver variability and to human errors
born of fatigue and overwork.5,6 A more quantitative, automated method of assessing urine cytopathology for
bladder cancer is needed. Machine learning (ML) technologies have proven highly effective in image-based
classification in pathology, because ML models operate reproducibly and without fatigue, and without bias
provided the training data are unbiased. Pap smears are already routinely processed by a semi-automated ML
system (BD FocalPoint) and share many features with urine cytology specimens: both are cancer screening
tests that rely on cellular and nuclear morphology and are prepared by liquid-based preparation (LBP, e.g. ThinPrep)
methods. Yet to date no system has been developed to harness ML for bladder cancer in this way, a fact I intend
to change. Although I strongly believe that pathology as a discipline is poised to transition to a fully
digital service, there is significant inertia to overcome in replacing the current analog microscope technology. We
must go beyond simply providing a digital alternative by augmenting the skills of the pathologist with ML
algorithms that empower them to work more efficiently, quickly, and safely. Urine cytology screening for bladder
cancer is an ideal use case. Thus, we sought to create a prototype ML-based algorithm, dubbed AutoParis, that
would automate the tabulation of Paris System criteria. The initial prototype of AutoParis proved highly
effective at risk-stratifying urine cytology specimens by tabulating statistics related to the nuclear-to-cytoplasmic
ratio (NC ratio, a key indicator of neoplasia) and cellular and nuclear morphological atypia.7 Deploying
AutoParis as a diagnostic aid to the cytopathologist will require several additional steps. Although I was skilled
enough to code the first iteration of the model, I am reaching the limits of what I can accomplish as a self-taught
programmer and data scientist. To complete my work on AutoParis and continue to innovate in the field
of digital pathology and ML, I need more formal education in specialized mathematics, statistics, ML theory,
and programming. Through this award I will pursue a curriculum of courses at Dartmouth College guided by a
team of expert mentors. My mentors and collaborators were also selected for their ability to help with the testing
and validation of digital decision aids, grant and manuscript preparation, and lab management. I will emerge from this
experience with the skills I need to be a leader in the future of ML development and its adoption in clinical
medicine.