Automated Surveillance of Overlapping Outbreaks and New Outbreak Diseases

Project Summary / Abstract

This project will develop and evaluate new methods for automated detection and characterization of outbreaks of infectious respiratory disease. The methods will be novel in their ability to detect and characterize (1) multiple, overlapping outbreaks of known diseases, a situation that occurs commonly, (2) an outbreak of a new, emerging disease, which can be dangerous, and (3) a combination of 1 and 2 occurring at the same time. The ability to detect a new disease early, while other common outbreaks are occurring, may be particularly important if the disease causes serious illness and spreads rapidly in the population. The new methods can also use a wide variety of data to perform outbreak detection and characterization, including emergency department reports, laboratory results, retail thermometer sales in the region, and local health-related tweets.

These new methods will be built upon the framework of an existing Bayesian, probabilistic system that the investigators have developed. The system takes as input the data used for outbreak detection and characterization, and it outputs the probabilities of the different disease outbreaks that may be occurring, as well as their characteristics, such as their probable start times and epidemiological curves. A unique aspect of the system is its ability to use data from individual patient clinical reports, such as emergency department reports. The system applies natural language processing to the reports to derive a set of symptoms, signs, and other findings. It then uses these findings and probabilistic disease models to derive a probability distribution over the diseases for each patient. The system then uses the probability distributions of the many patients seen in the recent past as evidence in detecting and characterizing disease outbreaks.
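As an illustration of the per-patient step described above, the following minimal sketch computes a probability distribution over diseases from a set of extracted findings. It assumes a naive Bayes disease model in which findings are conditionally independent given the disease; the summary states only that the system uses "probabilistic disease models", so this model form, the finding names, the "other" category, and every number below are hypothetical and chosen for illustration only.

```python
import math

# Hypothetical prior probabilities of each disease for a patient (illustrative only).
PRIORS = {"influenza": 0.10, "rsv": 0.05, "other": 0.85}

# Hypothetical P(finding present | disease); not taken from the project.
P_FINDING = {
    "influenza": {"fever": 0.80, "cough": 0.85, "wheezing": 0.10},
    "rsv": {"fever": 0.50, "cough": 0.80, "wheezing": 0.60},
    "other": {"fever": 0.10, "cough": 0.15, "wheezing": 0.05},
}


def disease_posterior(findings):
    """Return P(disease | findings) under a naive Bayes assumption.

    `findings` maps a finding name to True (present) or False (absent).
    Findings not mentioned in the report are simply left out of the dict.
    """
    log_scores = {}
    for disease, prior in PRIORS.items():
        score = math.log(prior)
        for name, present in findings.items():
            p = P_FINDING[disease][name]
            score += math.log(p if present else 1.0 - p)
        log_scores[disease] = score

    # Normalize in a numerically stable way.
    top = max(log_scores.values())
    unnormalized = {d: math.exp(s - top) for d, s in log_scores.items()}
    total = sum(unnormalized.values())
    return {d: v / total for d, v in unnormalized.items()}


if __name__ == "__main__":
    # Findings as they might be derived from one emergency department report.
    patient = {"fever": True, "cough": True, "wheezing": False}
    for disease, prob in disease_posterior(patient).items():
        print(f"{disease}: {prob:.3f}")
```

In a system of the kind described, distributions like this one, computed for each recent patient, would serve as the evidence for the population-level outbreak model. That aggregation step is not sketched here.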
The methods will be evaluated using simulated data and real data from Allegheny County, Pennsylvania. The evaluation will focus on four common outbreak diseases: influenza A, influenza B, respiratory syncytial virus (RSV), and adenovirus. It will examine how well the system can (1) detect and characterize multiple overlapping outbreaks of disease, (2) detect a new outbreak disease and create an accurate clinical description of it (using a leave-one-out cross-validation approach), and (3) use a variety of data types to improve outbreak detection and characterization.

The innovation advanced by this research is a novel, integrated, probabilistic approach for the early and accurate detection of disease outbreaks that threaten public health. The proposed approach has significant potential to improve the information available to clinicians and public health officials, which can be expected to improve clinical and public health decision making, and ultimately to improve population health.