Develop Multi-modal Foundation Models for Sepsis Early Detection

Abstract: The PI's lab focuses on developing effective and scalable machine learning (ML), including deep learning, methodologies to address pressing challenges in healthcare. We have developed foundation models, self-supervised learning methods, multi-level optimization methods, interpretable ML methods, and large-scale distributed ML systems to analyze multi-modal, high-dimensional, and dynamic clinical data, including medical images, electronic health records (EHRs), and clinical notes, for medical decision support in diagnosis and treatment. In the next five years, we will develop accurate, efficient, and interpretable multi-modal foundation models for the early detection of sepsis.

Sepsis is a life-threatening condition in which the body's response to infection causes widespread inflammation, multiple organ failure, and, ultimately, death. Early detection and intervention are critical to reducing the risk of death and minimizing the extent of organ damage. A foundation model (FM) is a large-scale ML model, such as GPT-4, that is pre-trained on a vast dataset and can be fine-tuned for a wide range of specific tasks and applications. With their capacity to identify nuanced clinical patterns and signals in large-scale patient datasets, FMs hold immense potential for the early detection of sepsis. Nevertheless, developing FMs for this task presents significant challenges, including the scarcity of large-scale EHR data for pre-training, heterogeneity across data modalities, the prevalence of missing values and anomalies in patient records, a substantial risk of overfitting during fine-tuning, and a lack of interpretability, among other factors.

We aim to develop transformative ML methods to address these challenges. First, we will curate large-scale, high-quality EHR data for pre-training the FMs by developing self-supervised learning (SSL) methods, bi-level optimization-based methods, and multi-modal diffusion-based generative models for imputing missing values, detecting outliers, and synthesizing large-scale pre-training data. Second, we will learn effective representations of EHRs by developing multi-modal Transformer models that handle heterogeneous data modalities, capture long-range dependencies among clinical variables, and incorporate medical knowledge. Third, we will pre-train the multi-modal Transformer model on the curated large-scale EHR data by developing novel self-supervised pre-training methods, including a multi-modal masked data prediction method, a hierarchical SSL method, and an automated SSL approach. Fourth, we will fine-tune the pre-trained FM for sepsis early detection by developing new fine-tuning methods based on meta learning, multi-level optimization, and neural architecture search. Fifth, we will develop interpretable FMs to improve the trustworthiness of detection outcomes.

The proposed studies will be conducted on about 29 million patient records, representing the largest effort to date to study multi-modal FMs for sepsis. Our proposed research will democratize the early detection of sepsis by making pre-trained FMs accessible to a broader range of clinicians: smaller medical institutions that lack extensive computational infrastructure can leverage pre-trained FMs to jumpstart the development of in-house, specialized detection models. Moreover, the developed technologies extend beyond sepsis and can be applied to a broad range of clinical applications.
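
To make the data-curation thrust concrete, below is a minimal sketch of diffusion-based imputation for a vector of lab values, in the spirit of conditional denoising diffusion imputers: observed entries are kept clean while missing entries are diffused with Gaussian noise, and a network is trained to predict the injected noise. The NoisePredictor architecture, the noise schedule, all dimensions, and the masking setup are illustrative assumptions, not the proposal's actual design.

    import torch
    import torch.nn as nn

    T = 100                                         # number of diffusion steps
    betas = torch.linspace(1e-4, 0.02, T)           # linear noise schedule
    alphas_bar = torch.cumprod(1.0 - betas, dim=0)  # cumulative signal retention

    class NoisePredictor(nn.Module):
        def __init__(self, dim):
            super().__init__()
            # Inputs: noisy labs, observed labs, observation mask, timestep.
            self.net = nn.Sequential(nn.Linear(3 * dim + 1, 128), nn.ReLU(),
                                     nn.Linear(128, dim))

        def forward(self, x_t, x_obs, mask, t):
            t_feat = t.float().unsqueeze(-1) / T    # crude timestep encoding
            return self.net(torch.cat([x_t, x_obs, mask, t_feat], dim=-1))

    def imputation_loss(model, x, mask):
        # x: (B, D) fully observed labs (training data); mask: 1=observed, 0=missing.
        t = torch.randint(0, T, (x.shape[0],))
        noise = torch.randn_like(x)
        a = alphas_bar[t].unsqueeze(-1)
        x_noisy = a.sqrt() * x + (1 - a).sqrt() * noise   # forward diffusion
        x_t = mask * x + (1 - mask) * x_noisy             # keep observed entries clean
        pred = model(x_t, mask * x, mask, t)
        # Only the "missing" entries contribute to the denoising objective.
        return (((pred - noise) * (1 - mask)) ** 2).mean()

    model = NoisePredictor(dim=20)
    x = torch.randn(8, 20)                          # 8 patients, 20 lab variables
    mask = (torch.rand(8, 20) > 0.3).float()        # ~30% simulated missingness
    imputation_loss(model, x, mask).backward()

At inference, one would iteratively denoise the missing entries while repeatedly re-clamping the observed ones, yielding imputed values consistent with the observed context.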
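
The following is a minimal sketch of the multi-modal masked data prediction idea, assuming two simplified EHR modalities: a sequence of continuous lab/vital values and a sequence of clinical-note token IDs, fused by concatenation and encoded with a Transformer. The MaskedEHRPretrainer class, its dimensions, the 15% masking rate, and the omission of positional/time encodings are illustrative simplifications, not the proposal's specification.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MaskedEHRPretrainer(nn.Module):
        def __init__(self, vocab_size=5000, d_model=128, n_heads=4, n_layers=2):
            super().__init__()
            self.lab_proj = nn.Linear(1, d_model)              # embed scalar labs/vitals
            self.note_emb = nn.Embedding(vocab_size, d_model)  # embed note tokens
            self.mask_token = nn.Parameter(torch.zeros(1, 1, d_model))
            layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, n_layers)
            self.lab_head = nn.Linear(d_model, 1)              # reconstruct masked labs
            self.note_head = nn.Linear(d_model, vocab_size)    # predict masked tokens

        def forward(self, labs, notes, mask_rate=0.15):
            # labs: (B, T_lab) continuous values; notes: (B, T_note) token IDs.
            x = torch.cat([self.lab_proj(labs.unsqueeze(-1)),
                           self.note_emb(notes)], dim=1)       # early-fusion concat
            mask = torch.rand(x.shape[:2], device=x.device) < mask_rate
            x = torch.where(mask.unsqueeze(-1), self.mask_token.expand_as(x), x)
            h = self.encoder(x)
            n = labs.shape[1]
            lab_loss = F.mse_loss(
                self.lab_head(h[:, :n]).squeeze(-1)[mask[:, :n]], labs[mask[:, :n]])
            note_loss = F.cross_entropy(
                self.note_head(h[:, n:])[mask[:, n:]], notes[mask[:, n:]])
            return lab_loss + note_loss

    model = MaskedEHRPretrainer()
    labs = torch.randn(8, 24)                    # 8 patients, 24 hourly vitals
    notes = torch.randint(0, 5000, (8, 64))      # 8 patients, 64 note tokens
    model(labs, notes).backward()                # joint masked-prediction loss

Pairing a regression head for masked continuous values with a classification head for masked note tokens is one way a single self-supervised objective can span heterogeneous modalities.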
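
For the fine-tuning thrust, below is a minimal sketch of meta-learning-style fine-tuning: a MAML-like inner/outer loop in which the outer (query-set) loss is backpropagated through the inner adaptation step, a simple instance of bi-level optimization. The maml_finetune_step function, the stand-in encoder, and all hyperparameters are hypothetical; the proposal's actual multi-level optimization formulation may differ substantially.

    import torch
    import torch.nn as nn

    def maml_finetune_step(encoder, head, support, query, inner_lr=1e-2):
        # One meta-update: adapt the head on a support set (inner problem),
        # then evaluate the adapted head on a query set (outer problem).
        loss_fn = nn.BCEWithLogitsLoss()
        x_s, y_s = support
        inner_loss = loss_fn(head(encoder(x_s)).squeeze(-1), y_s)
        grads = torch.autograd.grad(inner_loss, list(head.parameters()),
                                    create_graph=True)    # keep graph for bi-level grads
        w, b = [p - inner_lr * g for p, g in zip(head.parameters(), grads)]
        x_q, y_q = query
        logits_q = encoder(x_q) @ w.t() + b               # apply the adapted linear head
        return loss_fn(logits_q.squeeze(-1), y_q)

    encoder = nn.Sequential(nn.Linear(32, 64), nn.ReLU()) # stand-in for the pre-trained FM
    head = nn.Linear(64, 1)                               # sepsis-onset classifier head
    opt = torch.optim.Adam(list(encoder.parameters()) + list(head.parameters()), lr=1e-3)
    support = (torch.randn(16, 32), torch.randint(0, 2, (16,)).float())
    query = (torch.randn(16, 32), torch.randint(0, 2, (16,)).float())
    opt.zero_grad()
    maml_finetune_step(encoder, head, support, query).backward()
    opt.step()

Because the outer gradient flows through the inner update, the model is optimized to adapt well from small labeled cohorts, one way to mitigate the overfitting risk noted above.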
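
Finally, as one generic route to interpretability, below is a sketch of gradient-based input attribution for a fine-tuned detector. This is a standard saliency technique rather than the proposal's specific interpretable-FM design, and the saliency helper is hypothetical.

    import torch
    import torch.nn as nn

    def saliency(model, x):
        # Per-feature importance: gradient magnitude of the risk score w.r.t. inputs.
        x = x.clone().detach().requires_grad_(True)
        model(x).sum().backward()          # sum over the batch yields a scalar score
        return x.grad.abs()

    model = nn.Linear(32, 1)                          # stand-in for a fine-tuned detector
    print(saliency(model, torch.randn(4, 32)).shape)  # torch.Size([4, 32]) importance map

Attributions of this kind can highlight which vitals, labs, or note tokens drove a given sepsis-risk score, supporting the trustworthiness goal of the fifth thrust.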