Bridge2AI: Voice as a Biomarker of Health - Building an ethically sourced, bioacoustic database to understand disease like never before

Our group aims to integrate the use of voice as a biomarker of health into clinical care by generating a substantial, multi-institutional, ethically sourced, and diverse voice database linked to multimodal health biomarkers. The database will fuel voice AI research and support predictive models that assist in the screening, diagnosis, and treatment of a broad range of diseases. Data will be collected through a smartphone application linked to electronic health records (EHR) and other health biomarkers such as radiomics and genomics, and supported by federated learning technology to protect data privacy.

Based on the existing literature and ongoing research in different fields of voice research, our group has identified 5 disease categories in which voice changes have been associated with specific diseases, and around which we aim to center the data acquisition efforts:

1. Vocal Pathologies (Laryngeal cancers, Vocal fold paralysis, Benign laryngeal lesions)
2. Neurological and Neurodegenerative Disorders (Alzheimer’s, Parkinson’s, Stroke, ALS)
3. Mood and Psychiatric Disorders (Depression, Schizophrenia, Bipolar Disorders)
4. Respiratory Disorders (Pneumonia, COPD, Heart Failure, OSA)
5. Pediatric Diseases (Autism, Speech Delay)

Specific Aim #1: Data Acquisition Module
- To build a multimodal, multi-institutional, large-scale, diverse, and ethically sourced human voice database that is linked to other biomarkers of health and is AI/ML friendly, to fuel voice AI research

Specific Aim #2: Standard Module
- To introduce the field of acoustic biomarkers by developing new standards of acoustic and voice data collection and analysis for voice AI research

Specific Aim #3: Tool Development and Optimization
- To develop software and cloud infrastructure for automated voice data collection through a smartphone application that allows non-invasive, user-friendly, high-quality voice data collection while minimizing human manipulation. This will include integrated acoustic amplifiers and acoustic quality standardization.
- To implement federated learning technology to allow analysis of multi-institutional data while minimizing data sharing and preserving patient privacy

Specific Aim #4: Ethics Module
- To integrate existing scholarship, tools, and guidance with the development of new standards and normative insights for identifying, anticipating, addressing, and providing guidance on ethical and trustworthiness issues, from voice data generation and AI/ML research and development to clinical adoption and downstream health decisions and outcomes
- To develop new guidelines for consenting to voice data collection, sharing, and utilization in the context of voice AI technology

Specific Aim #5: Teaming Module
- To build bridges between the medical voice research world, acoustic engineers, and the AI/ML world to promote the integration of tangible clinical applications for voice AI algorithms

Specific Aim #6: Skills and Workforce Development Module
- To develop a unique curriculum on voice biomarkers of health and on the development, validation, and implementation of AI models that are FAIR and CARE
- To create a community of voice AI researchers, especially those from underserved communities, and foster collaborations to promote the application of ML to voice research
- To engage a broad range of learners with competency assessment and mentorship