Deep Learning Based Natural Language Processing Markers of Anxiety and Depression - PROJECT SUMMARY / ABSTRACT
Major Depressive Disorder (MDD) and Generalized Anxiety Disorder (GAD) are among the primary causes of health burden worldwide. MDD is a leading cause of disability and is associated with increased mortality risk, and both MDD and GAD result in considerable economic costs, loss of functioning, and decreased quality of life. One of the biggest challenges in responding to current calls for population-level screening is monitoring MDD and GAD at a large scale while minimizing assessment burden. Existing assessment methods, however, rely on subjective measures, are based on diagnostic approaches, and become burdensome at the extent needed to characterize the heterogeneity of MDD and GAD, which would require combined evaluation of all symptoms. New methods are needed to accurately assess behavioral health, overcome barriers to monitoring and care, and advance the scientific understanding of depression and anxiety. The proposed study aims to address these gaps by deconstructing MDD and GAD into Digital Biomarkers (DB) based on linguistic features identified by large language models. State-of-the-art artificial intelligence and Natural Language Processing methods allow representation learning of DB from the cognitive and emotional domains captured in linguistic information. While effective, passive, at-scale monitoring is the primary benefit of DB, we will also use them to study relevant Research Domain Criteria (RDoC), including negative valence system reactions and positive valence traits.
The study goals are to: 1) Design DB of MDD and GAD symptoms using deep learning methods, by training an attention-based language model on a very large corpus of de-identified psychotherapy treatment transcripts; 2) Examine the preliminary performance and feasibility of the DB model in a highly characterized sample of MDD and GAD patients, and compare results with clinician ratings; 3) Explore improvements to the DB model based on research paradigms consistent with RDoC constructs, to further refine the DB model pipeline and its future deployment in clinical settings. The program of research and training described in this mentored patient-oriented research career development award aims to develop systematic digital health approaches that allow a dimensional conceptualization of MDD and GAD consistent with RDoC, enhancing the ease and consistency of detection to ultimately support targeted interventions. The proposed project is strongly supported by a multidisciplinary team, including mentorship from Drs. Naomi Simon and Kyunghyun Cho and domain expertise from Drs. Paul Glimcher, Tim Althoff, Zhe Chen, and Tanzeem Choudhury. The experience gained from the award will enable the pursuit of future R-level studies focused on advanced computational psychiatry approaches that further refine DB models, improve passive and objective assessment of behavioral health, and ultimately improve our empirical understanding of depression and anxiety.