Nearly 7.5 million people live without the ability to vocalize effectively. Existing augmentative and alternative
communication (AAC) technology provides some communicative function for these individuals, typically by converting
physical gestures, eye movements, or text into words that can be acoustically synthesized or visually displayed.
However, a key limitation of these devices is that they do not involve natural mechanisms of speech production
and therefore can be less intuitive as substitutes for the human vocal system. Consequently, they can suffer from
lexical ambiguity, lack of emotional expression, and difficulty in conveying intent. There remains an unmet need
to restore the natural mechanisms of speech production for the vocally impaired. To meet this need, we propose
to develop a first-of-its-kind AAC system that restores personalized, prosodic, near real-time vocalization based
on surface electromyographic (sEMG) signals produced during subvocal (i.e., silently mouthed) speech. In Phase
I, we demonstrated the ability to recognize orthographic content and categorize emphatic stress in phrases
subvocalized by control (n=4) and post-laryngectomy (n=4) participants, achieving a 96.3% word recognition rate
and a 91.2% emphatic stress discrimination rate, respectively. Transcripts from the subvocal speech corpus were
synthesized into prosodic speech using personalized digital voices unique to each participant and then evaluated
by naïve listeners (n=12). Listeners consistently rated our sEMG-based digital voice as having greater
intelligibility, acceptability, emphasis discriminability, and vocal affinity than the state-of-the-art electrolarynx (EL) speech aid
used by laryngectomees. Having achieved these capabilities with lengthy post-processing of single phrases, we
now aim to advance this technology in Phase II by solving the more fundamental challenges of transcribing
prosodic speech and tracking variations in intonation and timing in near real time to restore conversational
interactions in everyday life. To achieve this goal, our team of engineers at Altec Inc. is partnering with the
world’s leading provider of personalized digitized voice for AAC (VocaliD, Inc.) and world-class laryngeal
cancer clinical experts (Massachusetts General Hospital) to develop algorithms for transcribing prosodic speech
and tracking variations in intonation and timing throughout narratives, monologues, and conversations (Aim 1);
design the MyoVoice™ system for near real-time mobile use (Aim 2); and evaluate the prototype system for
conversational efficacy (Aim 3). Our milestone is to demonstrate within-subject improvements in ease of use,
functional efficacy, and social reception among post-laryngectomy participants using our sEMG-based digital
voice when compared to their typical EL speech aid. The final deliverable will consist of a single 4-contact sensor
veneer and cross-platform, near real-time mobile software that can operate on an AAC tablet or mobile device.
Once this device is commercialized, our vision is that a person facing the devastating need to undergo
laryngectomy could have their voice banked and subvocal models trained so that, immediately following surgery,
they can receive a custom MyoVoice™ system that restores their original voice.