Appearing Nov. 6 in the journal Nature Communications, the new technology might one day help people unable to talk due to neurological disorders regain the ability to communicate through a brain-computer interface.
“There are many patients who suffer from debilitating motor disorders, like ALS (amyotrophic lateral sclerosis) or locked-in syndrome, that can impair their ability to speak,” said Gregory Cogan, Ph.D., a professor of neurology at Duke University’s School of Medicine and one of the lead researchers involved in the project. “But the current tools available to allow them to communicate are generally very slow and cumbersome.”
Imagine listening to an audiobook at half-speed. That’s the best speech decoding rate currently available, which clocks in at about 78 words per minute. People, however, speak around 150 words per minute.
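To make the half-speed comparison concrete, here is a back-of-the-envelope sketch using only the two rates quoted above. The variable and function names are illustrative, not taken from the study.

```python
# Rates quoted in the article (words per minute).
DECODED_WPM = 78    # best speech decoding rate currently available
NATURAL_WPM = 150   # approximate rate of natural speech


def seconds_to_convey(words: int, words_per_minute: float) -> float:
    """Time in seconds needed to get `words` words across at a given rate."""
    return words / words_per_minute * 60


# Fraction of natural speaking speed that decoding currently reaches.
speed_ratio = DECODED_WPM / NATURAL_WPM

# How long a 150-word passage takes when spoken versus when decoded.
spoken_seconds = seconds_to_convey(150, NATURAL_WPM)
decoded_seconds = seconds_to_convey(150, DECODED_WPM)

print(f"Decoding runs at {speed_ratio:.0%} of natural speed")
print(f"150 words: {spoken_seconds:.0f} s spoken vs. {decoded_seconds:.0f} s decoded")
```

At these rates, a passage that takes one minute to say aloud takes nearly two minutes to decode.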
The lag between spoken and decoded speech rates is partially due to the relatively few brain activity sensors that can be fused onto a paper-thin piece of material that lies atop the surface of the brain. Fewer sensors provide less decipherable information to decode.
To improve on past limitations, Cogan teamed up with fellow Duke Institute for Brain Sciences faculty member Jonathan Viventi, Ph.D., whose biomedical engineering lab specializes in making high-density, ultra-thin, and flexible brain sensors.
For this project, Viventi and his team packed an impressive 256 microscopic brain sensors onto a postage stamp-sized piece of flexible, medical-grade plastic. Neurons just a grain of sand apart can have wildly different activity patterns when coordinating speech, so it’s necessary to distinguish signals from neighboring brain cells to help make accurate predictions about intended speech.
After fabricating the new implant, Cogan and Viventi teamed up with several Duke University Hospital neurosurgeons, including Derek Southwell, M.D., Ph.D., Nandan Lad, M.D., Ph.D., and Allan Friedman, M.D., who helped recruit four patients to test the implants. The experiment required the researchers to place the device temporarily in patients who were undergoing brain surgery for some other condition, such as treating Parkinson’s disease or having a tumor removed. Time was limited for Cogan and his team to test-drive their device in the OR.
“I like to compare it to a NASCAR pit crew,” Cogan said. “We don’t want to add any extra time to the operating procedure, so we had to be in and out within 15 minutes. As soon as the surgeon and the medical team said ‘Go!’ we rushed into action and the patient performed the task.”
The task was a simple listen-and-repeat activity. Participants heard a series of nonsense words, like “ava,” “kug,” or “vip,” and then spoke each one aloud. The device recorded activity from each patient’s speech motor cortex as it coordinated nearly 100 muscles that move the lips, tongue, jaw, and larynx.
Afterwards, Suseendrakumar Duraivel, the first author of the new report and a biomedical engineering graduate student at Duke, took the neural and speech data from the surgery suite and fed it into a machine learning algorithm to see how accurately it could predict what sound was being made, based only on the brain activity recordings.
For some sounds and participants, like /g/ in the word “gak,” the decoder got it right 84% of the time when it was the first sound in a string of three that made up a given nonsense word.
Accuracy dropped, though, as the decoder parsed out sounds in the middle or at the end of a nonsense word. It also struggled if two sounds were similar, like /p/ and /b/.
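One way to picture how accuracy can be broken down by a sound's position in the word is the minimal sketch below. The trial data are hypothetical toy values invented for illustration, not results from the study, and the function name is an assumption rather than the researchers' code.

```python
from typing import List, Sequence


def position_accuracy(
    actual: Sequence[Sequence[str]], decoded: Sequence[Sequence[str]]
) -> List[float]:
    """Return the fraction of trials decoded correctly at each sound position."""
    if not actual or len(actual) != len(decoded):
        raise ValueError("need the same, non-zero number of actual and decoded trials")
    n_positions = len(actual[0])
    correct = [0] * n_positions
    for truth, guess in zip(actual, decoded):
        if len(truth) != n_positions or len(guess) != n_positions:
            raise ValueError("every trial must have the same number of sounds")
        for i in range(n_positions):
            if truth[i] == guess[i]:
                correct[i] += 1
    return [count / len(actual) for count in correct]


# HYPOTHETICAL toy trials: each nonsense word is a string of three sounds.
actual = [("g", "a", "k"), ("k", "u", "g"), ("v", "i", "p"), ("a", "v", "a")]
# HYPOTHETICAL decoder output, including a /p/ vs. /b/ mix-up in the third trial.
decoded = [("g", "a", "k"), ("k", "u", "k"), ("v", "a", "b"), ("a", "v", "a")]

per_position = position_accuracy(actual, decoded)
overall = sum(per_position) / len(per_position)

print("Accuracy by position:", per_position)
print("Overall accuracy:", overall)
```

In this toy data the first sound is always decoded correctly while later sounds are missed more often, mirroring the kind of pattern described above.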
Overall, the decoder was accurate 40% of the time. That may seem like a humble test score, but it was quite impressive given that similar brain-to-speech technical feats require hours’ or days’ worth of data to draw from. The speech decoding algorithm Duraivel used, however, was working with only 90 seconds of spoken data from the 15-minute test.
Duraivel and his mentors are excited about making a cordless version of the device with a recent $2.4M grant from the National Institutes of Health.
“We’re now developing the same kind of recording devices, but without any wires,” Cogan said. “You’d be able to move around, and you wouldn’t have to be tied to an electrical outlet, which is really exciting.”
While their work is encouraging, there’s still a long way to go before Viventi and Cogan’s speech prosthetic hits the shelves.
“We’re at the point where it’s still much slower than natural speech,” Viventi said in a recent Duke Magazine piece about the technology, “but you can see the trajectory where you might be able to get there.”
This work was supported by grants from the National Institutes of Health (R01DC019498, UL1TR002553), Department of Defense (W81XWH-21-0538), Klingenstein-Simons Foundation, and an Incubator Award from the Duke Institute for Brain Sciences.
CITATION: “High-resolution Neural Recordings Improve the Accuracy of Speech Decoding,” Suseendrakumar Duraivel, Shervin Rahimpour, Chia-Han Chiang, Michael Trumpis, Charles Wang, Katrina Barth, Stephen C. Harward, Shivanand P. Lad, Allan H. Friedman, Derek G. Southwell, Saurabh R. Sinha, Jonathan Viventi, Gregory B. Cogan. Nature Communications, November 06 2023. DOI: 10.1038/s41467-023-42555-1