A new technique could go a long way toward transforming brain activity into synthesized speech to truly restore the gift of gab to those who've lost the ability to talk.
Neuroscientists at UC San Francisco have created a brain-machine interface that interprets signals from the brain's speech center via a novel two-step process. Instead of translating brain signals directly into sounds, the researchers first convert the neural signals into the movements a person's vocal tract would use to create those sounds, and then digitally synthesize speech from those movements.
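The two-stage idea can be sketched in code. This is only a minimal illustration: the stand-in linear maps, array shapes, and names below are all assumptions for demonstration, not the trained decoders the researchers actually used.

```python
import numpy as np

# Illustrative sketch of the two-stage decoding described above.
# All dimensions and the random linear maps are placeholder assumptions.
rng = np.random.default_rng(0)

N_ELECTRODES = 64      # neural channels recorded over the speech cortex
N_ARTICULATORS = 33    # vocal-tract kinematic features (lips, tongue, jaw, larynx)
N_ACOUSTIC = 32        # acoustic features handed to a synthesizer

# Stage 1: neural activity -> vocal-tract movements (stand-in linear decoder).
W_kinematics = rng.standard_normal((N_ARTICULATORS, N_ELECTRODES)) * 0.1

# Stage 2: vocal-tract movements -> acoustic features (stand-in linear map).
W_acoustics = rng.standard_normal((N_ACOUSTIC, N_ARTICULATORS)) * 0.1

def decode_speech(neural_activity: np.ndarray) -> np.ndarray:
    """Run both stages on a (time, electrodes) array of neural features."""
    kinematics = neural_activity @ W_kinematics.T   # stage 1: brain -> movements
    acoustics = kinematics @ W_acoustics.T          # stage 2: movements -> sound features
    return acoustics  # in a real system, a vocoder would turn this into audio

# One second of simulated neural data at a hypothetical 200 Hz frame rate.
neural = rng.standard_normal((200, N_ELECTRODES))
features = decode_speech(neural)
print(features.shape)  # (200, 32)
```

The point of the intermediate kinematic stage is that vocal-tract movements are a more natural target for motor-cortex signals than raw audio, which is the insight the researchers credit for the improved synthesis.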
The result is artificial speech that's closer to the real thing, and at a pace that begins to approach a normal rate of conversation.
"We showed that explicitly simulating the movements of the participants' vocal tracts -- that includes the lips, tongue, jaw, larynx -- using a computer simulation... this may produce the optimal decoding of speech from brain activity," Edward Chang, a professor of neurological surgery at UCSF, told reporters Tuesday.
Last year, researchers reported progress on an earlier system that picks up on signals sent from the brain to the mouth and jaw.
The new system is being developed in Chang's lab, and the team's progress is outlined in a new paper published Wednesday in the journal Nature.
Researchers conducted the study with a handful of volunteers who already had temporary electrodes implanted in their brains in preparation for neurosurgery to treat epilepsy. They were asked to read several hundred sentences out loud while their brain activity was recorded. The data, along with audio recordings of the participants' speech, allowed the scientists to create a virtual vocal tract. This detailed computer simulation of the anatomy used to create speech could then be controlled by brain activity. The video below shows a few examples of the results.
"For the first time, this study demonstrates that we can generate entire spoken sentences based on an individual's brain activity," Chang said in a statement. "This is an exhilarating proof of principle that with technology that is already within reach, we should be able to build a device that is clinically viable in patients with speech loss."
Currently, many devices for people with severe speech disabilities require spelling out thoughts letter by letter, producing a maximum of 10 words per minute. But a system that can translate entire sentences could allow people to communicate much more rapidly, perhaps even at a speed approaching the 100-150 words per minute of natural speech.
"The authors' two-stage approach resulted in markedly less acoustic distortion," write biomedical engineers Chethan Pandarinath and Yahia H. Ali, who were not involved in the research, in a separate commentary also published in Nature. "However, many challenges remain ... The intelligibility of the reconstructed speech was still much lower than that of natural speech."
Josh Chartier, a co-author of the new study, maintains that the levels of accuracy produced by their system improve upon existing technologies, but acknowledges that there is still a way to go before it can perfectly mimic spoken language.
"We're quite good at synthesizing slower speech sounds like 'sh' and 'z' as well as maintaining the rhythms and intonations of speech and the speaker's gender and identity, but some of the more abrupt sounds like 'b's and 'p's get a bit fuzzy."
Another promising discovery is that the neural code for vocal movements isn't necessarily unique to every individual. This means that someone without natural speech may be able to control a speech prosthesis modeled on the voice of somebody else with intact speech.
"People who can't move their arms and legs have learned to control robotic limbs with their brains," Chartier said. "We are hopeful that one day people with speech disabilities will be able to learn to speak again using this brain-controlled artificial vocal tract."