Study demonstrates the possibilities of a future speech prosthesis for humans
It is possible to re-create a bird’s song by reading only its brain activity, shows a first proof-of-concept study from the University of California San Diego. The researchers were able to reproduce the songbird’s complex vocalizations down to the pitch, volume and timbre of the original.
Published June 16 in Current Biology, the study lays the foundation for building vocal prostheses for individuals who have lost the ability to speak.
“The current state of the art in communication prosthetics is implantable devices that allow you to generate textual output, writing up to 20 words per minute,” said senior author Timothy Gentner, a professor of psychology and neurobiology at UC San Diego. “Now imagine a vocal prosthesis that enables you to communicate naturally with speech, saying out loud what you’re thinking nearly as you’re thinking it. That is our ultimate goal, and it is the next frontier in functional recovery.”
The approach that Gentner and colleagues are using involves songbirds such as the zebra finch. The connection to vocal prostheses for humans might not be obvious, but in fact, a songbird’s vocalizations are similar to human speech in various ways. They are complex, and they are learned behaviors.
“In many people’s minds, going from a songbird model to a system that will eventually go into humans is a pretty big evolutionary jump,” said Vikash Gilja, a professor of electrical and computer engineering at UC San Diego who is a co-author on the study. “But it’s a model that gives us a complex behavior that we don’t have access to in typical primate models that are commonly used for neural prosthesis research.”
The research is a collaboration between engineers and neuroscientists at UC San Diego, with the Gilja and Gentner labs working together to develop neural recording technologies and neural decoding strategies that draw on both teams’ expertise in neurobiological and behavioral experiments.
The team implanted silicon electrodes in male adult zebra finches and monitored the birds’ neural activity while they sang. Specifically, they recorded the electrical activity of multiple populations of neurons in the sensorimotor part of the brain that ultimately controls the muscles responsible for singing.
The researchers fed the neural recordings into machine learning algorithms. The idea was that these algorithms would be able to make computer-generated copies of actual zebra finch songs just based on the birds’ neural activity. But translating patterns of neural activity into patterns of sounds is no easy task.
“There are just too many neural patterns and too many sound patterns to ever find a single solution for how to directly map one signal onto the other,” said Gentner.
To accomplish this feat, the team used simple representations of the birds’ vocalization patterns. These are essentially mathematical equations modeling the physical changes (that is, changes in pressure and tension) that happen in the finches’ vocal organ, called the syrinx, when they sing. The researchers then trained their algorithms to map neural activity directly to these representations.
This approach, the researchers said, is more efficient than having to map neural activity to the actual songs themselves.
“If you need to model every little nuance, every little detail of the underlying sound, then the mapping problem becomes a lot more challenging,” said Gilja. “By having this simple representation of the songbirds’ complex vocal behavior, our system can learn mappings that are more robust and more generalizable to a wider range of conditions and behaviors.”
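To make the idea concrete, here is a minimal sketch of decoding a small set of control parameters from neural activity instead of decoding the sound itself. It is an illustration under stated assumptions, not the study's method: the data are synthetic, the two parameters merely stand in for quantities such as pressure and tension, and a plain ridge regression is swapped in for the machine learning algorithms the team used. The names `fit_ridge` and `make_synthetic` are hypothetical.

```python
import numpy as np


def fit_ridge(X, Y, lam=1e-2):
    """Closed-form ridge regression: W = (X^T X + lam * I)^-1 X^T Y."""
    n_features = X.shape[1]
    A = X.T @ X + lam * np.eye(n_features)
    return np.linalg.solve(A, X.T @ Y)


def make_synthetic(n_bins=500, n_units=32, seed=0):
    """Build synthetic data (an assumption; not recordings from the study).

    Returns (rates, params): simulated activity of n_units recording
    channels and two slowly varying control parameters that stand in
    for quantities such as pressure and tension.
    """
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, n_bins)
    params = np.column_stack([
        np.sin(2 * np.pi * 3 * t),  # hypothetical "pressure" trace
        np.cos(2 * np.pi * 5 * t),  # hypothetical "tension" trace
    ])
    # Each simulated unit responds to a random mix of the two parameters.
    mixing = rng.normal(size=(2, n_units))
    rates = params @ mixing + 0.1 * rng.normal(size=(n_bins, n_units))
    return rates, params


rates, params = make_synthetic()

# Fit on the first 400 time bins, evaluate on the remaining 100.
split = 400
W = fit_ridge(rates[:split], params[:split])
pred = rates[split:] @ W
mse = float(np.mean((pred - params[split:]) ** 2))
print(f"held-out mean squared error: {mse:.4f}")
```

The decoder here only has to predict two numbers per time bin, which is the point of the low-dimensional representation. In a full pipeline, the predicted parameters would then drive a model of the vocal organ to synthesize the sound, a step this sketch leaves out.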
The team’s next step is to demonstrate that their system can reconstruct birdsong from neural activity in real time.
Part of the challenge is that songbirds’ vocal production, like humans’, involves not just producing sound but also constantly monitoring the environment and the resulting auditory feedback. If you put headphones on humans, for example, and delay when they hear their own voice, disrupting just the temporal feedback, they’ll start to stutter. Birds do the same thing. They’re listening to their own song. They make adjustments based on what they just heard themselves singing and what they hope to sing next, Gentner explained. A successful vocal prosthesis will ultimately need to work on a timescale that is similarly fast and also be intricate enough to accommodate the entire feedback loop, including making adjustments for errors.
“With our collaboration,” said Gentner, “we are leveraging 40 years of research in birds to build a speech prosthesis for humans, a device that would not simply convert a person’s brain signals into a rudimentary set of whole words but give them the ability to make any sound, and so any word, they can imagine, freeing them to communicate whatever they wish.”
Paper: “Neurally driven synthesis of learned, complex vocalizations.” Co-authors include Ezequiel M. Arneodo, Shukai Chen and Daril E. Brown, all at UC San Diego.
This work was supported by the National Institutes of Health (grant R01DC018446), the Kavli Institute for the Brain and Mind (IRG no. 2016-004), the Office of Naval Research (MURI N00014-13-1-0205) and a Pew Latin American Fellowship in the Biomedical Sciences.
Declaration of interests: Vikash Gilja is a compensated consultant of Paradromics, Inc., a brain-computer interface company.