Google I/O amps up accessibility with Live Caption, other projects

The tech giant is using AI and voice recognition technology to help people with disabilities live more independent and autonomous lives.

Marguerite Reardon Former senior reporter

Google is using AI technology to help people with speech impairments more easily communicate.


Google is using advances in AI and voice recognition to design new products and apps intended to make life easier for people with disabilities. It highlighted some of that work Tuesday at its annual I/O developer conference.

During his keynote address, Google CEO Sundar Pichai demonstrated the new Live Caption feature, enabled by Android Q, which transcribes in real time any video or audio playing on your phone. Live Caption can work in the background while you watch YouTube, listen to podcasts or video chat via Skype. It will even work with audio and video you record.

Pichai also highlighted three new efforts that address the accessibility challenges for people with disabilities. Project Euphonia uses AI to help people with speech impairments; Live Relay allows people who are deaf or hard of hearing to make phone calls; and Project Diva makes voice-activated assistants more accessible to people who don't speak.

Google has been working on accessibility issues for some time now. For example, its Maps team has local guides who scout out places with ramps and entrances for people in wheelchairs. Last year at the I/O developer conference, Google announced the Android Lookout app, which helps the visually impaired by giving spoken clues about the objects, text and people around them. 

"Building for everyone means ensuring that everyone can access our products," Pichai said during the keynote. "We believe technology can help us be more inclusive, and AI is providing us with new tools to dramatically improve the experience for people with disabilities."

Here's a closer look at Live Caption and the other accessibility projects announced at I/O.


Live Caption

Live Caption is enabled by a breakthrough that allows machine-learning processing to run on the device itself. All the audio is processed on the phone, so no data has to be sent over a wireless network to the cloud, which makes the transcription both faster and more secure. The feature works even if your volume is turned down or muted. But the transcription can't be saved: It appears on the screen only while the content is playing, so you can't review it later.

While the feature was designed with the deaf community in mind, Pichai noted that it can benefit anyone in circumstances where turning up the volume isn't an option, such as watching a video on a noisy subway or during a meeting.

Project Euphonia 

This project uses artificial intelligence to train computers to understand impaired speech patterns. Most of us take for granted that when we speak, others will understand us. But for millions of people affected by neurological conditions such as stroke, ALS, multiple sclerosis, traumatic brain injuries or Parkinson's disease, trying to communicate and not being understood can be extremely difficult and frustrating.

Google is working on a fix that can train computers and mobile phones to better understand people with impaired speech. The company has partnered with the nonprofit organizations ALS Therapy Development Institute and ALS Residence Initiative to record the voices of people who have ALS. Google's software takes these recorded voice samples and turns them into a spectrogram, or a visual representation of the sound. A computer then uses common transcribed spectrograms to train the system to better recognize this less common type of speech.
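To make the spectrogram step concrete, here is a minimal Python sketch of how a sound clip can be turned into a grid of frequency strengths over time. It is an illustration only, not Google's code: the function names, the frame size, the hop length and the synthetic 1 kHz test tone are all assumptions, and a real system would work on recorded speech with a much faster transform.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Return one list of magnitudes (one per frequency bin) for each time frame."""
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        # A Hann window tapers the frame edges to reduce spectral leakage.
        windowed = [
            s * (0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1)))
            for n, s in enumerate(frame)
        ]
        # Naive discrete Fourier transform, keeping non-negative frequencies only.
        magnitudes = []
        for k in range(frame_size // 2 + 1):
            total = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(windowed)
            )
            magnitudes.append(abs(total))
        frames.append(magnitudes)
    return frames


def sine_wave(freq_hz, sample_rate, duration_s):
    """Generate a pure tone (hypothetical stand-in for a recorded voice sample)."""
    count = round(sample_rate * duration_s)
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate) for n in range(count)]


SAMPLE_RATE = 8000  # assumed sample rate in Hz
tone = sine_wave(1000, SAMPLE_RATE, 0.05)
spec = spectrogram(tone)

# Find the strongest frequency bin in the first frame and convert it to Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * SAMPLE_RATE / 64
```

Each row of `spec` is one slice of time and each column is one frequency band, which is the "picture" of the sound. For the test tone, the strongest band in the first frame lines up with the tone's 1 kHz pitch.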

Currently, the AI algorithms only work for English speakers and only for impairments typically associated with ALS. But Google hopes the research can be applied to larger groups of people and to different speech impairments.

The company is also training personalized AI algorithms to detect sounds or gestures, which can then take actions, such as generating spoken commands to Google Home or sending text messages. This may be particularly helpful to people who cannot speak at all.

Project Diva

Digital assistants like Google Home let you play a favorite song or movie with just a simple voice command. But for people with disabilities who may not speak, this technology is inaccessible.

Lorenzo Caggioni, a strategic cloud engineer at Google based in Milan, decided to change that. Lorenzo was inspired by his brother Giovanni, who was born with congenital cataracts, Down syndrome and West syndrome and who is nonverbal. Giovanni loves music and movies, and like many other 21-year-olds likes using the latest gadgets and technology. But because of his disability, he's unable to give the "OK Google" command to activate his Android phone or Google Home device.

In an effort to give his brother more independence and autonomy, Lorenzo and some colleagues in the Milan Google office set up Project Diva to create a device that would trigger commands to the Google Assistant without Giovanni having to use his voice. They created a button that plugs into the wired headphone jack of a phone, laptop or tablet, which can then connect via Bluetooth to a Google Home device.

Now, by simply touching a button with his hand, Giovanni can listen to music on the same devices and services as his friends and family.

Lorenzo said that the device he created for Giovanni is just the start. The team plans to attach RFID tags to objects, each associated with a command, so that people who don't speak can access other things via Google Assistant.


This drawing illustrates how the technology created in Project Diva can be used to provide alternative inputs to a device powered by the voice-activated Google Assistant. 


Live Relay 

This project helps people who are deaf or hard of hearing to make and receive phone calls. Using on-device speech recognition and text-to-speech conversion, the software allows the phone to listen and speak on the users' behalf while they type. Because the responses are instant and use predictive writing suggestions, the typing is fast enough to hold a synchronous phone call.

But Live Relay isn't just for people who are unable to hear or speak. It can also be used by people who are in a meeting or on the subway and can't take a call but are able to type instead. Google is also looking at integrating real-time translation capability, so that you could potentially call anyone in the world and communicate regardless of language barriers.

"An important way we drive our technology forward is building products that work better for all of us," Pichai said in his keynote.