Amazon Alexa transcripts live on, even after you delete voice records

You can delete voice recordings so Amazon can't listen to your conversations with Alexa anymore, but text records are a different story.

Alfred Ng Senior Reporter / CNET News
Alfred Ng was a senior reporter for CNET News. He was raised in Brooklyn and previously worked on the New York Daily News's social media and breaking news teams.

While you can delete the voice recordings Amazon keeps, the text records stay.

Chris Monroe/CNET

Amazon doesn't need to hear your voice recordings to know what you've said. It can read them.

After Alexa hears its wake word -- which can vary from "Echo" to "Alexa" to "computer" -- the smart assistant starts listening and transcribes everything it hears. That's why when you check your Alexa dialogue history, you can see text next to the recordings like "How's the Weather" and "Set an Alarm."

Amazon lets you delete those voice recordings, giving you a false sense of privacy. But the company still has that data, just not as a sound bite. It keeps the text logs of the transcribed audio on its cloud servers, with no option for you to delete them.

Amazon said it erases the text transcripts from Alexa's "main system," but is working on removing them from other areas where the data can travel.

"When a customer deletes a voice recording, we also delete the corresponding text transcript associated with their account from our main Alexa systems and many subsystems, and have work underway to delete it from remaining subsystems," an Amazon spokesperson said in an email.
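To make the gap in that statement concrete, here is a toy model of a delete request that reaches some data stores but not others. It is only an illustration: the class, the store names and the `supports_delete` flag are all hypothetical, and nothing here reflects how Amazon's systems are actually built.

```python
class InteractionStore:
    """Hypothetical store holding text transcripts keyed by record ID."""

    def __init__(self, name, supports_delete=True):
        self.name = name
        self.supports_delete = supports_delete
        self.records = {}

    def save(self, record_id, text):
        self.records[record_id] = text

    def delete(self, record_id):
        # A store without delete support silently keeps its copy.
        if not self.supports_delete:
            return False
        self.records.pop(record_id, None)
        return True


def delete_everywhere(stores, record_id):
    """Request deletion in every store; return the names still holding the record."""
    for store in stores:
        store.delete(record_id)
    return [store.name for store in stores if record_id in store.records]


main = InteractionStore("main")
training_copy = InteractionStore("training_copy", supports_delete=False)
for store in (main, training_copy):
    store.save("r1", "set an alarm")

leftover = delete_everywhere([main, training_copy], "r1")
print(leftover)  # ['training_copy']
```

In this sketch the customer's view (the main store) is empty after the request, while a copy survives wherever deletion was never wired up.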

The new finding comes as privacy concerns have reached a boiling point, with people scrutinizing the tech they use more than ever. People want privacy from tech giants, and are finding that the options companies offer are not really doing the trick. In April, Facebook admitted it still tracked people after they deactivated their accounts.

"Here's what I tell all of our business executives and consumers: 'Delete' is never really 'delete,'" said Theresa Payton, a former White House chief information officer and founder of cybersecurity company Fortalice. "Delete just means that you can't see it anymore."

On Thursday, a group of 19 consumer and public health advocates filed a complaint with the Federal Trade Commission claiming that the Amazon Echo Dot Kids Edition was retaining children's data even after parents deleted the voice recordings. The data stored on Alexa's "Remember" feature wasn't deleted until the parents called customer service to delete the entire profile.

"Amazon markets Echo Dot Kids as a device to educate and entertain kids, but the real purpose is to amass a treasure trove of sensitive data that it refuses to relinquish even when directed to by parents," said Josh Golin, executive director of the Campaign for a Commercial-Free Childhood.

In a statement, Amazon said the Echo Dot Kids Edition is compliant with the Children's Online Privacy Protection Act.

While Facebook has drawn much of the attention for the ways it's gobbled up our personal data, Amazon has increasingly inserted itself into our lives. The company has sold more than 100 million Alexa devices, and it's sitting on a massive amount of text data containing details on people's habits and behaviors that isn't deleted. Amazon's smart speakers are also the most popular choice for buyers.

Amazon Echo dominates with about 70% of the market, while Google Home has about 24% and the Apple HomePod is next at 6%. Google and Apple said they don't keep transcript data indefinitely.

A Google spokesman said both the audio and the text entry are removed when a person deletes that data. For Apple, which uses Siri as a voice assistant, the company said voice recordings are never associated with a person or an account, and are tied to a random identifier that you can delete.

"When you turn Siri and Dictation off, Apple will delete the User Data associated with your Siri identifier, and the learning process will start all over again," Apple said on its website.

This retention doesn't just apply to Amazon's own smart speakers -- any third-party device using Alexa as an assistant would be sending that data to Amazon, and people wouldn't be able to delete it. That includes voice data sent to Facebook Portal, a smart speaker released by the social network in November.

Facebook said it deletes the data and transcribed text for its smart assistant when it's activated through the wake word "Hey Portal." But when it comes to interactions with Alexa on the Portal, that's a different story.

"Facebook does not have access to interactions with Alexa on Portal," a Facebook spokeswoman said in an email.

Amazon transcribes your voice data to text through a process it calls Automatic Speech Recognition, which then sends it to another process called the Natural Language Understanding System. The NLU system uses artificial intelligence to figure out what people really mean -- so if you're asking "how is it outside," the system can infer that you mean to ask about the weather.
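As a rough picture of that two-step flow, the sketch below passes a request through a stand-in transcription step and then a keyword-based intent guess. Everything in it is an assumption made for illustration: the function names, the keyword table and the intent labels are hypothetical, and a production system would rely on trained models rather than keyword matching.

```python
from dataclasses import dataclass

# Hypothetical keyword table standing in for a trained language model.
INTENT_KEYWORDS = {
    "get_weather": ("weather", "outside", "rain", "temperature"),
    "set_alarm": ("alarm", "wake me"),
}


@dataclass
class Interpretation:
    transcript: str
    intent: str


def transcribe(utterance: str) -> str:
    """Stand-in for speech recognition: the 'audio' here is already text."""
    return utterance.strip().lower()


def infer_intent(transcript: str) -> Interpretation:
    """Guess the intent by looking for known keywords in the transcript."""
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(keyword in transcript for keyword in keywords):
            return Interpretation(transcript, intent)
    return Interpretation(transcript, "unknown")


result = infer_intent(transcribe("How is it outside"))
print(result.intent)  # get_weather
```

Note that the text transcript, not the audio, is what the second step consumes, which is why the text is the part worth keeping from a system's point of view.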

In a white paper on Alexa privacy and data handling published in July, Amazon said text data was stored "for machine learning purposes." Amazon doesn't delete that data until the machine learning training is completed. The company didn't clarify how long that process takes.

Amazon also keeps text records when people set reminders -- so even when the voice recording is deleted, Alexa is still able to send reminders to people based on the text record. Your order history through Alexa also remains, even if you delete the voice recording, the company said.

Beyond the data transcribed from a person's voice commands to Alexa, Amazon also noted that it stored text data on the smart assistant's responses.

In the same document, Amazon stated: "The response can be used by the Amazon team who built the specific skill to ensure that Alexa is providing relevant answers to queries and that the (Text-to-Speech) system is properly translating the text to speech."

The response isn't your voice or anything you said, but it doesn't take much to work out the question from a log of Alexa saying "the weather in New York is cloudy this morning."