Ever since Alexa and Google Assistant first burst onto the scene and started populating people's homes with smart speakers and other gadgets outfitted with always-listening microphones, people have wondered whether anyone other than their AI assistant of choice was listening in.
Well, the answer is yes -- both Amazon and Google have admitted that they hire contractors to listen to anonymized user audio clips for the purpose of improving their respective assistants' capabilities.
That might have seemed like an obvious assumption to some, but to many, it was a wake-up call. That's true not just for Amazon and Google, but for all of the gadgets and services that need our data to function. What are these companies doing with our data? How are they protecting it? Are they sharing any of it with third parties?
What Amazon and Google say
"We only annotate an extremely small sample of Alexa voice recordings in order [to] improve the customer experience," an Amazon spokesperson said. "For example, this information helps us train our speech recognition and natural language understanding systems, so Alexa can better understand your requests, and ensure the service works well for everyone."
The spokesperson added that employees can't directly access identifying information about the people or accounts associated with the recordings.
"All information is treated with high confidentiality and we use multi-factor authentication to restrict access, service encryption, and audits of our control environment to protect it," the spokesperson said.
Meanwhile, Google chalks it all up to the complexities of building a fully capable, multilingual voice assistant.
"As part of our work to develop speech technology for more languages, we partner with language experts around the world who understand the nuances and accents of a specific language," David Monsees, product manager for Google Search, said in a blog post Thursday. "These language experts review and transcribe a small set of queries to help us better understand those languages. This is a critical part of the process of building speech technology, and is necessary to creating products like the Google Assistant."
Google adds that the audio samples these contractors listen to amount to about 0.2% of all recordings, and that user account details aren't associated with any of them.
"Reviewers are directed not to transcribe background conversations or other noises, and only to transcribe snippets that are directed to Google," Monsees said.
0.2% -- is that it?
Google's blog post specifically addresses audio that reviewers are listening to for the purpose of helping Google Assistant master a variety of languages, dialects and accents. But are there any other purposes for which Google or its contractors listen to user audio?
I asked a Google spokesperson that exact question, but did not receive an answer. Instead, the company reiterated that language experts review around 0.2 percent of all audio snippets. It did not address whether or not Google has any other purposes for listening to user audio outside of what's described in Monsees' blog post -- details Google only shared after one of those language experts provided Belgium-based VRT NWS with more than a thousand recordings of people using Google Home smart speakers and the Google Assistant app.
"We restrict access to personal information to Google employees, contractors, and agents who need that information in order to process it. Anyone with this access is subject to strict contractual confidentiality obligations and may be disciplined or terminated if they fail to meet these obligations."
As for Amazon, the Alexa FAQ page reads:
"...we use your requests to Alexa to train our speech recognition and natural language understanding systems. The more data we use to train these systems, the better Alexa works, and training Alexa with voice recordings from a diverse range of customers helps ensure Alexa works well for everyone."
That said, Amazon claims that the actual percentage of audio recordings the company listens to and transcribes is very small, and similar to what Google pegs it at.
"We annotate a fraction of one percent of interactions from a random set of customers to improve the Alexa experience for customers," the spokesperson tells me.
As with Google, I also asked if there were any other instances outside of these where Amazon employees would listen to a user's audio recordings. Amazon's answer: "No."
What about third parties? Is my voice data being shared?
Good question. Let's start with Google.
The company has a multitude of different posts that talk about its approach to privacy for various Google services, and there's a lot to mine through in order to find clear answers. In some cases, the text is confusing.
One instance occurs on a page for Google Nest services outlining the company's commitment to privacy -- a separate page from the Google or Google Assistant privacy policies. Google explains that the guide is there "to explain as clearly and simply as we can both how our connected home devices and services work, and also how we'll uphold our commitment to respect your privacy."
A few paragraphs later, the page reads:
"...we commit to you that for all our connected home devices and services, we will keep your video footage, audio recordings, and home environment sensor readings separate from advertising, and we won't use this data for ad personalization. When you interact with your Assistant, we may use those interactions to inform your interests for ad personalization."
Read back to back, those sentences seem to contradict each other. Google won't use audio recordings for ad personalization, but when you use the Assistant, Google may use those interactions "to inform your interests for ad personalization." So which is it? Does using the Google Assistant impact the ads you see or doesn't it?
"We don't share information that personally identifies you with advertisers, such as your name or email, unless you ask us to. For example, if you see an ad for a nearby flower shop and select the 'tap to call' button, we'll connect your call and may share your phone number with the flower shop."
What does that mean for Google Assistant audio recordings, though? If I ask where the nearest flower shop is, am I going to be added to an anonymized list of people who might be interested in buying flowers? Will that list ever be shared with a marketing company for online bouquet deliveries that would then market to me?
"While we may use your interactions to inform your interests for ads personalization, this scenario would not happen," Google tells me. "A third party could not send you a coupon based on your interaction with the Assistant."
"We do not sell your personal information to anyone," the company adds. "This includes your Assistant queries or interests derived from those queries with advertisers."
"User control is very important to us," says Google. "You can always review your Google settings to control the ads you see, including opting out of ad personalization completely."
What about Amazon?
"No audio recordings are shared with third parties," an Amazon spokesperson tells me. "If you use a third party service through Alexa, we will exchange related information with that third party so they may provide the service. For example, if you interact with a third party Alexa skill, we provide the content of your requests (but not the voice recordings) to the skill so the skill can respond accordingly."
Those are two of the most common privacy-related questions facing Alexa today. A post titled "Alexa, Echo Devices, and Your Privacy" ought to address them.
Same goes for Amazon's Alexa FAQ page. Along with not providing any of the same specifics Amazon shared with us in April about when and why contractors or employees might listen to your Alexa audio, the FAQ offers no clear answers about the kind of Alexa data Amazon might be sharing with advertisers.
The only reference to advertisements in the FAQ is the blanket statement, "We also do not sell children's personal information for advertising or other purposes," along with a link to Amazon's Children's Privacy Disclosure.
The overall Amazon privacy page doesn't make much mention of Alexa except for one reference to "Alexa internet" in a long paragraph listing the types of data Amazon collects. However, the page does describe Amazon's approach to sharing the information it collects with third parties. This includes sharing information for the purpose of promotional offers.
"Sometimes we send offers to selected groups of Amazon.com customers on behalf of other businesses. When we do this, we do not give that business your name and address," the page reads.
An Amazon spokesperson offered more of an explanation of how your Alexa usage can impact what ads you see, and what controls you have over that.
"The experience on Alexa is similar to what you'd see on the Amazon website or Amazon app," the spokesperson said. "For example, if you make a purchase via Alexa shopping, that purchase may be used to provide personalized ads, similar to what you'd see if you purchased something on the website. You can opt-out of receiving personalized ads from Amazon at any time."
Should I chuck these things out the window?
That seems excessive. I don't blame anyone who doesn't want to fill their house with cameras and microphones, but I also don't blame anyone who's willing to trade some of their data with a company they feel comfortable with in order to bring some new convenience and utility into their lives. It's nearly impossible to navigate today's age without making trades like that on a daily basis.
In the meantime, I think the correct way to think about this is to assume that anything you say to your digital assistant might very well be heard by someone else in the future. After all, these companies are collecting and retaining voice recordings and transcripts. That's not for your benefit; it's for theirs.
The real question with all of this is whether or not your privacy is being harmed. Personally, I don't have a problem with an Amazon or Google employee or contractor listening to an anonymized recording of me saying "turn off the dining room" to try and figure out why the assistant thought I said "turn off the dynamo." It's similar to the way an employee at Sony might review my PlayStation usage after a game crashes to figure out what went wrong and help prevent it from happening again.
The difference is that when my video game crashes, my PS4 asks for my permission to take a look at the crash report. Amazon and Google would argue that they do that, too -- but it's a blanket permission that users blindly agree to when they accept the sprawling user agreements during initial device setup. In today's age, I'd argue that's not good enough. At a minimum, clearer language in the app during setup about when, why and how other humans might eventually need to listen to your audio would likely help a lot of users feel better about tapping "accept."
As for data sharing, companies like Amazon and Google also ought to do a better job of describing their practices -- not just in dense legalese buried deep within one of several different privacy statements, but in straightforward, easy-to-find terms that people can actually understand. Perhaps they're worried that doing so might scare potential users away from their platforms. If that's the case, then maybe that wake-up call was long overdue.