There are plenty of things in my house that I yell at. Some of them answer back these days, though, and even do what I ask. My dog is still a work in progress as far as that goes, but my Amazon Echo has just about nailed it. The Echo is a device that uses speech recognition to perform an ever-growing range of tasks on command. Amazon calls the built-in brains of this device "Alexa," and she* is the thing that makes it work.
Alexa is a smart cookie: if I say "Alexa, play some Pink Floyd", she will find some Floyd and start playing it over the built-in speaker of the Echo. If I say "Alexa, what's the weather?" she will calmly tell me that it is too damn hot in Boston. How does she do this? The answer is that Alexa is a bit of a cheat: take the Echo apart and you'll find little more than a few speakers, microphones and a small computer. That isn't enough to do all of the clever stuff that she can do. Her real smarts are on the Internet, in the cloud-computing service run by Amazon.
The small computer in the Echo isn't completely dumb. It has enough built-in smarts to do a number of tasks, like playing back music and making lights blink. It can also recognize the Alexa name: when you say the word "Alexa", it recognizes the word (Amazon calls this the wake word) and starts recording your voice. When you have finished speaking, it sends this recording over the Internet to Amazon.
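To make that listen-then-record loop concrete, here is a minimal sketch in Python. It is a toy, not Amazon's firmware: it assumes the microphone input has already been turned into a stream of words, with `None` standing in for a pause, and the `capture_requests` function is a name invented for this illustration. The real device works on raw audio and spots the wake word with a dedicated detector.

```python
from typing import Iterable, List, Optional

WAKE_WORD = "alexa"  # the wake word described above


def capture_requests(heard: Iterable[Optional[str]]) -> List[List[str]]:
    """Simulate the on-device loop over a stream of heard words.

    Each item is a word the microphone picked up, or None for a pause.
    Returns the recordings that would be sent over the Internet.
    (Hypothetical helper; the word-stream input is an assumption.)
    """
    uploads: List[List[str]] = []
    recording: Optional[List[str]] = None  # None means "not recording"

    for word in heard:
        if recording is None:
            # Idle: ignore everything except the wake word.
            if word is not None and word.lower().strip(",.!?") == WAKE_WORD:
                recording = []
        elif word is None:
            # A pause ends the request; hand off whatever was recorded.
            if recording:
                uploads.append(recording)
            recording = None
        else:
            recording.append(word)

    # If the stream ends mid-request, send what was captured so far.
    if recording:
        uploads.append(recording)
    return uploads
```

Feed it `["nice", "day", None, "Alexa,", "what's", "the", "weather", None, "thanks"]` and only the words between the wake word and the next pause are kept for upload; everything said before "Alexa" or after the pause is ignored.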
The service that processes this recording is called the Alexa Voice Service (AVS). Run by Amazon, it converts the recording into commands and then interprets them. It's more than a simple voice-to-text service: it is a fully programmable service that can work with other online services to do a surprising range of things. Once authorized by Amazon, anyone can use this service for free to build a home-made Echo: Amazon offers sample code for building one using a Raspberry Pi, a simple $30 computer.
It might sound extremely selfless of Amazon to provide this service for free, but as always, they have their reasons: Amazon wants others to build this service into their products so they can sell you stuff. Every product that has Alexa built in is a device that can be used to buy stuff from Amazon.
The commands that Alexa interprets can be very simple: if you ask for the time, the AVS sends back an audio file of Alexa telling you the time, which the Echo plays back. They can also be more complex: if you ask Alexa to play Pink Floyd, the AVS will search the music service you have set up for Pink Floyd, then send a command back to the Echo that sets it playing the requested music.
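The two cases can be pictured as two kinds of reply arriving at the Echo. The sketch below is purely illustrative: the dictionary format, the `"type"`, `"audio"` and `"stream"` keys, and the `handle_response` function are all made up for this example and are not the actual messages the AVS sends.

```python
from typing import Dict, List


def handle_response(response: Dict[str, str]) -> List[str]:
    """Return the actions a device would take for one cloud reply.

    The message shape here is an invented stand-in, not the real
    AVS protocol.
    """
    kind = response.get("type")
    if kind == "speak":
        # Simple case: the cloud sends finished audio to play back.
        return [f"play audio clip: {response['audio']}"]
    if kind == "play_music":
        # Complex case: the cloud sends an instruction, and the
        # device fetches and plays the music itself.
        return [f"start streaming: {response['stream']}"]
    # Anything the device does not understand is dropped.
    return ["ignore unknown response"]
```

The point of the split is that the Echo never has to understand the question. It only has to know how to carry out a small set of instructions, and the cloud decides which one to send.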
Alexa can also work with other technologies in your home and beyond. If you have set up any Philips Hue smart light bulbs, for instance, she can control them. Ask Alexa to turn on the living room lights, and she will send a command to the Echo, which in turn tells the light bulbs to turn on. She can also work with online services. Link Alexa to Uber, for instance, and you can request an Uber by simply asking her. Link her to Domino's, and you can order a pizza with your voice.
This approach means that the Echo and Alexa can do a lot of things, and the list is getting longer: Amazon is adding more features (called skills) to Alexa, and a smart programmer can build their own. This means that you can use Alexa to control things that aren't on the supported list, and hackers have been hard at work doing things like adding support for controlling the media-center program Kodi and figuring out when the next bus will be arriving at your local bus stop.
However, this approach is also the Achilles' heel of Alexa: it needs internet connectivity and AVS to work. If your internet connection is slow or isn't working, Alexa won't be available, and your Echo will be useless. If Amazon decides to charge for the service (or just close it down), you'll be left with a useless device.
Amazon isn't the only game in town for this sort of thing: Google, Apple and Microsoft all offer services that can perform tasks by voice command. These use the same approach of processing voice commands in a cloud service (although the specifics vary), but most aren't as flexible or as integrated with other services as Alexa is.
Whichever one of these services ends up being the one we all use, I hope it will be as polite as Alexa. When I asked her how she works, she replied, "Lots of people have worked hard to teach me, and I'm still learning more." Wouldn't it be nice if all of our appliances were that modest and polite?
*I struggled with the best pronoun to use for Alexa: "it" sounds rather impersonal, but I am also not sure if "she" is appropriate for a disembodied voice from a computer network. However, it seemed like the best one to use. When I asked her, she responded, "I am female in character." Fair enough.