Should Siri be jealous of voice recognition competitors?

While Siri isn't the first voice-command service, it is certainly the most popular. Plenty of rivals, however, have raised their game at CES.

Roger Cheng Former Executive Editor / Head of News
Voice command services such as this one from Vlingo are starting to get more popular thanks to Siri. So what's next? (Credit: Vlingo)

LAS VEGAS--Looks like Siri was just the beginning.

Okay, even Siri wasn't the beginning. Voice command isn't particularly new, but Siri, the marquee feature of Apple's iPhone 4S, has gotten the masses to recognize and appreciate its benefits. For the first time, voice command was a feature people talked about and coveted.

At CES, better implementations of voice command were popping up on a range of devices, and big-name companies got into the mix. Dieter Zetsche, head of Mercedes-Benz, said voice would play a major role in its cars, calling them a driver's "digital companion." Ultrabooks will eventually get speech recognition built in. Manufacturers from Samsung Electronics to Lenovo are integrating the feature into their high-end televisions.

Indeed, using speech to control a TV was a major trend of the show. Vlingo, which makes a virtual speech assistant for smartphones, announced its "Vlingo for Smarter TVs" software, which it plans to embed into televisions and set-top boxes. Nuance likewise announced its Dragon TV platform, which is believed to be powering the new voice-and-gesture-controlled Samsung TVs.

But this is just the beginning. The voice-recognition companies are looking to get the feature in every electronic device. They also want to get to the point where these virtual assistants follow you from device to device in a consistent manner, so your preferences move where you move.

"Everything you see in 'Star Trek'--it's going to be real," said Matt Revis, vice president of product management for Nuance's mobile division.

For that to happen, and for consumers to truly gravitate to speech, there still needs to be more education out in the market.

"For two and a half years, we've shipped various products that were equivalent or better than Siri, but a start-up is hard to make a market," said Vlingo CEO Dave Grannan. "People really don't know what they need, they need to be shown."

Nuance has agreed to acquire Vlingo in a deal that is expected to close later this year.

Natural dialogue the key
While voice commands have long been used by various industries, including in automated help lines and older cell phones, Siri brought attention to an advance in voice recognition: the ability to understand our natural language and respond in kind, so Siri really does seem like a person. That has turned it from a utilitarian tool into something you want to use.

"Speech initially was used because it was more convenient," Revis said. "But Siri is fun. You're engaged."

Over the past few years, there have been huge advancements in voice recognition. Vlingo, for instance, has done a lot of work on improving its artificial intelligence and natural-language understanding, Grannan said. You can say, "I want to get wasted," and have its program find you local bars, he added.

Vlingo's TV service allows voice commands to work like a dialogue. Make a request, and it will ask a question back to narrow down your choices. The back and forth continues until the user finds what they are looking for.
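The back-and-forth described above can be illustrated with a toy example. This is only a rough sketch of the general idea, not Vlingo's software: the catalog, the `Show` type, and the `narrow` function are all made up for illustration, and typed answers stand in for speech.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Show:
    title: str
    genre: str
    decade: str


# Hypothetical catalog, invented for this sketch.
CATALOG = [
    Show("Star Trek", "sci-fi", "1960s"),
    Show("Battlestar Galactica", "sci-fi", "2000s"),
    Show("Cheers", "comedy", "1980s"),
    Show("Seinfeld", "comedy", "1990s"),
]


def narrow(candidates, answers):
    """Ask about one attribute at a time until the choices are narrowed down.

    `answers` maps an attribute name to the user's reply and stands in
    for spoken input. Returns the remaining shows and the questions asked.
    """
    questions = []
    for attribute in ("genre", "decade"):
        if len(candidates) <= 1:
            break  # nothing left to narrow down
        options = sorted({getattr(show, attribute) for show in candidates})
        if len(options) < 2:
            continue  # this attribute cannot tell the candidates apart
        questions.append(f"Which {attribute}: {', '.join(options)}?")
        reply = answers.get(attribute)
        if reply is None:
            break  # the user has not answered yet; stop and wait
        candidates = [s for s in candidates if getattr(s, attribute) == reply]
    return candidates, questions


remaining, asked = narrow(CATALOG, {"genre": "comedy", "decade": "1990s"})
for question in asked:
    print(question)
print([show.title for show in remaining])
```

Each answer filters the list, and the next question is built only from the choices that remain, which is what makes the exchange feel like a conversation rather than a single command.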

The company is working with one smart-TV manufacturer, and one European cable provider, Grannan said, adding he expects products to come out by the end of 2012.

The various companies say that given the intense processing power required for AI, basing the services in the cloud is key. While the microphone on your phone, car, or TV picks up what you say, much of the heavy lifting occurs on the back end, on servers run by Nuance, Vlingo, or Apple.
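That division of labor, with a lightweight device in front and heavy processing on a server, can be sketched in a few lines. This is a simplified illustration under stated assumptions, not any company's actual API: `cloud_recognize` and `ThinClient` are hypothetical names, the "server" runs in the same process, and the audio is faked as text bytes.

```python
from typing import Callable, Dict


def cloud_recognize(audio: bytes) -> Dict[str, str]:
    """Stand-in for the server side, where the expensive work would happen.

    Assumption for this sketch: the "audio" is already UTF-8 text, so
    decoding it plays the role of speech recognition.
    """
    transcript = audio.decode("utf-8")
    # Toy intent matching, borrowing the example phrase from the article.
    intent = "find_bars" if "wasted" in transcript.lower() else "unknown"
    return {"transcript": transcript, "intent": intent}


class ThinClient:
    """A phone, car, or TV that only captures audio and forwards it."""

    def __init__(self, device: str, send: Callable[[bytes], Dict[str, str]]):
        self.device = device
        self.send = send  # in a real product this would be a network call

    def handle_utterance(self, audio: bytes) -> str:
        result = self.send(audio)  # the heavy lifting happens elsewhere
        return f"[{self.device}] {result['intent']}"


for device in ("phone", "tv"):
    client = ThinClient(device, cloud_recognize)
    print(client.handle_utterance(b"I want to get wasted"))
```

Because every device hands off to the same back end, adding voice to a new gadget mostly means adding a microphone and a connection.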

Beyond just a remote control
That cloud will enable these companies to put voice recognition on just about any connected device.

Revis said he envisions every electronic device being able to run these features, and it appears we're pretty close to that. Samsung, for instance, showed off several connected appliances, including refrigerators and washers, that could easily integrate a Siri-like capability down the line. Likewise, it has opened up its smart TVs to developers in the hope that they will take better advantage of the gesture and voice controls.

"There are a lot of user scenarios you can dream up," said Joe Stinziano, senior VP of home entertainment marketing for Samsung. "We're opening it up to a developer community that has proven it can be more creative than the original manufacturer."

One such scenario involves asking the refrigerator about milk and having the appliance detect via barcode scanner whether there is any left, or set a reminder to buy more milk later on, Grannan said.

Large tech companies such as Sony, Samsung, or Apple are at an advantage because they make so many of the products that consumers use. The large Asian conglomerates, for example, make smartphones, PCs, appliances, and even home heating and cooling systems, all of which could integrate voice.

Nuance, meanwhile, is hoping to get voice commands in more apps. The company has released a software development kit to allow other programmers to integrate its service into their own apps. Revis noted that Amazon Price Checker and Merriam-Webster Dictionary apps used its technology.

As for the further proliferation of voice commands in more products? We're probably still a little way away from barking orders at our microwaves.

"I think we'll see a couple of iterations and won't see mainstream until the end of 2013," Grannan said.