Voice apps sounding better to big firms, says IBM

Tech improvements and potential cost savings are encouraging corporations to embed speech activation in their customer help desks and other services, according to Big Blue.

SAN FRANCISCO--The moment for speech has finally arrived, according to IBM, thanks to technological improvements and the capacity of voice applications to cut costs.

Prudential Securities, Honda and other large IBM customers have begun to install software modules that let their customers navigate help desks, personal account Web pages and other services through voice commands. More companies will follow their lead, because the benefits are becoming clearer, Gary Cohen, the general manager of IBM's pervasive computing group, said on Friday.

"Speech is at a point where the technology is beginning to tip," he said. "We are shifting from looking at speech as a technology to something embedded" in enterprise applications, Cohen told conference attendees at SpeechTek 2004, which takes place here this week.

The benefits of speech applications for companies largely revolve around reduced costs and customer satisfaction. Voice navigation of automated customer systems is generally faster than keypad navigation, which reduces the time spent on a 1-800 line, for example, according to Cohen. Over millions of calls, those saved seconds add up.

Customers also appear to be more pleased with voice navigation systems. Cell phone giant Nextel Communications has seen a 13 percent drop in customer churn since it switched to a voice-navigated help desk, according to Cohen. Investment management firm T. Rowe Price Group has seen a 10 percent increase in resolved calls to its automated help desk since switching from push-button navigation, he added.

The growing acceptance of speech applications stems from a major shift in the software industry that began a few years ago. For years, IBM and others promoted speech-recognition and -activation software as an alternative to keyboard-based tools. They discovered, however, that few people wanted to dictate letters and memos like a modern-day Demosthenes.

Coming up with speech-to-text software that could accurately capture the meaning of human speech also proved difficult. As a result, development work shifted to incorporating speech into applications in which keyboards aren't convenient or in which voice commands can be restricted to short spoken phrases. Microsoft and Nuance Communications are pursuing similar efforts.

The results appear to be paying off. Dennis Marine, a vice president of information services at Prudential who took part in Cohen's presentation, noted that only about 30 percent of the company's customers get full resolution of their questions when they use its push-button phone service. In other words, 70 percent hang up.

By contrast, more than 70 percent of customers get full resolution on Prudential's voice-activated navigation system, which has been installed to manage 401(k) accounts. The 401(k) line has always had a much higher call completion rate, but switching to voice is improving that already high number. The company is now expanding voice navigation to its individual insurance help desks and to its internal employee portals.

Tony Scott, the chief technology officer at General Motors' information systems and services group, said in a separate interview earlier this week that the carmaker is seeing increasing customer acceptance of its OnStar in-car help system. GM, which recently signed a deal to switch to IBM's speech technology, expects to expand its OnStar lineup.

To facilitate this growth, IBM will continue to add modules to its WebSphere lineup and to hone its research. IBM's current research projects include free-form dialog support and expressive output, which allow a computer to pick up emotional or contextual cues in a person's speech, as well as to deliver messages with more urgency and in ordinary language.

IBM is also researching voice biometrics, technology that lets people authenticate their identity through voiceprints.