IBM unveils toolkit for talking computers
Big Blue's new software toolkit helps developers build speech-recognition and other "multimodal" applications for Linux computers.
The Multimodal Toolkit for WebSphere Studio helps developers create applications that can use more than one mode of communication. An example of a multimodal application is software on a personal organizer that can understand voice commands ("I need Mr. X's e-mail") and respond with a text message.
Although speech-recognition software has yet to become a mass-market phenomenon, interest in it is being spurred by technological breakthroughs and the growth of computing devices that don't have room for full-size keyboards.
IBM, for instance, has begun to create sophisticated help-desk systems for companies such as T. Rowe Price that let telephone callers retrieve information from databases by asking ordinary questions. Microsoft is readying Speech Server, a server application that will perform similar functions for small and medium-size businesses.
Hewlett-Packard announced Wednesday it was buying PipeBeach, which makes similar software.
IBM's toolkit is based on the XHTML+Voice specification, a combination of XHTML and VoiceXML that is also known as X+V. The kit includes a multimodal editor, reusable blocks of X+V code and a simulator based on Opera 7 for Windows.
The WebSphere Everyplace Multimodal Environment for Embedix, which also comes with the kit, eases the process of developing a user interface for set-top boxes or handhelds.