Opera's browser finds its voice

Norway-based Opera is adding voice control to its eponymous browser, enabling users to browse the Web and fill in forms by talking to their PC.

Paul Festa Staff Writer, CNET News.com
Paul Festa covers browser development and Web standards.
Opera is adding voice control to its browser, enabling users to browse the Web and fill in voice-enabled Web forms by talking to their PC. They can also have the contents of Web sites read back to them.

The next version of the browser is due in "a couple of months," according to Opera Chief Executive Jon von Tetzchner. That timing means it is likely to debut at about the same time as Microsoft's Speech Server software, which is designed to improve the way servers handle spoken commands.

Opera's browser, which will incorporate IBM's Embedded ViaVoice speech technology, will be available as a free download with advertisements, as well as in a for-sale version without advertising, von Tetzchner said.

Aside from the obvious accessibility benefits, von Tetzchner said, there are applications for in-car computing: "In a car, you would like a combination of screen and voice, but you don't want to be watching a screen while driving. Being able to perform tasks by voice and get voice feedback will be very useful."

Opera also rolled out the time-honored slideshow as an example of a possible application for the voice function. By combining Opera Show with voice, the company said, users will be able to give presentations and tell Opera by voice command to turn to the next slide without having to approach the computer and press the "page down" key.

"This new offering can allow us to interact with the content on the Web in a more natural way, combining speech with other forms of input and output--first on PCs, and in the near future, in devices such as cell phones and PDAs," said Igor Jablokov, director of embedded speech at IBM and chairman of the VoiceXML Forum, an industry organization formed to create and promote the Voice Extensible Markup Language.

Developers can also start to build multimodal content with the open standards-based X+V markup language, Jablokov said, "using development skills a large population of programmers already have today."

"One part of this is about being able to control a browser by voice, but it is also about using pages with XHTML+Voice (X+V) coded into them," von Tetzchner said. X+V is a voice specification that Opera, IBM and Motorola submitted to Web standards body the World Wide Web Consortium in 2001.

X+V combines two markup languages, both based on XML. XHTML is a version of the Hypertext Markup Language expressed in XML, and VoiceXML is an XML framework for developers of voice applications.
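To make the combination concrete, here is a minimal sketch of what a page mixing the two vocabularies might look like, along with a short Python script that counts the elements belonging to each. The sample page, the `say_hello` identifier and the `count_by_namespace` helper are illustrative assumptions, not taken from Opera's or IBM's materials; only the three namespace URIs are the published W3C ones.

```python
import xml.etree.ElementTree as ET

# Published W3C namespace URIs for the vocabularies involved.
XHTML_NS = "http://www.w3.org/1999/xhtml"
VXML_NS = "http://www.w3.org/2001/vxml"
EV_NS = "http://www.w3.org/2001/xml-events"

# Hypothetical X+V-style page: a VoiceXML form sits in the XHTML head,
# and an XML Events attribute on the body points at it.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:vxml="http://www.w3.org/2001/vxml"
      xmlns:ev="http://www.w3.org/2001/xml-events">
  <head>
    <title>Hello</title>
    <vxml:form id="say_hello">
      <vxml:block>Hello from the voice layer.</vxml:block>
    </vxml:form>
  </head>
  <body ev:event="load" ev:handler="#say_hello">
    <p>Hello from the visual layer.</p>
  </body>
</html>"""


def count_by_namespace(xml_text):
    """Return a dict mapping each namespace URI to its element count."""
    root = ET.fromstring(xml_text)
    counts = {}
    for element in root.iter():
        # ElementTree stores qualified tags as "{namespace}localname".
        if element.tag.startswith("{"):
            namespace = element.tag[1:].split("}")[0]
        else:
            namespace = ""
        counts[namespace] = counts.get(namespace, 0) + 1
    return counts


counts = count_by_namespace(SAMPLE)
body = ET.fromstring(SAMPLE).find(f"{{{XHTML_NS}}}body")
handler = body.get(f"{{{EV_NS}}}handler")

print("XHTML elements:", counts[XHTML_NS])
print("VoiceXML elements:", counts[VXML_NS])
print("Voice handler on body:", handler)
```

In this sketch the visual markup and the voice dialog live in one XML document and are told apart only by namespace, which is the sense in which X+V "combines" the two languages.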

X+V competes with a Microsoft-backed technology, also based on XML, known as Speech Application Language Tags.

Opera, which has a confrontational history with Microsoft, said X+V is the wave of the future.

"X+V provides a better relationship to XHTML and a more standardized transition to be able to integrate voice capabilities," said Christen Krogh, Opera's vice president of engineering.

Opera isn't the first to work on adding IBM's X+V technology. Access Systems, the Tokyo-based maker of the NetFront browser for mobile devices, announced a similar effort in September to add voice capabilities.

Opera will make the IBM integrated-voice browser available in English for Windows, and the browser will initially be targeted at enterprise customers and developers.

Matt Loney of ZDNet UK reported from London. CNET News.com's Paul Festa reported from San Francisco.