Google scored the highest in Arabic-to-English and Chinese-to-English translation tests conducted by the National Institute of Standards and Technology. Each test consisted of translating 100 articles from Agence France Presse and the Xinhua News Agency dated from Dec. 1, 2004, to Jan. 24, 2005. The results were posted earlier this month.
Although computerized translations have historically read like broken English, increased processing power and larger data samples have allowed researchers to improve the accuracy of these systems.
Start-up Language Weaver, for instance, has created software that can translate Al Jazeera broadcasts. Researchers at Carnegie Mellon's Language Technologies Institute and other universities are also tackling the problem. (Neither Language Weaver nor Carnegie Mellon took part in the recent test.)
Google's machine translation wasn't perfect, but it was well ahead of the competition. On a scale from zero to one, the company's software scored 0.5137 on the Arabic tests and 0.3531 on the Chinese tests. The University of Southern California's Information Sciences Institute came in second with a 0.4657 on Arabic tests and 0.3073 on Chinese. IBM scored 0.4646 on Arabic and 0.2571 on Chinese.
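The article doesn't name the metric behind these 0-to-1 scores, but NIST's machine translation evaluations of that era were scored with BLEU-style n-gram matching against reference translations. As a rough illustration only (a toy sketch, not the official NIST scoring tool, and the sentences here are hypothetical), modified n-gram precision with a brevity penalty looks like this:

```python
from collections import Counter
import math

def bleu(candidate, reference, max_n=4):
    """BLEU-style score between 0 and 1: geometric mean of n-gram
    precisions against a reference, times a brevity penalty.
    Toy single-reference version for illustration."""
    cand, ref = candidate.split(), reference.split()
    log_prec_sum = 0.0
    for n in range(1, max_n + 1):
        cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
        ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
        # Clipped counts: a candidate n-gram scores only as often as it
        # appears in the reference.
        overlap = sum(min(c, ref_ngrams[g]) for g, c in cand_ngrams.items())
        total = max(sum(cand_ngrams.values()), 1)
        # Floor zero overlaps so the geometric mean stays defined.
        log_prec_sum += math.log(max(overlap, 1e-9) / total)
    geo_mean = math.exp(log_prec_sum / max_n)
    # Penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * geo_mean

print(bleu("the cat sat on the mat", "the cat sat on the mat"))  # identical → 1.0
```

A perfect match scores 1.0; real systems translating news articles land well below that, which is why scores in the 0.3-0.5 range could lead the field.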
Other participants included the University of Edinburgh and the Harbin Institute of Technology. Most of the software tested came from research labs, according to the National Institute of Standards and Technology.
Google likely benefited from its huge store of source material. Generally speaking, translation software improves as more data gets fed to it. Through its search operations, Google has amassed billions of translated Web pages.
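Statistical translation systems of this era learned from parallel text: sentences paired with their translations. The more aligned text the system sees, the sharper its estimates of which words translate to which. A toy sketch of the idea (the word pairs and corpus here are hypothetical, far simpler than what Google or any real system uses):

```python
from collections import Counter, defaultdict

# Tiny hypothetical French-English parallel corpus.
pairs = [
    ("la maison", "the house"),
    ("la maison bleue", "the blue house"),
    ("la fleur", "the flower"),
]

# Count how often each foreign word co-occurs with each English word
# across aligned sentence pairs.
cooc = defaultdict(Counter)
for fr, en in pairs:
    for f in fr.split():
        for e in en.split():
            cooc[f][e] += 1

def t(e, f):
    """Crude translation probability t(e|f) by relative co-occurrence."""
    total = sum(cooc[f].values())
    return cooc[f][e] / total if total else 0.0

# "maison" co-occurs with "the" (2x), "house" (2x), "blue" (1x),
# so t("house"|"maison") = 2/5 = 0.4.
print(t("house", "maison"))
```

With only three sentence pairs, "the" and "house" look equally likely as translations of "maison"; feed in billions of translated Web pages and the spurious associations wash out, which is the advantage the article attributes to Google.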
Like Yahoo and others, Google is looking toward the developing world for new customers. The company includes some machine translation tools on its site, as well as several international editions.
Google declined to comment. (Google representatives have instituted a policy of not talking with CNET News.com reporters until July 2006, citing privacy concerns.)