Google finds perks in its Wikipedia translations
It's nice for Wikipedia fans that Google helps fund and foster translation work. It's also nice for Google's own translation technology.
Google's mission is to organize the world's information and make it universally accessible, but not necessarily to create it outright. This makes Wikipedia a natural partner.
It's therefore no surprise that the search colossus helps out the cooperatively written project.
Specifically, Google is helping Wikipedia with translation, so subject matter documented in one language needn't be created from scratch in another. Google described some of its translation work in a presentation at the Wikimania conference in Poland over the weekend.
"In the last 16 months, Google has been working with the Wikimedia Foundation, students, professors, Google volunteers, paid translators, and members of the Wikipedia community to increase Wikipedia content in Arabic, Indic languages, and Swahili," Google said at the conference. In a blog post on Wednesday, Google said it began the work with Hindi, which, despite having millions of speakers, had only 21,000 Wikipedia articles in 2008, compared with 2.5 million in English.
It's a laudable goal, given how often Wikipedia entries show up in Google search results. But there's an interesting, financially helpful side effect of the work, too: it's perfectly suited to improving Google's own translation tools.
That's because Google's translation technology begins with content in which the same text appears in multiple languages. The more examples of human translation it has, the better it works and the less often it has to fall back on machine translation. Wikipedia provides a diverse and growing body of subject matter that seems ideal for the task.
Google helps others help itself with the Google Translator Toolkit, which lets people translate documents collaboratively, with Google's technology offering a head start.
The Translator Toolkit can specifically import Wikipedia pages, and doing so contributes the resulting translations to Google's translation technology. "Translated segments for Wikipedia articles are stored in our global, shared translation memory. You cannot change this setting for Wikipedia translations," the tool notes.
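For readers curious how a shared translation memory might behave, here is a minimal sketch in Python. It is only an illustration of the general idea described above, in which human-translated segments are reused when available and machine translation fills the gaps. It is not Google's implementation. The names TranslationMemory, translate, and machine_translate are hypothetical, and exact-match lookup is an assumption made for brevity; real systems typically use fuzzier matching and statistical models.

```python
from typing import Callable, Dict, Optional


class TranslationMemory:
    """Toy store of human-translated segments (hypothetical, not Google's API)."""

    def __init__(self) -> None:
        self._segments: Dict[str, str] = {}

    @staticmethod
    def _normalize(segment: str) -> str:
        # Collapse whitespace and ignore case so trivially different
        # copies of the same segment still match.
        return " ".join(segment.split()).lower()

    def store(self, source: str, target: str) -> None:
        # Record a human translation for later reuse.
        self._segments[self._normalize(source)] = target

    def lookup(self, source: str) -> Optional[str]:
        # Return the stored human translation, or None if there is none.
        return self._segments.get(self._normalize(source))


def translate(
    segment: str,
    memory: TranslationMemory,
    machine_translate: Callable[[str], str],
) -> str:
    """Prefer a stored human translation; otherwise fall back to the machine."""
    human = memory.lookup(segment)
    if human is not None:
        return human
    return machine_translate(segment)
```

In this sketch, every segment a volunteer translates and stores is one fewer segment that needs the fallback, which is the sense in which more human translations make the overall system better.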
"There are many Internet users who have used our tools to translate more than 100 million words of Wikipedia content into various languages worldwide," Google product manager Michael Galvez said in the blog post.
Having better translation directly helps Google by lowering language barriers across its services: not just in supplying search results, but also in indexing Web sites, captioning YouTube videos, translating e-mail, and translating Web pages viewed in Chrome.