Microsoft hopes to make such mangled translations--this one's from a Japanese car rental brochure--a thing of the past with a new software tool that understands the rules and patterns of English.
Dubbed the English Writing Wizard (EWW), the tool is available now in the Chinese enterprise version of Microsoft's productivity package, a pan-Chinese edition that supports several language groups. EWW is likely to be adapted for other versions of Office in the future.
Unlike services such as Babelfish, which produce clumsy machine translations of foreign text, EWW is not intended to be a substitute for a working knowledge of English, said Ming Zhou, a researcher in the Beijing office of Microsoft Research and head of the EWW project.
Instead, EWW is meant to assist people who know a fair number of English words but need help mastering the often arcane and seemingly contradictory patterns that govern how they're put together.
"It's a tool to help users select the correct words according to the context," Zhou said. "If you say 'book,' you need to know how to use it. Is it being used as a noun or a verb? Non-English users find it very difficult to use words like that correctly."
EWW works by analyzing English words in relation to each other. The software looks at the words surrounding "book," for example, to gather clues on whether the word is meant as a noun or verb. It then suggests alternative phrasing based on the analysis.
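The idea of reading a word's neighbors for clues can be illustrated with a toy sketch. This is not Microsoft's code, and the article does not say how EWW is implemented; the labelled sample, the function names and the one-word window below are all assumptions made for illustration. The sketch counts how often "book" was a noun or a verb after a given preceding word, then picks the more frequent reading.

```python
from collections import Counter, defaultdict

# Hypothetical hand-labelled sample: (word before "book", reading of "book").
# A real system would learn from far more text than this.
TAGGED_EXAMPLES = [
    ("a", "noun"), ("the", "noun"), ("this", "noun"), ("the", "noun"),
    ("to", "verb"), ("will", "verb"), ("please", "verb"), ("to", "verb"),
]


def train(examples):
    """Count how often each reading follows each preceding word."""
    counts = defaultdict(Counter)
    for previous_word, tag in examples:
        counts[previous_word][tag] += 1
    return counts


def guess_tag(sentence, target, counts, default="noun"):
    """Guess the reading of `target` from the word just before it.

    Returns None if `target` is not in the sentence, and `default`
    if the preceding word was never seen in the sample.
    """
    words = sentence.lower().split()
    if target not in words:
        return None
    position = words.index(target)
    previous_word = words[position - 1] if position > 0 else "<start>"
    seen = counts.get(previous_word)
    if not seen:
        return default
    return seen.most_common(1)[0][0]


counts = train(TAGGED_EXAMPLES)
print(guess_tag("I want to book a room", "book", counts))
print(guess_tag("She read the book", "book", counts))
```

With this sample, "to book" comes out as a verb and "the book" as a noun. A single preceding word is a very thin clue; the point is only that the decision comes from counted examples rather than from a grammar rule written by hand.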
Zhou said one advantage of the software is that it helps users achieve a more natural and pleasing writing style by suggesting new English usages.
"Chinese users tend to use the same expression over and over once they've figured it out," he said. "If they know to say 'overcome difficulties,' for instance, they use it again and again, even though there are other expressions that could express that idea better. Our system will provide all expressions with similar meanings...and we provide some sample sentences."
As comprehensive as it is, EWW doesn't actually know any English rules, per se. Instead, the software recognizes patterns and probabilities for the way words go together, based on exhaustive analysis of various text sources, including 10 years' worth of the Wall Street Journal.
"We use a data-driven approach," Zhou said. "All the knowledge is learned automatically from the data. We took lots of articles from the Wall Street Journal, the New York Times and other sources, and the software learned the collocation patterns automatically."
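A rough sense of what learning word pairings from text might look like, in a deliberately tiny sketch: the four-sentence corpus, the function names and the use of simple adjacent-word counts are all assumptions for illustration, not a description of EWW's actual method. The code counts which words appear directly before "difficulties" and ranks them by frequency, the kind of list that could back a suggestion of alternatives to an overused phrase.

```python
from collections import Counter

# Hypothetical stand-in for the newspaper text mentioned in the article.
CORPUS = [
    "the company overcame difficulties last year",
    "investors face difficulties in the market",
    "the firm will face difficulties abroad",
    "managers resolve difficulties quickly",
]


def adjacent_pair_counts(sentences):
    """Count every pair of neighboring words across the sentences."""
    counts = Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        counts.update(zip(words, words[1:]))
    return counts


def words_before(target, pair_counts):
    """List words seen directly before `target`, most frequent first."""
    found = Counter()
    for (left, right), n in pair_counts.items():
        if right == target:
            found[left] += n
    return [word for word, _ in found.most_common()]


def pair_probability(left, target, pair_counts):
    """Share of `target` occurrences that were preceded by `left`."""
    total = sum(n for (_, right), n in pair_counts.items() if right == target)
    return pair_counts[(left, target)] / total if total else 0.0


pair_counts = adjacent_pair_counts(CORPUS)
print(words_before("difficulties", pair_counts))
print(pair_probability("face", "difficulties", pair_counts))
```

Nothing in the sketch encodes a rule about English. Swap in text from another language and the same counting applies, which is in keeping with the claim that a data-driven approach is comparatively easy to extend.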
That approach should make it relatively easy to expand EWW to other languages, Zhou said, with Japanese the next likely candidate.