Microsoft's new translation tech speaks Chinese -- in your own voice

Not only that, but it's more accurate as well -- and it's only going to get better, says Microsoft's research chief, Rick Rashid.

Don Reisinger
CNET contributor Don Reisinger is a technology columnist who has covered everything from HDTVs to computers to Flowbee Haircut Systems. Besides his work with CNET, Don's work has been featured in a variety of other publications including PC World and a host of Ziff-Davis publications.
Microsoft Research's Rick Rashid. Ina Fried/CBS Interactive

Microsoft has a new speech translation technology that is more accurate than earlier systems and delivers the translation in your own voice.

In a blog post yesterday, Microsoft chief research officer Rick Rashid described a technology that takes a user's spoken English and translates it into Mandarin Chinese. The kicker is that the Chinese translation is played through speakers in the user's own voice.

Microsoft's technology is based on a speech-recognition technique called Deep Neural Networks (DNN). Unlike the widely used "hidden Markov modeling" technique, which builds statistical models from training data from many speakers, DNN is patterned after human brain behavior, which Microsoft says yields better speech recognizers.

That advancement has helped Microsoft reduce the word error rate of its speech recognition by over 30 percent compared with earlier methods, according to Rashid. Whereas older models get one word in every four or five wrong, DNN's error rate is one word in seven or eight.
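
The error figures above can be sanity-checked with a little arithmetic. The sketch below is only an illustration, not anything from Microsoft: the function names are made up for this example, and it assumes "one word in N" means an error rate of exactly 1/N.

```python
from fractions import Fraction


def error_rate(one_in_n):
    """Error rate for 'one wrong word in every n words' (assumed to be 1/n)."""
    return Fraction(1, one_in_n)


def relative_reduction(old_one_in, new_one_in):
    """Fraction of the old errors that disappear under the new error rate."""
    old = error_rate(old_one_in)
    new = error_rate(new_one_in)
    return (old - new) / old


# Try every pairing of the quoted figures: old = 1 in 4 or 5, new = 1 in 7 or 8.
for old_n, new_n in [(4, 7), (4, 8), (5, 7), (5, 8)]:
    r = relative_reduction(old_n, new_n)
    print(f"1 in {old_n} -> 1 in {new_n}: {float(r):.1%} fewer errors")
```

Depending on which ends of the quoted ranges are paired, the relative drop in errors runs from roughly 29 percent to 50 percent, which is broadly in line with the "over 30 percent" figure.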

"While still far from perfect, this is the most dramatic change in accuracy since the introduction of hidden Markov modeling in 1979, and as we add more data to the training we believe that we will get even better results," Rashid wrote in the blog post.

So, how did Microsoft use Rashid's voice as the translated audio? According to Rashid, the technology requires a few hours of recorded speech by a native Chinese speaker. Microsoft then combines that with characteristics of Rashid's own voice, taken from about an hour of his recorded speech.

Microsoft, of course, is not the only company working on voice translation technology. Google Translate, for example, can detect a language and translate it into another.

Still, Microsoft believes that DNN is the future. And although he expects errors to continue, Rashid believes that with more improvements, the technology could dramatically reduce language barriers worldwide.

"The results are still not perfect, and there is still much work to be done, but the technology is very promising, and we hope that in a few years we will have systems that can completely break down language barriers," Rashid wrote.

Microsoft showed the technology in action at a Microsoft Research event in China last month.

(Via ZDNet)