
Microsoft's big thinker

Jack Breese of Microsoft Research: Figuring out where to take the company's technology in the emerging .Net era.

Charles Cooper Former Executive Editor / News
Charles Cooper was an executive editor at CNET News. He has covered technology and business for more than 25 years, working at CBSNews.com, the Associated Press, Computer & Software News, Computer Shopper, PC Week, and ZDNet.
It's not in the official job description, but Jack Breese gets paid to play around with "pretty cool stuff" for a living.

So do some 600 like-minded folk he helps direct in his official role as assistant director of Microsoft Research. With research groups in Silicon Valley, the United Kingdom, China and Redmond, Wash., the division functions as a technological brain trust, filling the same role for Microsoft that Bell Labs did for AT&T.

In thinking about ways to create a more human-centric approach to personal computing--perhaps the Holy Grail of the computer industry--Microsoft Research is simultaneously proceeding with work on several fronts, including natural language processing, multimedia signal processing and data mining. But the clock turns slowly, and although some of the group's research has been incorporated into commercial products, a lot of it winds up on the cutting-room floor.

That's by design, according to Steve Ballmer. In a speech last month to the Association for Computing Machinery, Microsoft's chief executive joked that if ideas are getting turned into products too quickly, then Microsoft Research isn't doing its job.

Still, this is the private sector, and Breese is responsible for shepherding ideas to completion. In a recent interview, he talked about what's hot on Microsoft's research agenda as the company marks the beginning of its transition into the .Net era.

How close are we to liberation day, when people are no longer slaves of the box, when computing becomes pervasive and human-centric?
Certainly that's been one of our visions. We'd like to be able to interact with a computer just like you interact with another human being. And we've used that expression to combine various aspects of our research program: speech recognition, computer vision, natural language processing.

Communicating as you would with another person?
You should be able to talk to your computer, and it should understand what you're talking about. You should be able to gesture to it, and it should be able to display back to you and talk to you. Our Adaptive Systems Interactions Group is looking at computers knowing much more about you, about your preferences and your task and your expertise level, so it can provide help in a more graceful manner.

What you're talking about reminds me of what Microsoft attempted to do with the release of Microsoft Bob a few years ago. You see some of that approach in Office Assistant.
Yeah, there's this same thread there in terms of social user interfaces, and Bob was an attempt in that space. Just because it didn't work out doesn't mean it was a bad idea in general. People treat computers like people. That's how they think of them; that's how we're wired up. So ultimately, systems elicit that kind of response from people, and you can take advantage of it: Just make the systems much more sensitive to these human qualities and do something intelligent with them.

I've seen product demos that use a person's retinal scans to operate a computer system. Some companies are developing technologies that make use of gestures and voice. What's the holdup preventing a more general rollout in everyday use? Is the problem on the hardware or the software side? Or is it a matter of not having enough processing speed or bandwidth?
I think it's mostly software. But there are different aspects of software. So let's take speech recognition, for example. In various systems, if you have a clean environment and a good microphone, speech recognition actually will work pretty well. You can talk to the computer, and those words will come into the system. Now, is that speech recognition working or not?

That's a matter of definition.
Exactly. So the interesting thing is, can you talk to the computer and have it do something intelligent with it? We're making a lot of progress in all these areas, like in pattern recognition where, OK, the system can recognize a gesture. But then the hard part--and we're still working on this--is how you put it all together so that the system really has knowledge about the world.

How close are computer scientists?
That's where it's very hard to predict when we're going to be able to succeed.

Yet collectively, Microsoft, IBM, Xerox PARC--and the rest--you're all throwing billions of dollars into R&D every year. But as a user...
It's just not getting there.

Right. It's just not happening.
Yeah. Well, there is just so much complexity. I just think those are really hard problems, and it's not clear when we're going to crack it, because these problems have been referred to in the community as being AI-complete--you get to a certain point, and in order for it to really work, you have to know everything about people and society and interaction.

Are you attacking the problem at Microsoft Research from a particular angle that's doing especially well?
Well, I think we're doing a lot of the same things you see at other research labs. We actually have a project that specifically addresses this kind of core intelligence. We haven't put a huge amount of resources into it, but it's within our Natural Language Processing group. You may be familiar with that group because we've done things like the grammar checker. But again, this is not getting us to a real fundamental breakthrough.

Which would be?
The fundamental breakthrough would be to really build computer understanding of the world. And along that path, we have a system that can take a dictionary and read it and begin to understand what that dictionary says so that it will understand the different senses of the word "bird"--that it could be a satellite or it could be a feathered animal...

Or a basketball player from Boston.
Right. Those are different senses of the term. It's been implemented and we're using it for machine translation right now. That's how we're driving it right now.

Five years from now, what's it going to look like?
That's always a tough one. We're usually wrong about the answers to those questions.

Come on. Give me something to write about.
What I have found in my career is that the intelligence features are not as far along as we would have expected them to be. Probably 10 years ago, we would have said, "Well, in five years, speech recognition will be ubiquitous." It hasn't happened. My guess is that the big thing we'll see in five years is more ubiquitous computing. It will be around us more: There'll be more sensors, more wireless devices and more consumer electronics that will have PC capability.

Wearable implements, things like that.
Wearables and better sensors. I don't know if we'll really crack the intelligence thing in five years, but I think when you have that kind of infrastructure and things talking to one another, there will be new behaviors, new applications that we can't really imagine. But they'll be cool things.

What's at the top of your agenda as far as bringing some of these new approaches and technologies to the Microsoft product lineup?
We like to do work on fundamental technologies that we can embed into a platform product or the operating system and then make available to developers. An example of this recently was putting data-mining technology into SQL Server so that there would be world-class clustering and prediction technologies within SQL Server. Then developers, database people in enterprises, small shops and (independent software vendors) take advantage of that and apply it in different ways.

Our highest goal is to provide that basic platform capability.

At the operating system level?
It could be in providing multimedia capabilities for operating systems. That's what our Signal Processing group is really focused on. Imagine a global translation service as a .Net type of service where you can basically put text into any language and get it out in any other language.

But can't you do that now, and isn't it a crapshoot whether or not it'll translate correctly?
Right, but we want it to really work. I'm fairly confident we'll be able to do better than existing technologies.

What about Windows in the future? How do you see the evolution of the OS?
I'm thinking about multimedia, where audio, video and images are really first-class objects in the system that you can manipulate as easily as you can text. You can cut, paste, edit, find, search and manage those media types as easily as or more easily than you can your documents now. I think that's a major push. The system will become more of a communication device, so you are talking to it. Maybe it's not recognizing your speech, but it's acting as a telephone or in videoconferencing. And then there'll be special-purpose devices, consumer devices, which really have a PC behind them, kind of like WebTV or UltimateTV, that thread of things.

If you're going to achieve those goals, will Microsoft need to substantively increase the code base inside of Windows?
That's the product group's call, whether they want to put more code in. What we try to do is create options for them. We will say, "OK, here--we have a compression format. It's faster, it does better compression, has a smaller memory footprint. You guys decide: Do you want to put it in, in addition to what you have already?" That's a business decision on whether they adopt that format. Our business is to generate these alternatives and to line up our research agenda so that we can provide useful things in the future for these guys.

The rap against Microsoft products and technology has always been that it takes you three times to get it right. Windows, Windows NT, Internet Explorer. Why do you think that perception has existed?
Well, some of these things are just hard. I mean, a lot of it has to do with integration. A lot of Microsoft's value is in integrating things together. So, for example, getting this stuff into SQL Server. It's easy to be a little small company and say, "OK, I've got the best data-mining program in the world. I've written a paper on it, and here it is, and isn't this cool?" But doing the hard work to make sure that it works with the SQL language that's native to the thing, that takes a while to get the stuff to work and perform. And there's just a different set of constraints that we're working with.

From the point of view of a technologist, is it harder or more challenging to be the size that you are?
I think it's harder. A big company with a lot of dependencies is harder. But the payoff is huge. That's why people come to Microsoft Research. I had small companies. I was down here in the Valley, and we were doing expert systems and anytime stuff, and it's very hard to get any traction, to get credibility, to get anywhere.

You come to Microsoft, and there's a high bar. There was a famous quote we had: These guys came in from Windows, and they said, "Yeah, yeah, yeah. I know it's got tweaky math behind it, but is anyone going to care about this? Is this going to help anybody?" And if you can't show that to these guys, they're just going to say "Yeah," you know. But you get it in there, and it changes people's lives and has an impact--that makes things better. And that's exciting.