What IBM's Watson tells us about the state of AI

If you're an AI researcher, the "Jeopardy"-playing Watson supercomputer--the first episode airs tonight--is either pretty impressive or something of a parlor trick.

Computers that reliably understand human communications have been a staple of fiction for decades. The Enterprise's computer in the 1960s-vintage "Star Trek" series is as good an example as any. And the truth is, that particular science-fictional ability probably would not have seemed all that remarkable to the typical person of the time.

Access billions of pages of text, pictures, and video from a gadget I can fit in my pocket? Play a game with immersive graphics on a huge, high-resolution screen hanging on the wall? A computer engineer would also be impressed that those inexpensive consumer devices have more computing power than all the computers in the world combined did back then. But understanding speech? That's something a toddler can do.

IBM Watson's challenge on "Jeopardy" will be broadcast starting tonight. (Credit: IBM)

But understanding speech has turned out to be really difficult. In fact, just converting speech to text has been a huge challenge. Indeed, when IBM Watson takes on past "Jeopardy" champions in a contest televised beginning tonight, the questions will be fed to it as text, rather than speech. But answering the often convoluted questions used on "Jeopardy" is hard enough even without processing the spoken word.

Although this contest takes place in the artificial setting of a game show, it gives us a glimpse into what is and is not possible with artificial intelligence (AI) today, and perhaps where AI is going.

AI research is generally considered to have launched in 1956 at the Dartmouth Summer Research Conference on Artificial Intelligence. The hope of many researchers at that time was that they would be able to create a so-called "strong AI" over the next few decades--which is to say an AI that could reason, learn, plan, and communicate. Research in this vein has produced very limited results. One of the big problems has been the almost equally limited progress in understanding how humans think; the failure of strong AI may well be related to the lack of progress in significant areas of cognitive psychology.

Some of the AI pioneers still have a more optimistic view. MIT's Marvin Minsky places the blame more on a shift away from fundamental research. As he puts it, "The great laboratories somehow disappeared, economies became tighter, and companies had to make a profit--they couldn't start projects that would take 10 years to pay off."

So Watson is in no real sense thinking, and the use of the term "understanding" in the context of Watson should be taken as anthropomorphism rather than as a literal description.

Is Watson just about brute force then? One might think so. Its hardware specs are impressive:

IBM Watson comprises 90 IBM POWER 750 servers, 16 terabytes of memory, and 4 terabytes of clustered storage, enclosed in ten racks that include the servers, networking, shared disk system, and cluster controllers. Each of the 90 POWER 750 servers has four POWER7 processors, each with eight cores, giving IBM Watson a total of 2,880 POWER7 cores.

To put this in perspective, by my estimate, Watson would have been the fastest supercomputer in the world on the TOP500 list just five years ago. And, although the disk and memory specs aren't nearly so impressive, remember that we're just talking about text-based data here. In fact, it's loaded with millions of documents--making the fact that it, like the human contestants, isn't hooked up to the Internet something of a red herring.

Chris Anderson, the editor in chief of Wired, argues that data often replaces underlying theory. He goes on to quote Peter Norvig, Google's research director: "All models are wrong, and increasingly you can succeed without them."

But thinking of Watson as just a big, fast computer that points to Wikipedia or the Oxford English Dictionary and pops out the right answer understates the complexity of the natural language processing that has to go on. If "Jeopardy" consisted solely of grade-school-style questions--excuse me, answers--like "the 42nd president of the United States," this would in fact be a relatively simple exercise. But many "Jeopardy" clues involve wordplay, riddles, and other barriers to literal lookup of answers.
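To see why literal lookup falls short, consider a toy sketch (this is an illustration, not IBM's actual approach): an exact-match table answers a factual clue easily, but a punning clue about the same person finds nothing.

```python
# Toy illustration (not IBM's approach): literal lookup works only when
# the clue exactly matches a stored fact.
facts = {
    "the 42nd president of the united states": "Who is Bill Clinton?",
}

def literal_lookup(clue):
    """Return an answer only on an exact (case-insensitive) match."""
    return facts.get(clue.strip().lower())

# A straightforward factual clue is answered:
print(literal_lookup("The 42nd president of the United States"))
# → Who is Bill Clinton?

# A punning clue about the same person finds nothing at all:
print(literal_lookup("He played the sax on Arsenio before the White House"))
# → None
```

Handling the second kind of clue requires parsing the wordplay, generating candidate answers, and weighing evidence for each, which is where the hard natural language processing comes in.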

Watson is part of IBM's DeepQA project. The QA stands for question answering. As IBM researchers put it:

the open-domain QA problem is attractive as it is one of the most challenging in the realm of computer science and artificial intelligence, requiring a synthesis of information retrieval, natural language processing, knowledge representation and reasoning, machine learning, and computer-human interfaces.

In association with Carnegie Mellon University, IBM created the Open Advancement of Question Answering (OAQA) initiative "to provide a foundation for effective collaboration among researchers to accelerate the science of automatic question answering." Among other things, this initiative is intended to enable adapting Watson's software to new data domains and problem types.

Although Watson is certainly a powerful computer loaded with lots of data, as described in this PBS video, the software is very much the key ingredient: many new algorithms and approaches were needed to make Watson competitive with strong human players. For example, algorithms to learn from examples, to understand how context affects the significance of names and places, and to correlate multiple facts in arriving at a particular answer.

Strong AI proponents may well view Watson as something of a parlor trick, in that it doesn't really try to reason as a human does. But, that said, there's much more--dare we say intelligence--involved here than in playing chess, a well-bounded and formalized problem. And given the longtime difficulty of understanding real intelligence, this is the AI path that seems to hold the most promise for now.