The Unstructured Information Management Architecture (UIMA) is an XML-based data retrieval architecture under development at IBM. UIMA will greatly expand and enhance the retrieval techniques underlying databases, said Alfred Spector, vice president of services and software at IBM's Research division.
UIMA "is something that becomes part of a database, or, more likely, something that databases access," he said. "You can sense things almost all the time. You can effect change in automated or human systems much more."
Once incorporated into systems, UIMA could allow cars to obtain and display real-time data on traffic conditions and on average auto speeds on freeways, or it could let factories regulate their own fuel consumption and optimally schedule activities. Automated language translation and natural language processing also would become feasible.
The theory underlying UIMA is the Combination Hypothesis, which states that statistical machine learning--the sort of data-ranking intelligence behind search site Google--syntactical artificial intelligence, and other techniques can be married in the relatively near future.
"If we apply in parallel the techniques that different artificial intelligence schools have been proponents of, we will achieve a multiplicative reduction in error rates," Spector said. "We're beginning to apply the Combination Hypothesis, and that is going to happen a lot this year. I think you will begin to see this rolling out in technologies that people use over the next few years. It isn't that far away.
"There is more progress in this happening than has happened, despite the fact that the Nasdaq is off its peak," he added.
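To see what a "multiplicative reduction in error rates" could mean in practice, consider a rough sketch. It assumes, purely for illustration, that each technique makes its mistakes independently of the others and that the combined system fails only when every technique fails at once. Neither assumption comes from IBM, and the error rates below are hypothetical.

```python
from math import prod


def combined_error_rate(error_rates):
    """Chance that every technique is wrong at the same time.

    Illustrative assumption: errors are independent, and the combined
    system fails only when all of its component techniques fail.
    """
    for rate in error_rates:
        if not 0.0 <= rate <= 1.0:
            raise ValueError("error rates must be between 0 and 1")
    return prod(error_rates)


# Hypothetical error rates for three techniques run in parallel
# (for example statistical, syntactical and semantic).
rates = [0.2, 0.1, 0.25]
combined = combined_error_rate(rates)
print(f"combined error rate: {combined:.4f}")
```

Under those assumptions, three techniques that are individually wrong 20, 10 and 25 percent of the time would together be wrong about 0.5 percent of the time. Real techniques tend to share some of their mistakes, so the actual gain would likely be smaller.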
The results of current, major UIMA experiments will be disclosed to analysts around March, with public disclosures to follow, sources at IBM said.
Although it's been alternately touted and debunked, the era of functional artificial intelligence may be dawning. For one thing, the processing power and data-storage capabilities required for thinking machines are now coming into existence.
Researchers have also sharpened the algorithms and concepts behind artificially intelligent software.
Additionally, the explosive growth of the Internet has created a need for machines that can function relatively autonomously. In the future, both businesses and individuals simply will own far more computers than they can manage--spitting out more data than people will be able to mentally absorb on their own. The types of data on the Net--audio, text, visual--will also continue to grow.
XML, meanwhile, provides an easy way to share and classify data, which makes it easier to bring intelligence technology into the computing environment. "The database industry will undergo more change in the next three years than it has in the last 20 due to the emergence of XML," Spector said.
A new order
Artificial intelligence in a sense will function like a filter. Sensors will gather data from the outside world and send it to a computer, which in turn will issue the appropriate actions, alerting its human owners only when necessary.
IBM's approach to artificial intelligence has been decidedly agnostic. There are roughly two basic schools of thought in artificial intelligence. Statistical learning advocates believe that the best guide for thinking machines is memory.
Based in part on the mathematical theories of 18th century clergyman Thomas Bayes, statistical theory essentially states that the future, or current events, can be identified by what occurred in the past. Google search results, for example, are ranked lists of sites that other individuals examined after posing similar queries. Voice-recognition applications work under the same principle.
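The statistical idea described above is usually formalized as Bayes' rule, which updates how likely each explanation is once new evidence arrives. The sketch below applies it to a toy voice-recognition choice between two similar-sounding phrases. The phrases, the probabilities and the `posterior` helper are all hypothetical, chosen only to show the arithmetic.

```python
def posterior(priors, likelihoods):
    """Apply Bayes' rule: P(h | e) = P(e | h) * P(h) / P(e).

    priors      -- P(h): how often each hypothesis was right in the past
    likelihoods -- P(e | h): how well the new evidence fits each hypothesis
    """
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(joint.values())  # P(e), the overall chance of the evidence
    if total == 0:
        raise ValueError("evidence is impossible under every hypothesis")
    return {h: p / total for h, p in joint.items()}


# Hypothetical numbers: past usage strongly favors the first phrase,
# but the audio just heard fits the second phrase better.
priors = {"recognize speech": 0.9, "wreck a nice beach": 0.1}
likelihoods = {"recognize speech": 0.2, "wreck a nice beach": 0.6}

result = posterior(priors, likelihoods)
for phrase, probability in result.items():
    print(f"{phrase}: {probability:.2f}")
```

With these made-up figures, past experience still wins: the more common phrase ends up at about 75 percent even though the audio alone pointed the other way.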
By contrast, rules-based intelligence advocates, broken down into syntactical and grammatical schools of thought, believe that machines work better when they are more aware of context.
A search for "Italian Pet Rock" on a statistically intelligent search engine, for example, might return sites about the 1970s novelty. A rules-based application, by contrast, might realize you mistyped the Italian poet Petrarch. A Google search on UIMA turned up the Ukrainian Institute of Modern Art as the first selection.
"The combination of grammatical, statistical, advanced statistical (and) semantics will probably be needed to do this, but you can't do it without a common architecture," Spector said. Thinking in humans, after all, isn't completely understood.
"It's not exactly clear how children learn. I'm convinced it's statistically initially, but then at a certain point you will see...it is not just statistical," he said. "They are reasoning. It's remarkable."