
IBM dives deeper into corporate search

Big Blue is set to release software designed to help people find connections between business documents.

Elinor Mills Former Staff Writer
IBM is promoting a new standard to allow interoperability between software that helps corporations search for and analyze unstructured data across their corporate networks. Such data includes e-mails, Word documents and anything else that is not formatted in columns and rows.

The company was set to release on Monday a new version of its WebSphere Information Integration OmniFind Edition corporate information management tool. It integrates technology called Unstructured Information Management Architecture (UIMA) that IBM designed to improve the processing of text within documents and other unstructured content sources to help find relationships and meaning beyond just keywords.

IBM, a longtime supporter of the open-source movement, in which developers freely write, modify and share code, is also presenting UIMA to the Open Source Technology Group, a network of online technology resources. The updated software tool is available from IBM now and is expected to be available through the SourceForge developers' Web site by the end of the year.

"IBM has been investing in a huge initiative since 2001 in information integration to help companies integrate and find any information that exists across the enterprise," said Nelson Mattos, IBM's vice president of Information Integration.

"That's the number one problem in the enterprise world," he said, adding that studies show that workers spend on average 30 percent of their time looking for relevant information. The problem is exacerbated by the fact that about 85 percent of corporate data is unstructured and thus not easy to find, Mattos said.

More than 15 companies already have said they plan to support UIMA as a framework for search and text analysis of unstructured data, IBM said.

Projects currently using IBM's WebSphere Information Integration OmniFind include a quality-control early-warning system for the automotive industry to process warranty claims, repair requests and call-center logs that can help identify problems, and an advanced intelligence system for antiterrorism and law enforcement.

"There are lots of different ways to skin a cat when it comes to analyzing unstructured text, but all those ways only give you a sneak peek at what you might get," said Dana Gardner, an analyst at Interarbor Solutions. By using UIMA, companies get a more comprehensive extraction of the information they seek, he said.

"It probably will take some time for the various commercial products to put this software developers' kit to use and allow for their products to take part in the interoperability process," Gardner added.