Semantic technology gains publishing foothold
Technology that lets computers better comprehend the meaning of digital content is gaining some adoption among Web publishers.
OpenCalais, a Thomson Reuters project to improve electronic publishing by adding computer-readable labels to content, has attracted the attention of several media publishing organizations, including CNET.
The OpenCalais product, available in a free or a more sophisticated paid form, adds labels to content through a technology called semantic analysis. With descriptive labels attached, computers can, at least theoretically, understand what they're processing beyond just the raw text in a news story or photo caption, for example by recognizing addresses or names.
CNET, publisher of CNET News, is using OpenCalais' service to augment its product reviews and news, the companies plan to announce Thursday. CNET will use the technology to improve features such as searching, spotlighting content related to what a reader is viewing, and enabling programmatic use of its content over the Web.
Others using the technology include The Huffington Post and DailyMe, two other online news sites. DailyMe automatically sends its content through OpenCalais' servers, which label the content with categories such as people, medical conditions, or companies and with specific elements of those categories, said Neil Budde, DailyMe's president and chief product officer.
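To make the idea of category labels concrete, here is a minimal sketch of entity tagging in Python. It is not OpenCalais' method or API: the lookup table, the category names, and the `tag_entities` function are all hypothetical, and a real semantic analysis service would rely on far richer language models than simple name matching.

```python
import re

# Hypothetical lookup table: category -> known entity names.
# This is an illustration only, not how OpenCalais works internally.
GAZETTEER = {
    "Company": ["Thomson Reuters", "Yahoo", "Google"],
    "Person": ["Neil Budde", "Paul Perry"],
    "City": ["Chicago", "San Jose"],
}


def tag_entities(text):
    """Return a sorted list of (category, entity) labels found in text."""
    labels = []
    for category, names in GAZETTEER.items():
        for name in names:
            # Whole-word, case-sensitive match on the entity name.
            if re.search(r"\b" + re.escape(name) + r"\b", text):
                labels.append((category, name))
    return sorted(labels)


story = "Paul Perry said Chicago is the first city with localized news."
for category, entity in tag_entities(story):
    print(f"{category}: {entity}")
```

Each label pairs a broad category with the specific element found in the text, which is the kind of structure a publisher can then use for search, related-content links, or personalization.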
"It allows us to build picture of news user's behavior to implicitly personalize the site for them," Budde said, adding that automated personalization features are scheduled to arrive in about a month. The company plans to license its service to other news sites, he added, and improve advertising targeting based on the same personalization information.
A closely related technology, the semantic Web, in which elements of Web pages are labeled with computer-readable coding to help computers better understand the meaning of the content, has been around for years. It's only now beginning to gain adoption as a real-world technology, though, for two big reasons: Yahoo and Google.
A year ago, Yahoo began supporting semantic Web labels on Web pages, and publishers could spruce up those pages' appearance in search results through Yahoo's SearchMonkey technology. Then, in May, Google added its own support, covering both indexing and display of pages in search results. OpenCalais, however, offers technology that creates online content that search engines discover through conventional means of analyzing text.
The tagging seems to help search engines find the company's content and spotlight it in search results, Budde said. "We create a lot of topics pages on the fly based on entities that come in from Calais, and those get pretty good pickup through search engines definitely," he said.
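As a rough illustration of how topic pages could be assembled from entity labels, the sketch below groups stories by the entities attached to them. The headlines, the input format, and the `build_topic_pages` function are hypothetical; DailyMe's actual system is not described in detail here.

```python
from collections import defaultdict


def build_topic_pages(tagged_stories):
    """Map each entity label to the headlines of the stories that mention it."""
    pages = defaultdict(list)
    for headline, entities in tagged_stories:
        for entity in entities:
            pages[entity].append(headline)
    return dict(pages)


# Hypothetical input: (headline, entity labels) pairs as a tagger might return.
tagged_stories = [
    ("Search engines adopt semantic markup", ["Yahoo", "Google"]),
    ("Yahoo updates SearchMonkey", ["Yahoo"]),
]

pages = build_topic_pages(tagged_stories)
for topic in sorted(pages):
    print(topic, "->", pages[topic])
```

Because the pages are keyed by entity, a new topic page appears as soon as a story arrives carrying a label that has not been seen before.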
Paul Perry, The Huffington Post's chief technology officer, has begun using OpenCalais' service in the company's publishing system. When a story mentions a specific location or company, for example, OpenCalais' service suggests that editors associate the story with a specific geographic location or add a specific company's stock ticker, Perry said.
That explicit labeling makes it easier for local editors--Chicago so far is the only city with localized Huffington Post news, though more areas will arrive this summer--to spot geographically relevant information, he said. "For us, local is super important. We're doing a ton of work for it," he said.
Semantic technology fans will convene starting June 14 for the Semantic Technology Conference in San Jose, Calif., at which Thomson Reuters' Tom Tague is scheduled to deliver a keynote speech.