Mark-up language wins praise

Though it's not yet widespread on the Net, for corporate intranets and server-based applications, XML offers immediate and compelling benefits.

Paul Festa Staff Writer, CNET News.com
CHICAGO--XML appears to be a long way from widespread use on the Internet.

But for corporate intranets and server-based applications, the technology offers immediate and compelling benefits, panelists at the Internet World conference here said yesterday.

During a day-long workshop entitled "XML Xposed," conference attendees heard ringing endorsements of Extensible Markup Language (XML) as a tool for distributing information from databases to intranets and to Web servers.

Webmasters were warned that in building their sites they should be preparing for a world in which XML-based languages are dominant on the Web.

XML is a metalanguage, that is, a language for defining mark-up languages. It allows individual developers or whole industries to design specific tagging languages based on it. One example is MathML, whose tags designate elements on the page specific to mathematics. Another is CML, or Chemical Markup Language, with tags specific to chemistry.

A primary goal of XML is to tag information throughout a document so that it is easier to find and retrieve. Anyone who has come up with hundreds of thousands of mostly irrelevant results on a search engine query can understand the value of such a technology to the Web.
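
To illustrate how tags can narrow a search, here is a minimal Python sketch. The `<catalog>` vocabulary and all of its element names are invented for this example and do not come from any real standard. The point is only that a query can target a tagged field instead of matching a word wherever it happens to appear.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the tag names below are made up for illustration.
DOC = """
<catalog>
  <book><title>Mercury Rising</title><subject>astronomy</subject></book>
  <book><title>Heavy Metals</title><subject>chemistry</subject>
        <summary>Covers mercury, lead, and cadmium.</summary></book>
  <book><title>Roman Gods</title><subject>mythology</subject>
        <summary>Jupiter, Mars, Mercury, and others.</summary></book>
</catalog>
"""

def full_text_search(root, word):
    """Return titles of books that mention `word` anywhere in their text."""
    word = word.lower()
    return [book.findtext("title") for book in root.iter("book")
            if word in "".join(book.itertext()).lower()]

def tagged_search(root, tag, value):
    """Return titles of books whose `tag` child element equals `value`."""
    return [book.findtext("title") for book in root.iter("book")
            if book.findtext(tag) == value]

root = ET.fromstring(DOC)
all_hits = full_text_search(root, "mercury")              # matches every book
chem_hits = tagged_search(root, "subject", "chemistry")   # matches one book
```

The untagged search returns every book that mentions the word, while the tagged search returns only the one labeled as chemistry.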

But XML faces a number of hurdles before the average Web surfer can use it to hone his or her searches. One such hurdle is lack of browser support. Browser market leader Netscape Communications does not yet make a browser that supports XML. And while Microsoft's Internet Explorer 4.0 browser supports XML, that browser was built before the first version of XML received a final recommendation from the World Wide Web Consortium.

In addition to adequate browser support, XML still awaits a W3C recommendation for an XML-specific Document Object Model (DOM) specification. A DOM is a standard interface that lets programs and scripts access content written, in this case, in XML. Currently, developers can write their own proprietary application programming interfaces (APIs) to do this, but a standard XML DOM with the W3C stamp of approval will make XML documents more universally accessible.
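
As a rough picture of what a standard DOM offers, the sketch below uses Python's standard-library `xml.dom.minidom`, a small implementation of the W3C DOM interface that appeared after the specification was finalized. The `<order>` document and its attribute names are assumptions made up for the example.

```python
from xml.dom.minidom import parseString

# Hypothetical XML document; element and attribute names are illustrative only.
DOC = ("<order id='42'>"
       "<item qty='3'>widget</item>"
       "<item qty='1'>gadget</item>"
       "</order>")

dom = parseString(DOC)

# documentElement, getAttribute, and getElementsByTagName are DOM-defined
# names, so a script written against them is not tied to one vendor's API.
root = dom.documentElement
order_id = root.getAttribute("id")

items = [(el.firstChild.data, int(el.getAttribute("qty")))
         for el in dom.getElementsByTagName("item")]
```

Because the method names come from the DOM rather than from a particular parser, the same access pattern carries over to other DOM implementations with little change.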

The W3C will release its final working draft on the DOM in a matter of weeks, panel member and DOM working group chair Lauren Wood said yesterday. The next and final step for the W3C is to issue a DOM recommendation.

But while the Web awaits the DOM, browser support, and other more sophisticated XML-related standards having to do with style, linking, querying, and other matters, corporate databases are crying out for XML, panel members said.

"Everyone should be looking at XML," said David Turner, the XML "evangelist" with Microsoft's developer relations group. "There's no doubt that some pieces are missing for building applications, but some people and companies are already building today."

One example Turner gave was Shell Oil, which uses an XML-based database application built by RivCom to perform employee competency gap analysis.

Another example from the panel, this one hypothetical, was the use of an XML-based markup language to describe the elements of a purchase order. A commerce-specific mark-up language could be used by any vendor, saving companies the work of reinventing the wheel with every new application.
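
The panel's purchase-order idea might look something like the following Python sketch. Every name in it (`purchaseOrder`, `buyer`, `line`, `unitPrice`, and the helper functions) is hypothetical; no actual commerce vocabulary is being quoted. One function stands in for the buyer producing the document and another for a vendor reading it.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

def build_purchase_order(po_number, buyer, lines):
    """Serialize an order to XML. `lines` holds (sku, quantity, unit_price_str)."""
    po = ET.Element("purchaseOrder", number=po_number)
    ET.SubElement(po, "buyer").text = buyer
    for sku, quantity, unit_price in lines:
        line = ET.SubElement(po, "line", sku=sku, quantity=str(quantity))
        ET.SubElement(line, "unitPrice").text = unit_price
    return ET.tostring(po, encoding="unicode")

def order_total(xml_text):
    """Compute the order total from the XML alone, as a receiving vendor might."""
    po = ET.fromstring(xml_text)
    return sum(int(line.get("quantity")) * Decimal(line.findtext("unitPrice"))
               for line in po.iter("line"))

# Prices are passed as strings so Decimal can parse them without float rounding.
xml_text = build_purchase_order(
    "PO-1001", "Example Corp", [("A-1", 2, "9.50"), ("B-7", 1, "20.00")])
total = order_total(xml_text)
```

The receiving side needs only the agreed tag names, not the sender's software, which is the reuse the panelists had in mind.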

"XML could be the last general purpose protocol ever," said panel member and XMLU president Brian Travis.

Microsoft's Turner also noted that while XML may be distant for client-side delivery, corporate Web sites can benefit from the metalanguage now, as it is used in the transfer of information between a database and a Web server.
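
A minimal sketch of that database-to-server hand-off, using Python's built-in `sqlite3` module with an in-memory table as a stand-in for a corporate database. The table, the column names, and the `query_to_xml` helper are assumptions for illustration, not a description of any product mentioned by the panel.

```python
import sqlite3
import xml.etree.ElementTree as ET

def query_to_xml(conn, query, root_tag, row_tag):
    """Run `query` and wrap each result row in XML, one child element per column.

    Assumes the column names are valid XML element names.
    """
    cursor = conn.execute(query)
    columns = [description[0] for description in cursor.description]
    root = ET.Element(root_tag)
    for row in cursor:
        row_el = ET.SubElement(root, row_tag)
        for name, value in zip(columns, row):
            ET.SubElement(row_el, name).text = "" if value is None else str(value)
    return ET.tostring(root, encoding="unicode")

# Stand-in data source: an in-memory SQLite table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, dept TEXT)")
conn.executemany("INSERT INTO employees VALUES (?, ?)",
                 [("Ada", "R&D"), ("Lin", "Sales")])

xml_text = query_to_xml(conn, "SELECT name, dept FROM employees ORDER BY name",
                        "employees", "employee")
```

The serializer escapes special characters such as `&`, so a Web server receiving `xml_text` can parse it without knowing anything about the database behind it.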

Webmasters would also do well, Turner and other panelists advised, to start learning about the structure of XML so that when it becomes a common Web technology, they can make the transition with the least amount of work.

"Make sure you're not designing your Web site in a way that will make it harder to switch to XML," Turner advised. "You should ask yourself, 'How do I build my data structures? How do I get information out of data repositories?' Find out what tools you need to build XML structures."

Some sites will do fine with standard HTML, panel members said, depending on how their information will be used. If the information will not be reused or sold in electronic form, the transition to XML will be less urgent.

One of the reasons Turner is confident in XML's future is that his own company is so strongly behind it.

"Virtually every application shipping from Microsoft in the next couple of years will use XML for its own purposes," Turner said.

One upcoming example is Word, Microsoft's dominant word-processing application. Using XML tags, it will let users save a Word document as an HTML document, then make the round trip back to Word. The information about the style and format of the original Word document--the "metadata"--will be carried in XML data islands, which are blocks of XML embedded in the HTML document.

One thing standing in the way of XML's widespread use in the intranet environment is a dearth of tools designed for it. One such tool, yet to be developed by database vendors, will let data repositories automatically generate information in XML, Turner said. Another underdeveloped area is XML editing tools, although at least one firm, Vervet Logic, now produces one based on Java.

One reason XML is a natural fit in corporate intranets, said Travis, is that it has taken the place of its parent metalanguage, Standard Generalized Markup Language (SGML). SGML has been in fairly common use with corporate databases for the past ten years, and several of the attendees' questions concerned the transition from SGML to its slimmed-down successor.