
Language barriers on the Web?

The W3C introduces a second version of XHTML that won't be compatible with its predecessors, completing a migration to XML while creating anxiety among Web developers.

Paul Festa Staff Writer, CNET News.com
Paul Festa covers browser development and Web standards.
As the Web marches into the future, some developers say they're concerned about what will become of its past.

At stake are new specifications approved by the Web's leading standards body that would complete the transition from HTML to XML as the fundamental language used to build Web pages.

The changes, which affect a budding Web design language known as XHTML, were approved by the World Wide Web Consortium (W3C) last week amid a flurry of documents on topics ranging from voice browsers to TV style sheets. One document, a second edition of XHTML 1.0, corrects errors in the published recommendation, while another, a working draft of XHTML 2.0, marks a significant departure from its predecessors.

Of primary concern to some Web developers is the W3C's warning that XHTML 2.0 will not be "backward compatible" with HTML 4.0 and XHTML 1.0. That alert has raised concern that billions of existing Web pages risk obsolescence unless they are translated to the new Web language.

Most developers see any significant clash between the old and new languages as a long way off. But some say the lack of compatibility will immediately hold them back from switching to XHTML 2.0, a reluctance that could complicate what many see as a necessary evolution for the Web.

"I'm really hesitant over the line in the new spec (v2) that reads, 'While the ancestry of XHTML 2 comes from HTML 4, XHTML 1.0, and XHTML 1.1, it is not intended to be backward compatible with its earlier versions,'" Frances Currit-Dhaseleer, a technical trainer and Webmaster in Colorado Springs, Colo., wrote in an e-mail interview. "What exactly does this mean? Does this mean that everything else is obsolete? If that's the case, it'll be a long time before I move over to XHTML."

XHTML, first recommended by the W3C in January 2000, attempted to redesign the Web's standard markup language from HTML, which was increasingly considered a jerry-built, relatively unstructured improvisation on the earlier Standard Generalized Markup Language (SGML).

The W3C's goal was to start translating the Web into Extensible Markup Language, or XML, a highly flexible but also tightly structured technology that lets developers create task- or industry-specific markup languages while cracking down on basic syntactical rules that HTML left open-ended.

The trouble with HTML's permissiveness was that Web browsers were required to make assumptions about Web authors' intentions, leading to bloated code.
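The difference in strictness can be sketched in a few lines of Python. This is an illustration only, not something taken from the W3C documents: the two markup fragments are hypothetical, and the standard library's XML parser stands in for any XML-based tool. An HTML browser would guess where the unclosed tags in the first fragment end; an XML parser simply rejects it.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragments, invented for this illustration.
LOOSE_HTML = "<p>First paragraph<p>Second paragraph<br>"
STRICT_XHTML = "<div><p>First paragraph</p><p>Second paragraph<br/></p></div>"


def is_well_formed(markup: str) -> bool:
    """Return True if an XML parser accepts the markup without guessing."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(LOOSE_HTML))
print(is_well_formed(STRICT_XHTML))
```

The first fragment fails because its tags are never closed; the second passes because every element is explicitly closed.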

The HTML-XML hybrid
At a W3C meeting four years ago, the consortium's members decided on a gradual shift away from HTML. XHTML 1.0 was created as a hybrid to start weaning the Web's developers and authoring tool makers away from the legacy markup language and over to a new XML-based future.

"XHTML 1.0 was designed as a bridge between HTML and XML," said Ann Navarro, president and founder of WebGeek, a consulting firm in Port Charlotte, Fla., and an editor of the XHTML 2.0 working draft. "XHTML 2.0 is 'the other side' of that bridge, dropping much of the deprecated content and moving forward into new, more 'XML' methods of accomplishing tasks."

It may be a while before Web authors and surfers begin to encounter evidence of XHTML 2.0 incompatibility with legacy HTML and XHTML 1.0 code.

That will occur only when browsers start supporting XHTML 2.0 and stop supporting its predecessors, a process that will likely take years after the W3C finally issues the new specification as a recommendation.

But Navarro warned that it was only a matter of time before that transformation occurred.

"There's going to be a cut-off point," Navarro said. "I don't know when it's going to be, in the next version of the browsers or at some other time, but there will be a cut-off point. If we're going to move the Web to XML, we've got to move it."

Analysts agreed, calling XHTML the only way forward.

"HTML is dead," said Uttam Narsu, an analyst with the Giga Information Group. "Web developers have to accept this and move on to XHTML."

Along those lines, XHTML 2.0 offers a trio of new capabilities that may entice Web developers to start making the switch before they absolutely have to.

XHTML 2.0 introduces the idea of document sections with generic heading elements. That means that sections, as in an outline, can be nested to any depth, and each heading can be associated with its depth in the hierarchy. That way, for example, a table of contents could assign attributes or styles to every chapter, while every chapter subsection would have its own attributes.
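As a rough sketch of how nesting can stand in for numbered heading levels, the Python below walks a small document and derives each heading's depth from the sections that enclose it. The sample document is invented for illustration and is simplified (no XML namespace or DOCTYPE); it assumes the working draft's section and h element names, and the outline function is a hypothetical helper, not part of any specification.

```python
import xml.etree.ElementTree as ET

# Illustrative, simplified document using <section> and a generic <h> heading.
DOC = """
<body>
  <section>
    <h>Chapter 1</h>
    <section>
      <h>Background</h>
      <section>
        <h>Early history</h>
      </section>
    </section>
  </section>
  <section>
    <h>Chapter 2</h>
  </section>
</body>
"""


def outline(element, depth=0):
    """Yield (depth, heading text) pairs, deriving depth from section nesting."""
    for child in element:
        if child.tag == "section":
            yield from outline(child, depth + 1)
        elif child.tag == "h":
            yield depth, (child.text or "").strip()


# Print an indented table of contents.
for depth, title in outline(ET.fromstring(DOC)):
    print("  " * (depth - 1) + title)
```

Because the depth comes from the structure rather than from the tag name, a section can be moved deeper or shallower without renaming its heading.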

Version 2.0 also introduces the navigation list element. This lets Web authors activate and style a list of links to other pages from within a Web page, something that now requires scripting.

The working draft incorporates XForms, a technology introduced shortly after XHTML that makes Web forms more portable to small Web-access devices and compatible with XML. XForms are dynamic, meaning that their queries can change based on new information as it is entered and validated.

Riding the XHTML wave
There's some evidence that Web authoring tool developers--a crucial link in the move to get any W3C recommendation into widespread use--are starting to ride the XHTML wave. In its new Dreamweaver MX Web authoring software, Macromedia offers a tool that automatically translates HTML into XHTML.
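What such a translation involves can be sketched briefly. The Python below is not Macromedia's method, which the article does not describe; it is a minimal illustration, built on the standard library's HTML parser, of three typical clean-ups: lowercasing tag names, quoting attribute values and closing empty elements. The class and function names are hypothetical, and the list of empty elements is deliberately partial.

```python
from html import escape
from html.parser import HTMLParser

# A partial list of elements that HTML lets stand alone without a closing tag.
VOID_ELEMENTS = {"br", "hr", "img", "input", "meta", "link"}


class XhtmlWriter(HTMLParser):
    """Re-emit HTML with lowercase tags, quoted attributes and closed empty elements."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # The parser has already lowercased tag and attribute names.
        rendered = "".join(
            # A bare attribute such as "checked" gets its own name as its value.
            f' {name}="{escape(value if value is not None else name)}"'
            for name, value in attrs
        )
        closer = " />" if tag in VOID_ELEMENTS else ">"
        self.parts.append(f"<{tag}{rendered}{closer}")

    def handle_endtag(self, tag):
        # Empty elements were already closed when they were opened.
        if tag not in VOID_ELEMENTS:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        self.parts.append(escape(data, quote=False))


def to_xhtml_style(html_text: str) -> str:
    writer = XhtmlWriter()
    writer.feed(html_text)
    writer.close()
    return "".join(writer.parts)


print(to_xhtml_style("<P ALIGN=center>Hello<BR>world &amp; all</P>"))
```

This sketch does not insert closing tags that the author omitted, nor does it handle comments or DOCTYPE declarations, so a real conversion tool has considerably more work to do.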

Some time is bound to pass, however, before developers and authoring tools adjust to the shift from permissive HTML to its strict successor.

"Sometimes developers use quirks of HTML to improve layout, design, etc," Ronald Schmelzer, analyst with Waltham, Mass.-based XML consulting firm ZapThink, wrote in an e-mail interview. "XHTML doesn't tolerate deviations from the specification, making it a bit more rigid as well as complex. What will be the challenge for XHTML is getting HTML designers to think like programmers--not an easy task."

Ad hoc XHTML development will be difficult, Schmelzer said, and designers who aren't up to par will be forced to use tools that "may not be mature enough to handle their design wishes, leading to somewhat of an impasse for XHTML: more complex than HTML, but less mature."

Analysts and developers called the burgeoning language's new features useful but probably insufficient to lure Web authors in large numbers.

"The big problem with XHTML 2.0 is that it's not backward compatible," Narsu said. "And I don't see any killer feature that makes a site author say 'I gotta have that.'"

One Web developer agreed that demand for the new markup language was unlikely to spike anytime soon--in part because XML itself has not become ubiquitous.

"I'm not actively using or converting code to XHTML for the simple reason that virtually none of our partners or clients are remotely close to using XML technologies in their most important applications," wrote Brian Schmidt, an independent software consultant based in Los Angeles. "Since they are sticking to traditional J2EE and Windows DNA architectures, so will we.

"When more critical applications depend on XML technologies, I'm quite certain we will be using XHTML extensively. But there's a big difference between experimenting with new technologies for a pilot project and implementing a critical Web-based system for a client who expects nearly 100 percent uptime and virtually no bugs," Schmidt said.

In other news, the W3C released a number of new drafts in various stages of completion.

The voice browser working group published the first working draft of Voice Browser Interoperation: Requirements, which details what voice browsers must have to share information about users and sessions. The working group is soliciting comments on the draft.

W3C working groups devoted to HTML and Scalable Vector Graphics (SVG) released a working draft, their second, of "An XHTML + MathML + SVG Profile." The profile shows how to combine elements of XHTML, MathML and SVG in a single document. The HTML and Graphics activities are soliciting comments.

The W3C promoted its CSS TV Profile 1.0 to the status of candidate recommendation, the penultimate stage in the W3C recommendation process. A subset of Cascading Style Sheets (CSS) Level 2 and the CSS3 module: Color, the profile is designed for use with devices that display interactive content on a TV screen. The W3C is asking for comments through the month of January.

The W3C published the first working draft of XFrames, an XML application meant to replace HTML frames while fixing search and security problems posed by HTML frames.