Caching comes at a price

Although it helps alleviate the universal complaint of online traffic jams, this technological progress faces opposition from some major site producers.

An array of new caching technologies hitting the market this year promise to speed access to popular or graphic-rich Net sites.

That's the good news.

Although it helps alleviate the universal complaint of online traffic jams, this technological progress is facing opposition from some major Web site producers. The problem, they say, is that caching means fewer updates--and fewer updates mean fewer new ads that can appear on pages.

"If the world turned into a situation where 90 percent of the traffic was served this way, our traffic would go down," said Patrick Naughton, president and chief technology officer of Starwave, which produces ABCnews.com, ESPN Sportszone, and Mr. Showbiz, among other sites. "It simply can not go the way these guys expect it to go."

The "guys" he's talking about are the industry heavyweights that are pitching cache products to corporations, Internet service providers, and bandwidth suppliers. His concerns echo those raised about "push" technology last year. With Intel's release of its highly publicized Quick Web technology Monday, the perils of caching were once again brought into sharp focus for many of the Web's hottest destinations.

The practice of caching speeds surfing by storing frequently used pages or information, obviating the need to download the data directly whenever it is requested from the host server. The technique has been used for years by such online stalwarts as America Online to offset perennial congestion on their services.

Caching comes in many forms, such as technologies used by companies to trace the network path to their employees' favorite Web sites. In another example, a Web browser cache may store pages, graphics, or site addresses on a Net user's hard drive to lessen the downloading process.

But just as the technology has done in the past, the new wave of caching is presenting old pitfalls for news services and other content sites that need to deliver the most current information to Net users at all times.

Moreover, some content sites worry that caching makes it impossible to track all unique visitors, preventing the collection of crucial figures for many Net advertising revenue models. The growing list of cache products raises yet another nettlesome issue: the debate over the legality of stashing temporary copies of others' online intellectual property.

"None of them do a good job reporting usage statistics back to the publisher of the content. It's like they are stealing the content when one user accesses it from our site, delivering it to a hundred other people, and we have no ability to count those people," Starwave's Naughton charged. (Paul Allen, founder of Starwave, is an investor in CNET: The Computer Network.)

As with push technology, content providers fear they can't track eyeballs for advertisers if their Web pages are cached. Also, many sites deliver tailored ads to unique visitors and base their ad prices on the number of people who are hit with the ad. Proxy servers, which cache material, often eliminate the need for Net users to hit a site every time they request a graphic or text link by clicking on an item. Thus, a site's traffic and "ads delivered" numbers could be grossly inaccurate if multiple users are hitting the proxy server copy instead of the actual site.
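The undercounting that publishers describe can be illustrated with a toy model. The Python sketch below is only an illustration under stated assumptions: the class names `OriginSite` and `CachingProxy` and the URL `/front-page` are hypothetical and do not correspond to any vendor's product. It shows a proxy that keeps one copy of a page and serves it to every later visitor, so the publisher's own hit counter records a single request.

```python
class OriginSite:
    """Hypothetical publisher server that counts every request it receives."""

    def __init__(self):
        self.hits = 0

    def fetch(self, url):
        self.hits += 1  # the only traffic the publisher can measure
        return f"<html>content of {url}</html>"


class CachingProxy:
    """Hypothetical proxy that stores the first copy of each page it fetches."""

    def __init__(self, origin):
        self.origin = origin
        self.store = {}
        self.requests_served = 0

    def get(self, url):
        self.requests_served += 1
        if url not in self.store:
            # Cache miss: go back to the publisher exactly once.
            self.store[url] = self.origin.fetch(url)
        return self.store[url]


origin = OriginSite()
proxy = CachingProxy(origin)

# One hundred users behind the same proxy request the same page.
for _ in range(100):
    proxy.get("/front-page")

print(origin.hits, proxy.requests_served)  # 1 100
```

In this simplified picture the proxy has served 100 requests while the publisher has logged one, which is the gap between actual audience and recorded audience that the sites are complaining about.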

But many cache product makers say Web publishers' fears about the technology are unfounded. These companies say the many forms of caching result in one thing for the user: a better online experience.

Those same content providers have the most to gain from a faster, crisper Internet, according to cache technology companies such as Intel, Inktomi, and CacheFlow.

Intel's Quick Web product joins a host of other caching solutions offered by Cisco Systems, Novell, Microsoft, Netscape, CacheFlow, PeakSoft, and Inktomi, which has partnered with Sun Microsystems to sell its product. (Intel is an investor in CNET.)

Quick Web allows Net service providers to charge a fee to improve customers' access to graphic-heavy Web sites. When participating Net users hit a site, Quick Web compresses huge graphics and caches them on a server based at the ISP. So when another customer of the same ISP hits the site, the graphics already will be stored locally--preventing a time-consuming trip back across the Net to gather the same data.

In addition, cache companies offer a solution to the tracking issue. By simply applying an HTML tag on all graphics and text within a Web site, content providers can prohibit their timely content or dynamically generated advertisements from being cached.

As long as a proxy server honors what is known as the "Pragma general-header field," an HTTP header that can be flagged "no-cache," a site's components will not be sucked into a caching system.
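How a well-behaved proxy might honor that flag can be sketched in a few lines of Python. This is a minimal sketch, assuming the directive reaches the proxy as a `Pragma: no-cache` response header. The names `is_cacheable`, `HonoringProxy`, and `fake_origin`, along with the two URLs, are hypothetical and are not drawn from any product mentioned here.

```python
def is_cacheable(headers):
    """Return False when a response carries a 'no-cache' Pragma directive."""
    for name, value in headers.items():
        if name.lower() == "pragma":
            directives = [d.strip().lower() for d in value.split(",")]
            if "no-cache" in directives:
                return False
    return True


class HonoringProxy:
    """Hypothetical proxy that stores only responses the publisher allows."""

    def __init__(self, fetch):
        self.fetch = fetch  # callable: url -> (headers, body)
        self.store = {}

    def get(self, url):
        if url in self.store:
            return self.store[url]
        headers, body = self.fetch(url)
        if is_cacheable(headers):
            self.store[url] = body
        return body


origin_hits = {}


def fake_origin(url):
    """Stand-in publisher: flags HTML pages 'no-cache', leaves graphics alone."""
    origin_hits[url] = origin_hits.get(url, 0) + 1
    if url.endswith(".html"):
        return {"Pragma": "no-cache"}, f"fresh copy #{origin_hits[url]} of {url}"
    return {}, f"bytes of {url}"


proxy = HonoringProxy(fake_origin)
for _ in range(3):
    proxy.get("/logo.gif")
    proxy.get("/news.html")

print(origin_hits)  # {'/logo.gif': 1, '/news.html': 3}
```

Under these assumptions the unflagged graphic is fetched once and then served locally, while every request for the flagged page goes back to the publisher, where it can be counted and paired with a fresh ad.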

"What it really comes down to is economics. Is someone being deprived of payment for their content or losing control of their content? If a publisher says don't cache my page, we don't cache it," said Kevin Brown, director of marketing for Inktomi, the maker of Traffic Server, which is said to cache more than a terabyte of data and is marketed to large ISPs, backbone carriers and telecommunications companies.

"The ideal situation," he added, "is where we can cache a majority of the graphics for a site but not interfere with a site's core business model. We could still cache the icons, buttons, and graphics, which are the majority of the bytes of the page, but still go back to their server every time to get the fresh content."

Starwave tags all of its site components "no-cache." So does MSNBC.

"For the most part, if the proxy server honors the request not to cache then everything works perfectly. All the stuff we don't want cached, we have tagged that way," said Charles Simon, a group program manager for MSNBC.

Time Warner's Pathfinder recently began testing the "no-cache" tag on traffic originating from AOL, which caches an array of Net content for its more than 11 million subscribers. The outcome of Time Warner's experiment underscores why heavily trafficked sites often try to bypass proxy servers.

"We did this for Sports Illustrated for Kids, and we had a 50 percent jump in documented page views," said Graham Cannon, the spokesman for Time New Media, the division that oversees Pathfinder. "Now from AOL we get more than a million page views a week on Sports Illustrated for Kids, and it's not even a big site compared to People, Money, and Fortune.

"If our recorded audience is growing that high, it tells us that we are missing a lot of them," he added. "We know the reason companies cache is to make it better for their clientele. We don't want to slow down people's access to Web pages either, but we need to know who is accessing our site."

Time Warner and Starwave say they wouldn't have to block caching if companies made better products. For example, they contend that products shouldn't alter their content because of copyright restrictions.

"There are fairly extensive problems with Intel's approach. They're manipulating copyrighted material and displaying it at a reduced resolution," Starwave's Naughton said. "We publish those at a certain resolution for a reason. They can't just change our intellectual property."

Legal experts agree that caching is the sleeping giant of online copyright issues.

"Caching is a per-se copyright violation. Caching requires people to reproduce and distribute third-party content, and those are both copyright violations on the face," said Eric Goldman, an Internet and intellectual property attorney at Cooley Godward.

"If an article is placed in a cache, the publisher loses the ability to control the flow of that information," he said. "Let's say the article is defamatory and the originating source has already fixed the harm and run a correction. It's possible that the caching entity perpetuates the harm because they are distributing an old copy--they could be liable for that harm."

Cache technology companies and Web site producers do agree on one thing, however: Standards are needed to ensure that proxy servers more accurately deliver traffic figures back to publishers.

"We need to put together some sort of advocacy group to deal with caching," Naughton said.

Intel agrees. An investor in Inktomi, the chipmaker has a big stake in the caching arena.

"The reality is that ISPs are going to cache, and people are going to cache," company spokesman Dave Preston said. "The fundamental issue of auditing hit counts is generic to all caching products. A protocol or standard is going to be needed. It's going to take an industrywide effort including the auditors, advertisers, and Web site hosts."

The bottom line is the same for all sides of the caching debate, Preston says: "The whole Internet wins if the Internet is faster. If the Internet is more powerful, there will be more PCs sold because people will want to get on the Internet and access all those Web sites."