Web-page speed guru Steve Souders, putting to use the latest in a string of useful tools he's created, has found that the top 17,000 Web sites have eased off use of Adobe Systems' Flash Player in the last half year.
Specifically, Souders has started showing data collected by his HTTP Archive project, which logs a wide range of statistics about a collection of 17,000 top Web sites. He began logging data last year but only announced the HTTP Archive at the end of March.
The site lets people compare statistics about how Web sites are built at two points in time. One figure that's interesting, given Apple's high-profile attempt to wean the browser world from its reliance on Flash, is a drop of 2 percentage points in Flash usage, from 49 percent on November 15, 2010, to 47 percent on March 29.
That's not a huge fraction, but it is probably notable given that it took place over only four and a half months. I'll be keeping an eye out to see if a trend emerges, but I'm hesitant to be too conclusive at this stage; for example, Flash usage actually increased to 50 percent for the December 16 HTTP Archive data.
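For readers who want to check the arithmetic, here is a minimal Python sketch of the two ways to read the move from 49 percent to 47 percent: as a change in percentage points and as a change relative to the starting share. The function name `share_change` is made up for illustration and is not part of the HTTP Archive.

```python
def share_change(before_pct, after_pct):
    """Return (percentage-point change, relative change in percent)."""
    points = after_pct - before_pct
    relative = points / before_pct * 100
    return points, relative


# Flash usage figures quoted above: 49 percent in November, 47 percent in March.
points, relative = share_change(49, 47)
print(f"{points:+d} percentage points, {relative:+.1f}% relative")
```

Measured against the November starting point, the decline works out to roughly 4 percent, which is still modest.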
The archive is fun to poke around, but it's also a handy tool for engineers seeking real data about the Web. Souders hopes it'll be useful for improving Web page performance, which is a very big deal.
That's because people on the Web abandon sites that are slower to respond and spend more time with those that are snappy.
Google, which makes money when people spend more time on the Web, is working to improve performance not only of its own sites but of the Web overall. It's got tools for measuring page speed, the Chrome browser, and technologies that it's trying to promote to speed things up.
Successful societies and institutions recognize the need to record their history - this provides a way to review the past, find explanations for current behavior, and spot emerging trends. In 1996 Brewster Kahle realized the cultural significance of the Internet and the need to record its history. As a result he founded the Internet Archive which collects and permanently stores the Web's digitized content.
In addition to the content of web pages, it's important to record how this digitized content is constructed and served. The HTTP Archive provides this record. It is a permanent repository of web performance information such as size of pages, failed requests, and technologies utilized. This performance information allows us to see trends in how the Web is built and provides a common data set from which to conduct web performance research.
The 17,000 Web sites are a combination of several collections including the top 10,000 lists of Alexa and Quantcast and the Fortune 500. They're scoured by a computer set to look like Internet Explorer 8 using a DSL connection in Dulles, Va.
Peter-Paul Koch, a consultant and close watcher of Web site practices, lauded the HTTP Archive as useful in a blog post this week.
"Once it's been gathering data for a year we'll have a fascinating insight into what works, what doesn't, and what clueless Web developers do," he said.
And who doesn't want more data? For example, Koch focuses heavily on mobile phone use of the Web right now. It "would be interesting to see what happens when we change the UA [user-agent identification] string to a mobile browser," he said.
Among other changes from November to March:
The average size of images across the collection increased from 415KB per page to 450KB per page; the average size of scripts increased from 113KB to 123KB; and the average size of Flash content dropped from 90KB to 84KB.
Among image formats used, JPEG was level, accounting for 43 percent of the images. PNG rose from 16 percent of graphics to 18 percent, and GIF dropped from 41 percent to 38 percent.
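To put the per-page size figures on a common footing, the short Python sketch below converts each before-and-after pair into a relative change. The numbers are the ones quoted above; the dictionary layout and the `pct_change` helper are just one illustrative way to organize them.

```python
# Average per-page sizes in KB: (November 15, 2010, March 29), as quoted above.
sizes_kb = {
    "images": (415, 450),
    "scripts": (113, 123),
    "flash": (90, 84),
}


def pct_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100


changes = {
    name: round(pct_change(before, after), 1)
    for name, (before, after) in sizes_kb.items()
}

for name, change in changes.items():
    print(f"{name}: {change:+.1f}%")
```

By this measure, images and scripts each grew by a bit more than 8 percent, while Flash content shrank by a bit less than 7 percent.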