Updated 10:57 a.m. PDT with comment from MLB.com.
When is 2 million people too small a group to get a good feel for the latest trends?
When you're trying to perform the deceptively difficult calculation of just how many people visited a Web site.
ComScore, by some accounts the leading analysis firm that supplies publishers and advertisers with figures about how many people visit various Web sites, announced Tuesday a switch to a new method geared to provide more accurate Web traffic numbers. Previously the company had judged traffic on the behavior of a panel of 2 million persons scattered across the globe, but the new service, Media Metrix 360, combines that data with statistics taken directly from the Web servers that can tally visitor totals.
The hybrid approach is a big departure for ComScore, which for years has staunchly defended the panel approach. ComScore Chief Executive Magid Abraham expects the move will put to rest most complaints by publishers that ComScore's statistics dramatically underestimated real popularity.
"There is a big gap that we will rectify," Abraham said. "My personal belief and hope is it will address 90 percent-plus of the issues...Unless there is a widely different method for measurement that a publisher is using on their own, we should not really see differences anymore."
The hybrid approach still is fundamentally panel-based, but uses server data from participating publishers to inform the totals. ComScore expects it will better estimate traffic from sources that panels miss today: mobile phones, larger companies, cybercafes in Asia and Latin America, and public terminals at schools and libraries, Abraham said.
Measuring traffic is a long-running point of contention between publishers and ComScore; MySpace and Major League Baseball's MLB.com are among those who've objected in the past to panel-based numbers.
Torstar Digital, a division of the parent company that owns the Toronto Star, is one company that's had trouble reconciling its newspaper sites' page-view counts with ComScore's, said President Tomer Strolight.
"We regularly saw discrepancies between ComScore page views and unique visitors of 3 to 1 or greater when compared with our server-based tools on those sites," Strolight said. ComScore's panel-based approach posed problems when it came to extrapolating statistics from people at work and older audiences, he said, but the new service helps.
"The discrepancy, and that ratio in particular, are not present in all sites by any means, but it does happen to affect my largest site and it is therefore very important to me that we resolve that issue," Strolight said. "ComScore's new methodology can do this."
MLB.com CEO Bob Bowman gives credit to ComScore for moving beyond just panels, but he's still skeptical, in part because of what panels miss from work users. Instead, he'd prefer to share his server log files after they'd been vetted by a neutral party.
"For our site, 70 percent of day traffic comes during working hours. Missing that is comparable to saying we built a beautiful boat--it just doesn't have a bottom yet...I reject completely that panels can ever work when it comes to what people do at work," Bowman said. "I don't know why for top-250 sites, such as MLB.com, our files just can't be audited, and (the auditors) say yup, here's the traffic."
So why does the number matter beyond bragging rights in a press release?
Money, of course. Advertisers want to know if they're showing an ad to the same person multiple times or to different people. And of course Web publishers want to know how many people really do use their site.
"A media market develops on the basis of trusted information for all the participants. The sellers and buyers have to agree the numbers are something they can live with," Abraham said.
Not as easy as 1-2-3
One might think it a simple matter to measure how much traffic a Web site gets. Just keep a log of the Internet addresses of visitors, or perhaps deliver the "cookie" text file to their browser for easier identification of repeat visitors, right?
Wrong. There's often a discrepancy between independent panel-based statistics from companies such as ComScore or competitor Nielsen Online on the one hand and server-based statistics from a Web publisher's internal logs or third-party services such as Google Analytics or Omniture on the other. Here are some factors that can inflate Web site visitor statistics based on server logs:
The same person might visit the same site from work, home, and increasingly, from a mobile phone. That's not a problem when counting total traffic to a site, but it is when trying to tally unique users.
Sometimes people delete cookies either manually or automatically through antispyware software, meaning that a cookie might be delivered to a person who seems to be a new user but who in fact has visited a site before.
Someone might visit the same site with multiple Web browsers or open a tab in a browser without actually making it active.
Computers rather than people, such as search engine indexers, can visit Web sites and be counted as visitors.
These issues are diminished when users must log into a site, making it easier to track individual use, but the panel approach attempts to address the issue more broadly, consistently, and independently. Panel-based information also answers a question that an individual site cannot: how often is a particular person exposed to the same ad while browsing multiple sites on the Web?
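To make the inflation concrete, here is a minimal sketch in Python of how a cookie-based tally can drift away from a count of actual people. The visit log, the names, and the cookie IDs are all invented for illustration; a real server log would not contain the "person" column, which is exactly the problem.

```python
# Hypothetical visit log: (person, cookie_id, is_bot).
# Every value here is made up; a real server only ever sees the cookie.
visits = [
    ("alice", "c1", False),  # home computer
    ("alice", "c2", False),  # work computer
    ("alice", "c3", False),  # mobile phone
    ("bob",   "c4", False),
    ("bob",   "c5", False),  # same browser after cookies were deleted
    ("carol", "c6", False),
    (None,    "c7", True),   # search engine indexer
]

def unique_cookies(log):
    """Count distinct cookies, as a cookie-based server tally would."""
    return len({cookie for _, cookie, _ in log})

def unique_people(log):
    """Count distinct humans, ignoring automated visitors."""
    return len({person for person, _, is_bot in log if not is_bot})

cookie_count = unique_cookies(visits)
people_count = unique_people(visits)
print(cookie_count, people_count)
```

In this toy log the cookie tally is more than double the number of people, which is the kind of gap publishers and measurement firms end up arguing over.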
"Ultimately, we need to report unique people, not unique machines, unique cookies, or unique browsers," Abraham said. "There is a lot of energy that goes on trying to reconcile the numbers and trying to explain to people the ins and outs and the subtleties of why this number is not that number."
However, panel-based measurements have their own shortcomings, in part because they rely on software installed on users' machines. Thus the difficulties with mobile phones, businesses, cybercafes, libraries, and schools, Abraham said.
Cybercafes are widely used in Asia and Latin America, he said. Mobile usage for typical sites accounts for less than 1 percent of traffic today, but it's much larger--potentially more than 20 percent--for sites that appeal to mobile users such as those handling weather, stock quotes, breaking news, sports scores, local information, and social networking. And today, ComScore largely just estimates traffic from big-business users.
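A rough sketch of what estimating by proxy can look like: the visit rate observed among panelists at medium-sized businesses is applied to the large-business population, where few panelists exist. The function name and every figure below are assumptions chosen for illustration, not ComScore's actual method or data.

```python
def project_visitors(panel_visitors, panel_size, population):
    """Scale a panel's visitor count up to the population it stands in for."""
    return panel_visitors * population / panel_size

# Hypothetical panel of 1,000 people at medium-sized businesses,
# 50 of whom visited the site during the period.
medium = project_visitors(panel_visitors=50, panel_size=1000, population=200_000)

# No usable large-business panel, so the medium-business rate is borrowed.
large_by_proxy = project_visitors(panel_visitors=50, panel_size=1000, population=400_000)

# If large-business workers actually visit at a different rate (assumed here
# to be 8 percent), the proxy estimate is off by the difference.
large_actual = 0.08 * 400_000
shortfall = large_actual - large_by_proxy
print(medium, large_by_proxy, shortfall)
```

The proxy is only as good as the assumption that the two groups behave alike, which is where server data from publishers could help check the estimate.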
"It's really difficult to recruit users to participate in panels in large corporations," Abraham said. "Large businesses are in essence voted for by the medium-sized businesses, by proxy. Sometime that works, sometimes that doesn't. That's one area of improvement this (Media Metrix 360) will create."
The company has begun offering panel software to some mobile users but doesn't yet publish resulting data. "We do have that developed for a number of smartphone platforms such as Windows, Palm, and (BlackBerry maker) Research In Motion. We are working on solutions for iPhone and Android," but it's difficult to deal with the plethora of models in the market, he said.
ComScore recruits panel members by offering them free software such as games and screensavers and through incentives including sweepstakes and, more recently, an offer to plant trees in third-world countries. The panel size of 2 million people spans 170 countries, enough for global estimates and for specific measurements in 40 countries. The company also uses technology that can distinguish different users on the same machine by identifying signature patterns in mouse and keyboard use, an important factor for shared computers.
The switch to the hybrid methodology will be gradual. Publishers must add a transparent pixel to their Web sites that ComScore uses to track visitors, Abraham said, and participating sites undergo a 60-day "incubation period" to make sure data collection is working and nobody is gaming the system, he added.
"We think we'll get widespread support. There is widespread hunger for this," Abraham said. "From many people we've heard, 'What took you so long?' It's a fair question. The answer to that is it's not as easy as it first appears."