Bernardo Huberman, director of Hewlett-Packard's Social Computing lab, and fellow researcher Gabor Szabo have published a highly detailed report (PDF) on "predicting the popularity of online content." Focusing on content submitted to the social sites Digg.com and Google's YouTube, the two concocted not one but three ways to predict how much traffic and overall user interaction a story or submitted video will receive well after its initial burst of popularity.
To do this, the pair kept an eye on 7,146 videos from YouTube's recently added section, and on every digg from registered Digg users between July 1, 2007, and December 18, 2007. From this data, they found that stories on Digg got more votes and views during peak traffic hours than those submitted at night and on weekends (duh), and that YouTube videos tended to keep gaining views a month after being submitted--and in many cases well beyond the initial 30-day evaluation.
Digging a little deeper into this data, they were able to figure out which time of day story submissions on Digg had the best chance of getting attention, right down to the hour. The data also showed how many diggs a story would get after being promoted to the front page, depending on both what time that story hit and when it was originally submitted. The lesson: submit, and hit the front page, early.
The prediction models, which you'll have no problem understanding if you paid attention in your grad school numerical analysis class, outline three different ways to guess any one submission's popularity. All three depend on any number of variables, as dictated by Huberman's research, including what time of day you're submitting compared with how many others are submitting at the same time.
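For a feel of what this kind of prediction can look like, here is a minimal sketch of one generic approach: fit a straight line between the logarithm of a submission's early count and the logarithm of its later count, then extrapolate for a new submission. This is an illustration only, not necessarily any of the paper's three models; the function names and the toy numbers are made up for the example.

```python
import math


def fit_log_linear(early_counts, final_counts):
    """Least-squares fit of ln(final) = a + b * ln(early)."""
    xs = [math.log(c) for c in early_counts]
    ys = [math.log(c) for c in final_counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx              # slope on the log-log scale
    a = mean_y - b * mean_x    # intercept on the log-log scale
    return a, b


def predict_final(early_count, a, b):
    """Extrapolate a later count from an early count using the fitted line."""
    return math.exp(a + b * math.log(early_count))


# Hypothetical toy data (NOT from the paper): counts after the first few
# hours, and counts after 30 days, for four past submissions.
early = [10, 40, 160, 640]
final = [30, 120, 480, 1920]

a, b = fit_log_linear(early, final)
estimate = predict_final(100, a, b)
print(round(estimate))
```

In the toy data every later count is exactly three times the early one, so the fitted slope comes out at 1 and a submission with 100 early votes is projected to reach about 300. Real data would be noisier, and the fit would be done separately for each early/late time pair.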
One thing that makes the Digg side of the research slightly outdated is the somewhat recent introduction of the recommendation engine. Digg has been quite vocal about the success of its engine, both in terms of additional traffic and higher user interaction levels.
Also, at the time of the survey Digg was just two weeks out from a redesign that put more emphasis on friends' activity--a precursor to the mid-September overhaul of user profiles, which made the site resemble a social network. Neither of these things changed Digg's overall method of having popular stories roll off the front page in a matter of hours, which has stayed the same for the lifetime of the site, but they're worth noting nonetheless.
I've embedded the paper after the jump. You can also track some of HP Labs' other projects on this page.