Google raising newspaper morgues from the dead
Search giant begins a project to bring a searchable archive of old newspapers to the Web, in partnership with publishers.
Updated 2:57 p.m. PDT with Google's commentary about ad revenue sharing and other details. Also, my colleague Rafe Needleman at TechCrunch.
Google is making searchable, digital copies of old newspapers available online through partnerships with their publishers, the company said Monday.
Under the ad-supported effort, Google will digitize millions of pages of news archives, including photos, articles, headlines, and advertisements, Google said.
"Around the globe, we estimate that there are billions of news pages containing every story ever written. And it's our goal to help readers find all of them, from the smallest local weekly paper up to the largest national daily," said product manager Punit Soni in a blog posting about the effort. "The problem is that most of these newspapers are not available online. We want to change that."
The effort is of particular interest to reporters such as myself who've made the jump from print journalism to online. When I started at CNET News a smidgen shy of 10 years ago, I was initially concerned that the online medium was more ephemeral than print.
But as soon as I saw that CNET's search box opened up our archive of work, I realized that online news is actually more permanent in many ways than a newspaper that's almost invariably recycled or thrown away within a day of its publication. Few have the time and money to visit a newspaper's archive of old papers, called the morgue, or flip through back issues in a state library's microfilm collection.
The results of Google's project initially will be available through the Google News Archive site, Soni said. "Over time, as we scan more articles and our index grows, we'll also start blending these archives into our main search results so that when you search Google.com, you'll be searching the full text of these newspapers as well," he said.
Google didn't reveal which publishers are partners except the Quebec Chronicle-Telegraph and two organizations, ProQuest and Heritage Microfilm. However, examples of the service showed pages from The Evening Independent of St. Petersburg, Fla., the St. Petersburg (Fla.) Times, The Tryon (N.C.) News, and the Pittsburgh Post-Gazette.
The project expands on an earlier partnership to digitize content from The New York Times and The Washington Post, Google said.
The profit motive
With Google, it's often hard to tell which projects are designed to contribute revenue directly and which are part of the larger corporate mission "to organize the world's information and make it universally accessible and useful." That mission can sometimes have the effect of making Google's search better, and therefore used more often, and therefore a better business.
The newspaper effort falls into this profit-and-loss gray area. Although the company is supporting it with advertisements, loftier goals were foremost in the mind of Adam Smith, the director of product management who oversees the newspaper effort, Google Book Search and related efforts.
"For us this is about improving the users' experience on the Web," Smith said. "Our objective is to bring all the world's historical newspaper information online in conjunction with our partners."
That's not to say money isn't involved. Google supplies advertisements on the right edge of the page that are based in part on the content in the newspapers, he said.
The majority of the ad revenue goes to the publishers, Smith said. (Update Sept. 12: Apparently I misheard Smith--it's only the majority of revenue, not the vast majority.)
And other revenue models are possible, he said. "There may be pay-per-view in the future, but we don't have anything to announce now," Smith said.
Although the project involves Heritage Microfilm and ProQuest, which both have microfilm archives, Google is doing the actual scanning of the film. The index has millions of articles so far, Smith added.
Currently the system shows only images of the newspaper pages, not the text that's shown through existing news archive partnerships. Those partnerships typically involve newspapers that have already digitized much of their content.
Dozens of publishers are involved in the effort, he said.