Google raising newspaper morgues from the dead

Search giant begins a project to bring a searchable archive of old newspapers to the Web, in partnership with publishers.

Updated 2:57 p.m. PDT with Google's commentary about ad revenue sharing and other details. Also, my colleague Rafe Needleman covered Google's launch of the newspaper digitization work at TechCrunch.

Google is making searchable, digital copies of old newspapers available online through partnerships with their publishers, the company said Monday.

Under the ad-supported effort, Google will digitize millions of pages of news archives, including photos, articles, headlines, and advertisements, Google said.

Google's newspaper archive search and display effort is supported by ads, visible on the right edge. (Credit: CNET News)

"Around the globe, we estimate that there are billions of news pages containing every story ever written. And it's our goal to help readers find all of them, from the smallest local weekly paper up to the largest national daily," said product manager Punit Soni in a blog posting about the effort. "The problem is that most of these newspapers are not available online. We want to change that."

The effort is of particular interest to reporters such as myself who've made the jump from print journalism to online. When I started at CNET News a smidgen shy of 10 years ago, I was initially concerned that the online medium was more ephemeral than print.

But once I saw that CNET's search box opened up our archive of work, I realized that online news is in many ways more permanent than a newspaper, which is almost invariably recycled or thrown away within a day of its publication. Few have the time and money to visit a newspaper's archive of old papers, called the morgue, or flip through back issues in a state library's microfilm collection.

The results of Google's project initially will be available through the Google News Archive site, Soni said. "Over time, as we scan more articles and our index grows, we'll also start blending these archives into our main search results so that when you search Google.com, you'll be searching the full text of these newspapers as well," he said.

Google didn't reveal which publishers are partners except the Quebec Chronicle-Telegraph and two organizations, ProQuest and Heritage Microfilm. However, examples of the service showed pages from The Evening Independent of St. Petersburg, Fla., the St. Petersburg (Fla.) Times, The Tryon (N.C.) News, and the Pittsburgh Post-Gazette.

The project expands on an earlier partnership to digitize content from The New York Times and The Washington Post, Google said.

Google has tangled with news agencies before over who has rights to content. It settled a lawsuit with Agence France-Presse in 2007 and averted a similar suit from the Associated Press in 2006.

The profit motive
With Google, it's often hard to tell which projects are designed to contribute revenue directly and which are part of the larger corporate mission "to organize the world's information and make it universally accessible and useful." Pursuing that mission can sometimes make Google's search better, and therefore more widely used, and therefore a better business.


The newspaper effort falls into this profit-and-loss gray area. Although the company is supporting it with advertisements, loftier goals were foremost in the mind of Adam Smith, the director of product management who oversees the newspaper effort, Google Book Search and related efforts.

"For us this is about improving the users' experience on the Web," Smith said. "Our objective is to bring all the world's historical newspaper information online in conjunction with our partners."

That's not to say money isn't involved. Google supplies advertisements on the right edge of the page that are based in part on the content in the newspapers, he said.

The majority of the ad revenue goes to the publishers, Smith said. (Update Sept. 12: Apparently I misheard Smith--it's only the majority of revenue, not the vast majority.)

And other revenue models are possible, he said. "There may be pay-per-view in the future, but we don't have anything to announce now," Smith said.

Although the project involves Heritage Microfilm and ProQuest, which both have microfilm archives, Google is doing the actual scanning of the film. The index has millions of articles so far, he added.

Currently the system shows only images of the newspaper pages. It does not show article text, as Google's existing news archive partnerships do; those typically involve newspapers that have already digitized much of their content.

Dozens of publishers are involved in the effort, he said.

About the author

Stephen Shankland has been a reporter at CNET since 1998 and covers browsers, Web development, digital photography and new technology. In the past he has been CNET's beat reporter for Google, Yahoo, Linux, open-source software, servers and supercomputers. He has a soft spot in his heart for standards groups and I/O interfaces.

 
