Google Book Search? Try Google Library

Friday's conference over the Google Book Search settlement will focus on privacy, quality, and Google's unique role as a private company operating a public library.

Tom Krazit Former Staff writer, CNET News
Tom Krazit writes about the ever-expanding world of Google, as the most prominent company on the Internet defends its search juggernaut while expanding into nearly anything it thinks possible. He has previously written about Apple, the traditional PC industry, and chip companies.

Is Google ready--or willing--to become a library?

Librarians, academics, and privacy advocates will gather Friday on the campus of the University of California at Berkeley to discuss the implications of Google's proposed settlement with publishers that, if implemented, will allow it to bring millions of books online.

At issue are concerns over privacy, quality, and Google's intent with the project, the only one of its kind in the U.S. to receive the legal authority to scan books that are out of print but under copyright protection--estimated by the Internet Archive to comprise 50 percent to 70 percent of all books published since 1923.

Google's Dan Clancy will have his hands full defending the Google Book Search settlement Friday at a conference. Tom Krazit/CNET News

Almost from the day it was announced, the settlement has drawn scorn and scrutiny from authors, library groups, industry associations like the newly formed Open Book Alliance, and even the Department of Justice. Many are concerned that the settlement gives a private organization the sole right to essentially create and control a public good--a digital library--without spelling out any explicit responsibility to maintain that public good.

And, as UC Berkeley professor Geoffrey Nunberg put it, "this is the last library."

It's going to be extremely difficult for anyone else to create a similar digital library in the future, at least under the current laws. Any other organization that wanted to scan a large percentage of the world's books would likely have to go through the same kind of legal process that Google has followed for four years to gain access to those so-called "orphan works," a weighty expense even before you start counting the exorbitant costs of scanning the books themselves.

There's a sense among several of those planning to speak at Friday's conference that an Internet corporation--even one whose motto is "don't be evil"--does not necessarily share the values and principles that librarians rabidly defend. And left unsaid, but by no means absent, is the growing scrutiny paid this year to Google's dominant position in the Internet search market and how that power squares with Google Books and the publishing industry.

Google's Dan Clancy plans to speak at the event, having been the point man for much of Google's outreach on the settlement. The American Library Association's Angela Maycock gave Clancy credit for listening to the concerns of library groups all year, but he's bound to get an earful Friday.

Big Brother concerns
Expect much of the debate at UC Berkeley to focus on privacy. Public libraries have long been considered anonymous places, where patrons can pursue their interests free from concerns about their browsing being tracked. The Internet, of course, is pretty much the complete opposite environment.

"Is Google going to provide the same kinds of guarantees that users expect, the ability to access books with relative anonymity? The legal document is silent on these concerns," said Michael Zimmer, a professor with the University of Wisconsin at Milwaukee. "I know the people at Google. I trust them, they are good people, but these are serious things."

Tom Leonard, university librarian at UC Berkeley, agrees. "We want users who use public libraries to feel very comfortable that their identities will be protected," he said.

But Leonard also sees the upside to the settlement, assuming all the concerns can be addressed.

"We're pretty excited about the fact that the world has changed, and that we can give access the way readers want it," he said. "They want to make full-text searches of everything we have in the libraries."

Universities do have an alternative in the HathiTrust, a digital library project that counts UC Berkeley and the University of Michigan--also a close partner of Google's--among its partners. That service lacks the scope of what Google is potentially entitled to scan, but it curates the material in a fashion that's better suited to the needs of the academic community.

That's good, because at the moment, Google Book Search is almost laughably unusable for serious research, UC Berkeley's Nunberg said. For example, he pointed out that the Charles Dickens classic "A Tale of Two Cities" is listed in Google Book Search as having been published in 1800; Dickens was born in 1812.

There are still a few kinks in the data attached to Google Book Search. Screenshot by Tom Krazit/CNET

Nunberg plans to speak out on the quality issues with Google Book Search, although he readily concedes that the product was not designed for the needs of academics and scholars. But that only underscores the point: if Google Book Search is the only way to obtain a digital copy of a book 100 years into the future, scholars will have to depend on it for research, he said.

So what comes next? Friday, September 4, is the deadline for authors to decide if they wish to opt out of the settlement. It's also the deadline for interested parties to submit their comments regarding the settlement to the U.S. District Court for the Southern District of New York, which is overseeing the process.

Some groups, such as the Open Book Alliance, which will be represented Friday by Peter Brantley of the Internet Archive, would prefer to scrap the settlement and start over. "Google has a practice of executing innovative ideas far before the implications are visible," said Colin Evans, a "data wizard" at Metaweb and a panelist on Friday.

However, it sounds like most of those in attendance are willing to give Google a shot as the digital librarian of the future so long as it adheres to the rules of the club.

"There's a lot of questions about how they will balance (their) mandate as a for-profit corporation and their mission to provide universal access to information," Maycock said. If it really wants to make the controversy over this settlement go away, Google needs to embrace "the ethical framework that libraries operate under," she said.