More questions than answers on Google Books

Google insists it has the best of intentions following its settlement with book rights holders, but there is a strong undercurrent of distrust in the publishing community.

Tom Krazit Former Staff writer, CNET News
Tom Krazit writes about the ever-expanding world of Google, as the most prominent company on the Internet defends its search juggernaut while expanding into nearly anything it thinks possible. He has previously written about Apple, the traditional PC industry, and chip companies.

BERKELEY, Calif.--Google's Dan Clancy had patiently answered question after question regarding Google's Book Search settlement with publishers and authors until late in the afternoon Friday, when he was finally left speechless.

Louis Trager, a reporter from Washington Internet Daily, asked Clancy what kind of message was sent when Google decided to "copy first and answer questions later." The question--for which there's no safe answer, if you're in Clancy's shoes--perhaps underscored the core of the opposition to the settlement, reached in October, after Google was sued in 2005 for scanning out-of-print works without explicit permission.

Google's Dan Clancy is charged with defending Google's position before opponents of its book search settlement. Tom Krazit/CNET

If the class action settlement is approved, Google stands to gain control of a priceless asset: the right to display the contents of out-of-print books that are still covered by copyright protection. Jason Schultz, acting director of UC Berkeley's Samuelson Law, Technology, and Public Policy Clinic, called it "the largest copyright-licensing deal in U.S. history."

Google, however, has already scanned more than 10 million books. At the moment, it's not allowed to display more than a few snippets of copyright-protected books for which it doesn't have an explicit agreement with the rights holders. If the settlement is approved, Google will suddenly flip a switch and offer full-text searches of those books, as well as links to bookstores.

Nothing vexes Google's opponents more than the fact that the company assumed it had the right to digitize nearly 100 years of written material, and did not negotiate seriously with rights holders until it was sued. Authors have until Friday to decide whether to opt out of the settlement and preserve the right to sue Google on their own for digitizing their books without permission. Those who remain in the class can still tell Google to remove their books from the Book Search archive.

Everyone agrees that a searchable digital library of out-of-print books would be a very valuable asset for the world. As any owner of an e-book reader such as Amazon.com's Kindle will tell you, the way we think about books is changing.

Think about it: libraries offer tons of out-of-print books, so it's not like the collective knowledge of those books is inaccessible. Yet that knowledge exists in millions of hard-bound individual silos.

What if we could make all that knowledge instantly accessible from anywhere in the world? And more importantly, what if researchers had the ability to analyze it?

Amazing gains could be made in fields like linguistics. In dismissing arguments that scale makes a search engine better, Google's Hal Varian told me last month that one area that does seem to increasingly benefit from scale is translation: the more copies of bilingual books that Google has access to, the more it can perfect its translation algorithm.

"The value of the book as data is greater than value of the book itself," said Peter Brantley, director of the Internet Archive and perhaps the most vocal critic of the settlement. And who will control access to a valuable group of books? A for-profit corporation, which, by the way, paid just $125 million for the license to that information. It paid $1.65 billion for YouTube.

Google likes to say that anyone can cut deals with the Book Rights Registry, the nonprofit organization set up after the settlement to handle payments to rights holders, to get similar access to out-of-print yet in-copyright books. The thing is, the number of organizations that can afford to duplicate Google's efforts is limited.

Clancy declined to say how much Google has spent on scanning books, but the Internet Archive spends about $30 for each book scanned. If Google's costs are similar, the more than 10 million books it has already scanned represent roughly $300 million and counting; there are about 23 million books in the WorldCat database. Microsoft folded its book-scanning project once it realized that Google was aggressively going after that market, said Tom Leonard, the head librarian at UC Berkeley, which had been part of a book-scanning partnership with Microsoft.

This is what frustrates Google, to a certain extent: everyone agrees that digital access to books is important, yet no one else is willing or able to provide it. And Google insists that it will be a fair steward of the material: the European Commission has backed Google's efforts, and several university libraries, such as that of the University of Michigan, are also fully on board.

But taking Google at its word requires trust, and trust in corporations is in short supply at this point in American history. It's taken perhaps longer than it should have, but Google is gradually realizing that a fair portion of the public no longer sees it as a cute little Silicon Valley start-up with idealistic stars in its eyes, one that insists "you can make money without doing evil."

Google damaged that trust when it began scanning books without permission, arguing that it was allowed to do so under fair-use laws. Publishers and author groups also harmed that trust when they turned over the key to the castle by bringing the lawsuit as a class action, suddenly making plaintiffs out of millions of authors who did not necessarily appreciate the future value of digital books in 2005, nor authorize the negotiation of the rights to their works.

By the time the reporter caught Clancy off guard, he was understandably drained from a long day spent under hot lights fielding questions, and at least one diatribe, from passionate academics and activists.

The thing is, it's a fair question: Google has the financial resources and collective intelligence to do nearly anything it wants in the world. Where will Google turn its information vacuum next? Will it ask permission first?

Corrected August 30, 10 p.m. with the correct identification of the reporter who posed the question to Clancy.