The Fight over the Google of All Libraries: A Wired.com FAQ

By Ryan Singel
April 30, 2009 | 7:34 pm | Categories: Media, Search

The Google Book Search Settlement has been much in the news recently, with the Internet Archive, Philip K. Dick’s heirs, consumer groups and Microsoft registering their objections to the search giant’s agreement with authors and publishers. And now Justice Department anti-trust lawyers are meeting with Google about the settlement, raising the possibility of a full-blown anti-trust court showdown between the government and the world’s biggest search and advertising company.

Google Book Search lets users search and read portions of millions of books.

It’s a complicated story combining copyright law, anti-trust issues and the odd problem of orphan books.

It’s also the story of one company’s attempt to create the largest and most comprehensive library in the history of the world.

Here’s Wired.com’s guide through the thicket of the Google Book Search Settlement.

Google is a search engine, right? What do words printed on dead trees have to do with it?

Google claims its mission is to “organize the world’s information and make it universally accessible and useful.” If that’s your goal, then a library full of books makes you salivate in hunger for the knowledge held inside. So in partnership with major university libraries, Google began scanning and digitizing millions of books in 2002, from ones like Chaucer’s Canterbury Tales that are no longer copyrighted to the Harry Potter series to books whose authors and publishers cannot be located. The idea is simple, and audacious. Make the library of all libraries by converting every book ever published into an e-book that can be indexed, searched, read — and sold — online.

That’s cool! Where can I find this?

Go to Google Book Search, for one. You might also see book snippets in Google’s Web search results.

How many books are in there already?

Google has scanned more than 7 million books as of April 2009.

Can I download or buy old books through Google right now?

Yes and no. Google lets you download any book it has scanned that is not in copyright in the U.S. anymore – books that have fallen into the public domain. For other books, it shows up to 20 percent of the text, and usually includes links to places to buy it online.

What about new books? Are they included?

Most are, but that’s through Google’s Partner project that lets publishers and authors decide how much or how little of their books go into Google’s index, as well as letting them get a portion of the money from ads shown next to their book pages.

How did Google get away with scanning 7 million library books?

Well, there’s no problem with scanning millions of public domain books so long as you have the cash, cool technology and cachet to convince some of the world’s best libraries to work with you. As for in-copyright books, Google says it has the right to scan and index them, and show snippets online, under the Fair Use doctrine, which carves out exceptions to copyright holders’ rights. Being a massive company, mostly loved by users, also helps.

So could I go into the library and legally rip every music CD and video they have, and put snippets of them online, under the Fair Use doctrine?

That’s an interesting question. How good is your lawyer and how high is your bank balance?

Then why did the Authors Guild and the Association of American Publishers sue Google in 2005?

Well, once they saw Google using snippets of the books in search results and making money off it, they decided they deserved some of it. After all, they wrote the books.

Why did Google settle in 2008 if it has the right to do this? Especially since it has to pay $125 million in lawyer fees and past royalties?

Well, the settlement gives Google the legal cover to digitize all books written to date that are still in copyright. For books that are copyrighted but out of print, Google gets to show 20 percent of the book online and sell digital copies of it, keeping 37 percent of the revenue. For books that are in print and copyrighted, Google gets the right to scan the books and use them for research, and can do more with permission.

What about anthologies or photos licensed for use in a book? How does that work?

Well, that’s complicated. That’s partially why the agreement is 334 pages long.

Why should I care about the settlement at all?

Google. Monopoly. World’s Greatest Library. You do like books, don’t you?

Who manages authors’ and publishers’ rights if Google is going to be advertising next to book pages and selling books?

The newly created Book Rights Registry is in charge of finding rights holders, collecting and disbursing payouts, setting prices and negotiating other deals. It’s not unlike the ASCAP system that collects royalties for songwriters, composers and publishers.

What about libraries?

Every public library in the country will get one free subscription for one computer that will let users read and print any page from the full text of all the books in Google’s catalog, excluding books still in print. Beyond that, libraries and institutions can order additional subscriptions. The demand is likely to be high. Very high.

What is an author’s role in all this?

Rights holders can go to the Book Rights Registry’s database and choose whether to let Google include their works, sell them online, and show snippets and ads. They can also opt out and reserve the right to negotiate their own terms or sue Google later if Google includes their works.

How can Google get a monopoly? Can’t the Book Registry negotiate with other entities that want to do the same thing?

Yes, but only for those authors it can speak for – in other words, the known authors of copyrighted books.

Is the opposition to the settlement all about the so-called orphans?

Yes. There are more orphans than in a Dickens novel. Google won’t say how many there are. But UC Berkeley Professor Pamela Samuelson estimates that 70 percent of books that are still in copyright have rights holders that can’t be found.

What’s the problem with orphans?

Copyright infringement can be expensive – up to $150,000 per violation. So if you scan an old book and start selling copies of it, or displaying chunks of it on the web, and the orphan’s father shows up one day waving a paternity test in your direction, you could face a mean copyright infringement suit. Unless you are Google: Since all U.S. book copyright holders are now plaintiffs in the lawsuit, Google gets liability protection from authors who abandoned their books by not registering in its books database. If they show up later, all they can do is collect a little cash, change their book price or ask Google to stop selling the book.

Could Google end up with the most comprehensive online library in the world? Won’t libraries place thousands of subscriptions due to overwhelming demand? And since there’s only one vendor (Google) and the Book Registry will set the price, won’t the price be incredibly high? Or at least climb that way over time?

Bingo.

Why can’t Amazon or Yahoo or Microsoft go to the Book Registry and get an orphans waiver like Google is getting?

The registry can set rates and negotiate contracts for all authors, unless they opt out. But signing away unknown people’s rights to sue? Only a judge in a class-action lawsuit (or Congress) can do that.

If another company wants to digitize, display and use orphan works without the Sword of Damocles hanging over its head, it has to start digitizing without permission, get sued by a reasonable plaintiff and then go through this settlement process again?

Exactly.

That’s ridiculous. Isn’t there a better solution to the orphan works problem?

Yes. For one, Congress could step up and pass a law about orphan works. But the last time Congress passed a substantial law about the length of copyrights it extended them for 20 more years — keeping more and more books from reaching the public domain. Don’t expect much help here.

Is a lot of money at stake?

If you think all the value in digitizing the world’s knowledge will come from selling out-of-print books as e-books for an iPhone, you’re not thinking like Google is. Think of all the subscriptions that universities and colleges and high schools and corporations will need to buy. Think of how search could be improved if you can test your algorithms on a huge digitized swath of the world’s knowledge. Think of the data that could be mined from that index, or how a question-and-answer service resembling artificial intelligence could be created. Google “the Singularity.” Or better, Google Book Search “the Singularity.”

Why is the Justice Department getting involved? Why am I reading that it is investigating?

Remember last fall when the Justice Department was hours from filing an anti-trust lawsuit against Google for its planned ad partnership with Yahoo? That made it very clear that the Justice Department considers Google either to be a monopoly or to be very close to being one – at least as far as search advertising is concerned. So when outside groups wrote the Justice Department with concerns about Book Search, it’s not surprising that lawyers there started sniffing around the settlement. Given that Google admits it has met with DoJ lawyers about the settlement and has to do so again, it’s clear this is more than just a passing interest for DoJ lawyers, who could make big names for themselves in the legal world by taking on the search-and-advertising giant.

When does all this end and I get to start browsing the library of the future and buying out-of-print books?

Authors and publishers have until September 4 to opt out or make their initial choice about what Google can or can’t do with their work. The federal court’s final hearing on the fairness of the settlement comes a month later, on October 7. Then the judge has to rule, which could take months. In the best-case scenario for Google, it will have something resembling the library of the future online sometime in 2010, but given the number of lawyers eying this deal and the potential amount of money at issue, one can be pretty sure the legal battle will drag out far into the 2010s.

Comments (14)

Posted by: imaduck | 05/1/09 | 8:20 pm

As a scientist, Google Book search has been infinitely useful. It is an incredible waste of time for me to have to visit one of the seven libraries on campus - many of them far away from my building - search for the book hidden amongst millions of others and drag it up so I can research one equation from a book written in 1950. Even worse is when the library’s copy of the book is missing. Online journals have revolutionized the field, and online book searching is just following suit.

For more recent books, I can understand the issue on both sides. However, the book industry - just like the music industry - needs to learn that any exposure is good exposure, people looking at 10% of your book probably aren’t going to buy it anyway, and those that do may have solely bought it based on the fact that they were able to preview it online - the equivalent of browsing through a book at the bookstore.

But all you have to say is the magic phrase, “somebody else is making money off of MY work,” and lawyers’ eyes light up. They’re not making money because of your work, they’re making money because they developed incredible search algorithms and amazing technology which can read books quickly. Get off your high horse and let the technological revolution occur peacefully.

Posted by: pepelicious | 05/1/09 | 10:27 pm

Once again, Google is dragging us, some kicking and screaming, into the future. As much as you want to call Google a monopoly, show me any other organization out there that could even dream of competing with something like digitizing and making widely accessible the sum of our civilization’s recorded knowledge.

Posted by: MiracleJones | 05/2/09 | 6:21 am

You should really link to the Fiction Circus’ lengthy and informative podcast with James Grimmelmann who is writing a critical amicus brief about this case for the New York Law School.

http://www.fictioncircus.com/news.php?id=356&mode=one

For instance, you completely left out the fact that Google will be creating a content management system with their scans à la YouTube and will be able to omit books at will for both editorial and non-editorial reasons.

Posted by: slger123 | 05/2/09 | 1:48 pm

Question: When will Google Book Search be fully accessible for all print disabled users?

Currently images show reference information that cannot be accessed as text and spoken by a screen reader. This is a major discrimination for proposal writers, memoirists, reviewers, researchers in general.

Note that U.S. copyright provides exemptions for print disabled persons and organizations such as bookshare.org provide readable text to special ed students as well as all certified print disabled subscribers.

If there’s a way around this limit in Google Book Search, please let me and others know. I recently discovered multiple references to a paper I had written in 1975, but could not access critiques of nor progressions from those results. Further experiences are related in http://asyourworldchanges.wordpress.com/2007/07/14/seeing-through-google-book-search/ The blog search often shows queries such as “reading google book images” that suggest others are hampered in their personal or professional research.

Distributing out of print books in PDF is kind, and useful for some. However a large amount of technical material remains inaccessible, even to its own authors who have lost vision.

Google should work out agreements with services for print disabled people perhaps through transfer of certification, e.g. from bookshare. I’d be satisfied receiving an email of OCR of pages shown in search results.

The digitization of 7 M books should bring advantages, not discrimination, for persons with print disabilities.

Posted by: griffey | 05/2/09 | 8:37 pm

FYI: Point 13 isn’t technically correct. Every _Public_ library in the US will get one “free” terminal. But Academic Libraries (the very libraries where Google got the majority of their books to scan), School libraries, and special libraries (like Law libraries, Medical libraries, etc.) don’t get the same free terminal.

Posted by: OzBeefer | 05/2/09 | 10:55 pm

Just like Microsoft, whenever people see a large company dominating the market because of a fantastic product, they get defensive and start bringing in the lawyers.

The same is now happening with Google, as for a long time they’ve been a LOT more than just a search engine - more power to them!!!

Posted by: mugginsx | 05/3/09 | 5:42 am

I’ve used Google Book search and it is a great step forward. imaduck mentions having to walk to another library on campus to get a book, but for many people it’s more of an issue of going to another city, or for researchers and book writers, possibly another country to see some books.

However, I do have an issue with the orphan works. The status of orphan works is actually a big issue for people in the writing, art and publishing industries, and Congress recently had a new bill dealing with it (the Orphan Works Act of 2008), which so far hasn’t passed, I believe, and I am glad for that. There’s a good article from the NY Times which explains the issue better than I could. But I definitely think the issue of Orphan Works should be worked out soon, and for the mutual benefit of creators and content users.

http://www.nytimes.com/2008/05/20/opinion/20lessig.html?_r=1&scp=1&sq=orphan+works&st=nyt

Posted by: nurse607 | 05/3/09 | 3:06 pm

Question #23 says:

“Why does the Justice Department getting involved?”

I believe “is” should be the operative verb here, not “does.”

Posted by: Scriptable | 05/4/09 | 4:23 am

As someone who loves the tangible nature of a good book, I was hesitant about this. But after continually finding some great search matches in old books I would never have a hope of finding or obtaining in the physical world, and many I had never even heard about, I am now a huge fan. I have gained a much broader and deeper understanding of my favoured subject areas of study.

Having them all in one location is a great time-saver. It’s a pity such a thing can’t be a public service to avoid possible fragmentation due to eventual monopoly issues.

Posted by: eric_j_t | 05/4/09 | 6:15 am

How will this apply to authors from the UK?
A friend of mine, who published a book through a now-defunct publisher, is anxious to have it made available online. How can he get that done?

Posted by: AnitaBartholomew | 05/4/09 | 10:29 am

I’ve written extensively about the Google settlement, including why authors should opt out; why the Register of Copyrights is uneasy about the settlement; how authors can get a better deal if they bypass the settlement altogether and go directly to the Google Books Partner program; and how the Authors Guild has misled about the terms of the settlement. See the details here: http://editorialconsultant.wordpress.com/

Posted by: Thesam | 05/4/09 | 10:54 am

Here’s another question I’ve always had. Is there any legal reason someone can’t download all the Google scans of public domain books and create their own, even more free (if somewhat old-fashioned) Google books?

Posted by: Macnymph | 05/5/09 | 3:02 am

I have two questions about this:
1. I thought the Library of Congress was going to scan in every book ever published? I know they ran out of money at one point, but is this still planned?
2. Do you know if Google also plans to scan in theses?
Replies would help me avoid a Google search.
Thanks!

Posted by: InklingBooks | 05/7/09 | 7:17 pm

You’ve brought some terrific humor to a topic perhaps too dominated by frowns, and you’ve also covered the topic well. But I would suggest that almost any solution the courts or Congress might make is likely to run afoul of international copyright treaties we have signed, even solutions to the problems of orphan texts that are far milder and that don’t favor Google as heavily as the current settlement. If the courts or Congress attempt to solve this problem, they are likely to find that they have stepped into a gooey, stinky global mess.

Why? Because changes to how copyright law is applied in the United States don’t just apply to U.S. citizens. Unless carefully worded to permit exceptions for works by foreign authors, they also apply to citizens in every country that has copyright treaty agreements with the U.S., and that’s almost every one of them. The FAQ that Google has posted attempts to slide past that fact by referring to “U.S. copyright interests”–legal language that may be accurate in a legal context, but for most readers obscures the fact that what the settlement grants Google does not just apply to copyrights that you and I might hold as U.S. citizens. The settlement also applies to treaty-granted copyrights that citizens of other countries acquire when they get a copyright in their own country. That’s one reason I’ve gotten involved in this dispute.

I’m one of the seven authors (or their representatives) whose letter led the court to extend the deadlines in the Google settlement by four months. From my contacts in Europe, it seems that the extension was particularly good news for writers in Europe, who are becoming increasingly aware of what it may mean for them. It also gives more time for writers in this country to study the agreement. As I pointed out above, even the otherwise well-written FAQ page set up by Google doesn’t give a good picture of its full implications.

You can find a copy of our letter to the court requesting a four-month extension, along with other documents in the lawsuit at a webpage for downloads and links that I have set up.

http://inklingbooks.com/googlesettlement/googlesettlement.html

Feel free to post that link elsewhere. I particularly recommend:

* The Brewster Kahle interview on video. In it the founder of Internet Archive does an excellent job of explaining what the settlement means, particularly in relation to anti-trust issues. There’s also a transcript beneath the video. It’d be great if this video could be dubbed into multiple languages.

* The Spiegel article on the growing opposition of German writers to the settlement.

* The Wall Street Journal article by Lynn Chu detailing what the settlement could mean for writers.

* The American Historical Association critique of the value of Google Books for researchers, particularly quality issues.

* The six-month extension request delivered to the court three days after ours and signed by 16 academic writers. One important criticism of the settlement is that those tasked with representing what it calls “the author sub-class” appear to be representing only a limited slice of writers and other copyright holders.

For the moment, I’m doing my best to link to developments in Europe until (hopefully) one or more websites or blogs in Europe take up the task and do it far better than I can. It’d be particularly great to have one that reports European news in English for the U.S. media.

–Michael W. Perry, Seattle