Google’s book effort stuffs the copyright act

February 8, 2009

Google will bring its e-books to mobiles. You will pay. The idea is that Google will make its vast library available to cell phones. American newspapers, which have been taught to sit up and beg when Google announces almost anything, plainly have no understanding of the reality of the case or of why, in most cases, the announcement is total nonsense.

The company said, “We are excited to announce the launch of a mobile version of Google Book Search, opening up over 1.5 million mobile public domain books in the US (and over half a million outside the US) for you to browse.” That is PR flackery from Flackery 101.

The reality is very different.

Initially, the books are scanned in at very high speed with no thought as to accuracy or to placement. In many cases they are totally unreadable. Scanning a book goes through two stages.

First the book is scanned.

Then the book is proofread.

In the initial stages Google did not proofread any of its books. Any. Ever. In any way.

One title checked in this office a few months ago had 1,313 errors in it. If a book publishing company put out a title with that many errors it would be laughed out of court.

Compare with Gutenberg, which has an army, a positive army, of proof readers to see that the books are correct and readable.

Then Google, harassed by American publishers, issued a Google Book Search Settlement Agreement which totally trampled over the Berne and Geneva copyright conventions. It was written in a type of legal flackese which suggested, falsely, the concept of Evergreen Copyright. You scan in an out-of-copyright book and Google cannot display it in its whole form.

Google did this because the publishers went ‘boo’. The publishers had, in the main, done the square root of sod all. But Google is gutless, and someone had worked out the cost of proofreading and wanted to pass it on to the eventual user: you, the mug punter.

Google, in a post on Thursday on the Google Book Search blog, said mobile versions of the books could be read on devices such as the Apple iPhone or T-Mobile G1, which is powered by Google’s Android software. It made no mention of the cost of reading out-of-copyright books.

As far as can be ascertained Google has decided to rewrite the copyright law to get around some of its proof-reading problems.

Take The Yangtze Valley and Beyond by the amazing Victorian adventurer Isabella Bird. Earnshaw Books publishes it but does not claim copyright. It cannot do that. The book was originally published in 1899.

Google thinks it can. Google, on its own behalf, has in effect rewritten the Copyright Law so that any book republished acquires a new copyright from the new date of publication. Congress knows nothing of this. Nor does the Copyright Office in London.

Google said, ‘These new mobile editions are optimized to be read on a small screen. With this launch, we believe that we’ve taken an important step toward more universal access to books.’

Stuff and nonsense. Rubbish. Knickers and bum and worse swear words I can think of.

Yes, I have books scanned into my mobile phones. Yes, I read them. But FIRST, they are proof-read. Properly proof-read so that they are intelligible, useful, add to the sum of knowledge. And I pay for the books if they are in copyright.

The wunderkind at Apple know not of proof-reading. (There is an allegation that none of them can read but this should be ignored.)

To proof read a book to any sort of acceptable level costs — to use a Fermi figure which will do for this calculation — $500.

So according to the Google flacks 2 million books are to be hurled at the unsuspecting public. To get them into properly readable shape would cost a billion dollars.
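For anyone who wants to check the sum, the back-of-envelope arithmetic can be sketched in a few lines of Python. The function name is invented purely for illustration; the $500 per book and the 2 million books are the rough figures quoted above, not measured costs.

```python
# Fermi estimate of the proofreading bill.
# Both inputs are the article's rough figures, not audited numbers.
COST_PER_BOOK_USD = 500       # Fermi figure for proofreading one book
BOOKS_ANNOUNCED = 2_000_000   # 1.5 million in the US plus over half a million outside


def proofreading_cost(books: int, cost_per_book: int = COST_PER_BOOK_USD) -> int:
    """Return the total proofreading cost in US dollars (hypothetical helper)."""
    return books * cost_per_book


total = proofreading_cost(BOOKS_ANNOUNCED)
print(f"${total:,}")  # $1,000,000,000
```

Change the per-book figure to taste; even at a tenth of the price the bill is still $100 million.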

The reason why Microsoft stopped scanning in the vast library of pictures it had acquired and deposited them in a damp-proof Canadian mine is it found it cost, roughly, $50 a picture to scan them properly. The costs are nigh on unbearable.

If it were possible, and it is not, every book in the English language about China would be on the Internet under the colophon of Earnshaw Books. It has not happened simply because to make them readable, let alone researchable, they need to be proofread.

What SHOULD happen now is the smug, self-satisfied sods at Google should look at Gutenberg and see how books should be prepared for portable readers.

Google’s announcement comes just days ahead of the expected unveiling by Amazon of a new generation version of its electronic book reader, the Kindle, at a New York press conference on Monday. (The PR people have put the word ‘popular’ before electronic but refuse to give sales figures. One wonders why?)

Drew Herdener, an Amazon spokesman, told the Times, ‘We are excited to make Kindle books available on a range of mobile phones. We are working on that now.’

What work is required? If they are proof-read well enough for Kindle (a name that always brings to mind the Nazi book burning), dropping them over to a mobile phone is a matter of moments' work.

The Amazon spokesman did not provide any further details. Possibly because there were none. But a fair bet is that we will be charged. And Google and Amazon will say, smugly, ‘Look how wonderful we are bringing out-of-copyright literature to the masses for only quite a small charge.’

Bah! Humbug!

What follows need not be read but is included for historical interest. I wrote to David Pogue of the New York Times Nov. 8, 2008. What follows is an edited version:

Google Books cannot continue as it is set up. It is not the legal side. It is the cost and the usefulness of the end product.

Take An Historical, Geographical, and Philosophical View of the Chinese Empire by William Winterbotham. I know this is not an easy book to handle, having been first published in 1795. The .pdf scan by Google itself is not that bad, although fingers appear where they have been holding down pages, the map is a bad joke, and in places the text runs diagonally across the page. OK, it is not great scanning but you can work around it.

When you scan at high speed you get these problems.

It is when Google turns the scanned text into ordinary text using OCR that farce becomes tragedy.

There are just over 1,200 mistakes in this book.

I am not talking about the spelling style of the time — vaft for vast — but error on error on error.

I attach a small part of the index to show the style.

To proofread such a book in any country is going to cost $500 and up. But unless some sort of correction is done the OCR version is effectively useless.

Google says it is scanning more than 3,000 books a day and it already has over one million books which are in the public domain. To make those useful as text will cost half a billion dollars.

Yet if this is not done, scanning the books with a high-speed camera means that some pages are missed or scanned so badly that they are unreadable.

Although, I think, this is generally understood, what has not been made clear is that even when Google is dealing with an out-of-copyright book it may still be producing an unreadable mess. Possibly, but not probably, usable in .pdf. A bad joke where it has been OCR’d.



5 Responses to “Google’s book effort stuffs the copyright act”

  1. billmil:

    I’ve been surprised by the dearth of articles and blog entries questioning google’s ‘mobile e-book initiative’ and praising gutenberg.

    I’m a happy user of gutenberg and think it’s really helped the world by making the public-domain public.

    How does google add more value than gutenberg when it comes to public domain texts?

    I don’t get it.

  2. DannyGlover:

    It’s best not to comment on things you have no clue about.

  3. Jerome Garchik,Attorney:

    I am a SF Civil Rights attorney. Mr. Powell’s comments are a unique take on the google book effort. Google gets current books to its data base directly now in e form. The google book settlement, see http://www.googlebooksettlement.com, is immensely complex. See the 70 Q&As and 6 page claim form, and Dr. Darnton (Harvard) article in 2/12/09 NY Review of Books. The libraries are very interested in this, of course, but so far this is the only piece I’ve seen to date that exposes technical quality issues in google’s omnibus scan efforts, J. Garchik

  4. Gareth Powell:

    It is a good question and deserves an answer. The Gutenberg initiative is almost totally an amateur effort. The important part is that it proof reads its texts and makes no let or hindrance on their use. It makes no suggestion as to copyright.
    Apple started in that way in that it scanned in out-of-copyright texts but then found that scanning could be speedy and inexpensive. Proof-reading is not. So does Google add more value? Only in the sense that it scanned a lot more texts. Not in the sense that it proof read them to make them usable. It did not do that.
    It was thought that Google would bring its strength to both the scanning AND the proof-reading. It soon found out that proof-reading is slow and tedious and costs serious money.
    So it made a pact with the devil.
    Where a publisher had taken an out-of-copyright book and scanned it and proof-read it Google would consider it in a new form of copyright where you paid to have access to the new scan.
    So, you are right. Google has not added more value.
    I do not know who is meant in the phrase: ‘It’s best not to comment on things you have no clue about.’ If it is directed at me it is a total nonsense. If it is directed at billmil it is misdirected. He was not making a comment. He was asking a question.

  5. DannyGlover:

    It was directed at you.

    Your entire pedestrian rant is nonsensical drivel bordering on the unintelligible. Not only is it completely inaccurate with respect to Google’s scanning/proof reading (as you call it) process, but you seem to have an inordinate amount of anger towards this project. It has served me very well on several occasions. If you don’t like it, just don’t use it. I don’t see a gun put to your head. Stop blathering like a pissed off mental patient.


Copyright © 2012 Blorge.com NS