This is the story of a tragedy: a step forward for the academic community that was resisted and then blocked. It’s difficult to imagine, but Along Came Google (Princeton, 2021) is as gripping as any thriller – except for the conclusion.

The story is simple. With the agreement of some libraries and publishers, Google proposed, and began, to create digitised versions of the world’s printed content, both out-of-print and in-copyright. The attempt came to nothing, blocked by a series of lawsuits and complaints from authors’ associations and library associations. A proposed settlement was accepted by the various parties but not by the courts. As Deanna Marcum and Roger Schonfeld relate, quoting Atlantic journalist James Somers:

“When the most significant humanities project of our time was dismantled in court, the scholars, archivists and librarians who’d had a hand in its undoing breathed a sigh of relief, for they believed, at the time, that they had narrowly avoided disaster.”

How the scholarly community (or a good part of it) acted together to prevent something in their own best interests is described in this elegantly written account.

Why was the Google settlement not accepted?

Several reasons are given in the book. They include:

  • Suspicion of Google holding copies of content in copyright.
  • Neither public nor academic libraries saw themselves as having a national rather than a local role.
  • Complaints from Amazon (!) about one company having a monopoly of digital books.

Most amazingly, even after the failure of the Google settlement, academic libraries resisted initiatives to create a common preservation archive. From the ominously titled “The Dark History of HathiTrust” by Alissa Centivany:

“From the librarians’ perspective, their hesitant, slow-moving, detail-oriented decision making process reflected a culture of collectivism and egalitarianism that was integral to the identity of librarians and which libraries had thrived upon for centuries. The approach reflected a sense of the gravity of their professional responsibility and respect for the status of libraries and librarians in society as the trusted stewards of our shared cultural record.”

That quotation left me speechless.

What is the situation today?

“Rather than a universal digital library, we have a potpourri of digital collections, with greater or lesser access” [page 188]

Academic libraries continue to provide local collections for private audiences (their students and sometimes their alumni). There is vast duplication in collections. It is not uncommon for Oxford and Cambridge to have several duplicate copies of the same title, one copy per college.

Out-of-print books are available via the Internet Archive and HathiTrust. Very recent digital books are available mainly through Amazon – at a high price.

Publishers still have very poor records of their own intellectual property. Commercial publishers are interested in this year’s revenue, not their backlist, and certainly not their out-of-print content.

Copyright works that are out of print are in a kind of limbo. Publishers and libraries cannot afford to digitise them. Many of them are “orphan works” where the copyright status is uncertain. Although Google has digitised them, nobody has access to the full text.

There is no comprehensive collection of all open-access full-text book content with consistent metadata. There is not even a complete free-to-access collection of all OA full-text content.

When we look back to the late 19th century, the struggle for public lending right and the dream of the (print) public library had been realized; it was a battle that had been won. But today, in the UK, that victory seems to be steadily unravelling. Public libraries are closing. Every major city in the UK seems to have a redundant Carnegie Library building.

HathiTrust now includes over 100 libraries. Nonetheless, it appears to be more focused on preservation than on dissemination.

One of the few successes described in the book is The Internet Archive, home of the Internet Library. They discussed collaboration with Google, but in the end went with their own digitisation system. I have yet to find a pre-1950 print book title that they haven’t got in their collection. And for books in copyright, they offer a simple lending service.

What of the future?

I don’t see much sign of any change in the situation. Marcum and Schonfeld make a few suggestions, but without much enthusiasm. There is a mention of commercial attempts to create wide-ranging collections, for example by ProQuest, but providing free public access to content has never been high on such companies’ agenda. The Library of Congress has never seen digitisation of its book content as a strategic objective.

The authors try to end on an upbeat tone, but I felt that their resigned view was already painfully apparent earlier in the book:

“In the final analysis, a robust, national digital library of the kind with which Google enticed us will be possible only when publishers and libraries and archives recognize that they are part of the same ecosystem and the survival of each of them is inextricably tied to the others” [page 159]

It’s a very sad end to an exciting story.