Google Books and the Beanstalk

By: Jessica Stephens

Google Books is a literary search engine meant for research purposes. The two-year project was officially launched in 2004. Google made deals with five libraries (the University of Michigan, Harvard University, Stanford University, Oxford University, and the New York Public Library) to build a digitized database of works (9). Google Books was initially seen as the holy grail of literature access and cultural heritage preservation because of its revolutionary idea of a universal library. However, the project soon became a heated controversy, and Google's image went from saint to sinner. Is Google Books the greedy monstrosity the world of academia proclaimed it to be, or rather a technology giant that was taken for a ride? Possible copyright infringement and the alleged threat to libraries are both factors weighed by opponents and supporters of Google Books.

The proposed process of Google Books was simple. Using its own equipment, software, and algorithms, Google digitized collections loaned by libraries, loaded the copies into a non-public database, and returned the original materials. Users of Google Books would be able to locate works containing the keywords they entered, seeing only snippets. The stated purpose was research and reference: users would quickly gain access to the titles of relevant materials and to data for mining. Unfortunately, a class action lawsuit filed by the Authors Guild in 2005, which lasted 10 years, soon followed. The Authors Guild was quick to claim victimhood under copyright law, even though the claim of infringement was repeatedly rejected (10).

Text shown to users is filtered through a process called "blacklisting," and the algorithms that randomize snippets proved resistant to manipulation, leaving at most 16% of a book exposed through search. In cases where a snippet could replace the need for the entire text, such as a haiku or a thesaurus, Google denied users access (10). However, armed with fallacious speculation, the Authors Guild attempted to stretch the dimensions of common sense. It tried, and failed, to convince the court that Google's profit-driven domination of the online search market was inevitable and that Google was indirectly profiting from the digitized works. Copyright law was not created to bar anyone from transforming another's work or profiting from it. Copyright reserves a creator's right to control and gain from their own creative expression. Google is a multi-billion dollar giant, and potential profits and losses will always be a factor in its decisions. However, Google did not directly profit from the project and offered compensation for digitized works under copyright.

Responding to mass interest in the lawsuit, the Department of Justice intervened, voicing its concerns about the destruction Google Books could cause. The DOJ mirrored the speculative objections made by academics, librarians, and, unsurprisingly, Google's competitors, such as Amazon, Yahoo!, and Microsoft. The allegation that the digitized library project would create a monopoly was at the forefront (6). Robert Darnton, director of Harvard's library until 2016, claimed Google Books presented a danger that Google would control digitized literature and could price gouge libraries for subscription services (8). Another concern was that orphan works, out-of-print material whose copyright owner cannot be located, would be exploited by Google alone. Searching for every possible copyright owner of unclaimed texts would be an inefficient use of resources and time.

Digitizing orphan works reintroduces them to readers and researchers, giving life back to books that would otherwise have been left to collect dust. With the exception of unclaimed texts, no exclusivity is granted to Google. While the multi-billion dollar corporation has an advantage over competitors, it does not qualify as a de facto monopoly. Google was the first with the willpower and the means to bring the idea of a universal digital library to life. The financial resources necessary to digitize millions of books from multiple corners of the world can only come from a wealthy body such as Google. Ironically, this same necessary power and innovation fueled the fearful speculation about potential corporate tyranny.

Censorship is still an issue. Aggressive pressure from groups or government bodies to censor texts in the Google Books database, and the possibility that the corporation would comply, pose a familiar threat of oppression (4). In an unrelated case of censorship in 2013, the Canadian Department of Fisheries and Oceans, under government duress, closed seven of its nine libraries. Responses to questions about the whereabouts of materials, some dating back to 1800, were ambiguous and described no coherent tracking process. The Eric Marshall Library was once “one of the finest environmental science and freshwater collections in the world.” However, only 5-6% of materials from all seven libraries were deemed important enough to digitize (11). The threat of censorship is very real. However, fear is not a viable reason to thwart advancement and the potential gains from a database such as Google Books. Digitizing collections protects them from disaster, natural aging, and brute censorship. Examples of such losses include the complete destruction of the University of Alabama Library by Union troops during the Civil War (3), “acid-rich paper [volumes] crumbling into dust” (9), and the fire, caused by a 7.9 magnitude earthquake, that destroyed 700,000 texts at the Tokyo Imperial University Library in 1923 (1). Library collections do not only house opinion, fact, and fiction. They are also time capsules of information and expression, reflecting the thoughts, policies, and turning points of past societies.

Libraries are not only information centers; they also help preserve cultural heritage. A part of Germany’s heritage was forever lost after a fire took 70,000 16th-17th century works at the Anna Amalia Library, founded in 1691 in Weimar, Germany (2). Digitizing collections is time-consuming and expensive. By participating in Google Books, libraries are not only contributing to the accessibility of knowledge, but also acquiring access to a vault of digital copies of their works. However, errors and the valuing of time over quality in the scanning process have led to the degradation of an unknown fraction of the database (7). Google Books currently claims 40 million books (6). At Google's own scanning stations, where each page is turned by a human hand, an average of 1,000 pages can be copied in an hour. That leaves Google 89,864,880 copies short of its ambitious goal to digitize the world’s collection (8). Human error is inevitable in all endeavors. With time understandably being a high priority, quality will unfortunately sometimes waver. Incorrectly scanned volumes can be fixed by the institutions that own the original copies. And because digital databases other than Google Books are available to the public, this setback does not forecast an overall literary catastrophe.
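The figures quoted above can be put side by side with a little arithmetic. The following is only a rough sketch in Python: the book counts and the scanning rate come from the paragraph above, while `ASSUMED_PAGES_PER_BOOK` is a hypothetical value chosen purely for illustration and does not come from any of the cited sources.

```python
# Figures quoted in the essay.
BOOKS_SCANNED = 40_000_000      # books Google Books currently claims (6)
BOOKS_REMAINING = 89_864_880    # copies still short of the goal (8)
PAGES_PER_HOUR = 1_000          # average pages copied per hour

# Hypothetical average book length. This is an assumption for
# illustration only, not a figure from the sources.
ASSUMED_PAGES_PER_BOOK = 300


def implied_goal(scanned: int, remaining: int) -> int:
    """Total number of books the goal implies."""
    return scanned + remaining


def percent_complete(scanned: int, remaining: int) -> float:
    """Share of the implied goal already scanned, in percent."""
    return 100 * scanned / implied_goal(scanned, remaining)


def scanning_hours(books: int, pages_per_book: int, pages_per_hour: int) -> float:
    """Hours of scanning needed for `books` at the given rate."""
    return books * pages_per_book / pages_per_hour


if __name__ == "__main__":
    goal = implied_goal(BOOKS_SCANNED, BOOKS_REMAINING)
    done = percent_complete(BOOKS_SCANNED, BOOKS_REMAINING)
    hours = scanning_hours(BOOKS_REMAINING, ASSUMED_PAGES_PER_BOOK, PAGES_PER_HOUR)
    print(f"Implied goal: {goal:,} books")
    print(f"Scanned so far: {done:.1f}% of the goal")
    print(f"Scanning hours left (assumed book length): {hours:,.0f}")
```

Taken together, the two quoted counts imply a goal of 129,864,880 books, of which a little under a third has been scanned. The hours estimate depends entirely on the assumed book length, so it should be read as an order-of-magnitude illustration rather than a projection.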

Google Books appeared to shut down after the lawsuit ordeal. However, the 2023 collaboration with the National Library of Israel is evidence that the project is still moving forward (6). Despite accusations of copyright infringement and fallacious speculation, the project has retained its fair use status, and rightsholders can control how much of their works is shown or exclude them from the database altogether. The gains outweigh the losses for libraries, academics, and the general public in terms of quality, quantity, accessibility, and equity. Hacking into the database is a hypothetical possibility, but it is unlikely because Google Books is protected by the same security as Google itself (10). Snobbish braggarts, corporate competitors, and stubborn individuals fearful of a Brave New World takeover seem to be the loudest opponents. Nothing is guaranteed, but as countries see the value in digitization and seek ways to bridge the digital divide, Google Books is one of many ideas in a world where complaints outnumber solutions. As the priest James Keller said, “A candle loses nothing by lighting another.”

Works Cited

1. “Burnt books: The British Academy and the restoration of two academic libraries.” British Academy Review, no. 29, January 2017.
2. Grieshaber, Kirsten. “Literary Treasures Lost in Fire at German Library.” New York Times, 2004.
3. Hubbs, G. Ward. “Dissipating the Clouds of Ignorance: The First University of Alabama Library, 1831-1865.” Libraries and Culture, vol. 27, no. 1, 1992.
4. “IFLA Position on the Google Book Settlement.” International Federation of Library Associations and Institutions, August 2009.
5. Lao, Marina. “The Perfect is the Enemy of the Good: The Antitrust Objections to the Google Books Settlement.” American Bar Association, 2012.
6. Marini, Ari. “How the Google Books team moved 90,000 books across a continent.” The Keyword, 2023.
7. Musto, Ronald. “Google Books Mutilates the Printed Past.” The Chronicle of Higher Education, 2009.
8. Somers, James. “Torching the Modern Day Library of Alexandria.” The Atlantic, 2017.
9. “UC libraries partner with Google to digitize books.” University of California, 2006. Press Release.
10. United States Court of Appeals for the Second Circuit. Authors Guild v. Google, Inc. 13-4829-cv, December 3, 2014.
11. Zeffiro, Andrea. “‘A monopoly of knowledge’: The dissolution of the libraries of Fisheries and Oceans Canada.” The Harper Record 2008-2015. Canadian Centre for Policy Alternatives.
