January 27, 2022

40 Million Pages: NY Times Reports on the Harvard Law School Library’s “Free the Law” Digitization Project

From The NY Times:

Now, in a digital-age sacrifice intended to serve grand intentions, the Harvard librarians are slicing off the spines of all but the rarest volumes and feeding some 40 million pages through a high-speed scanner. They are taking this once unthinkable step to create a complete, searchable database of American case law that will be offered free on the Internet, allowing instant retrieval of vital records that usually must be paid for.

“Improving access to justice is a priority,” said Martha Minow, dean of Harvard Law School, explaining why Harvard has embarked on the project. “We feel an obligation and an opportunity here to open up our resources to the public.”


While Harvard’s “Free the Law” project cannot put the lone defense lawyer or citizen on an equal footing with a deep-pocketed law firm, legal experts say, it can at least guarantee a floor of essential information. The project will also offer some sophisticated techniques for visualizing relations among cases and searching for themes.

Complete state results will become publicly available this fall for California and New York, and the entire library will be online in 2017, said Daniel Lewis, chief executive and co-founder of Ravel Law, a commercial start-up in California that has teamed up with Harvard Law for the project. The cases will be available at www.ravellaw.com. Ravel is paying millions of dollars to support the scanning. The cases will be accessible in a searchable format and, along with the texts, they will be presented with visual maps developed by the company, which graphically show the evolution through cases of a judicial concept and how each key decision is cited in others.

Read the Complete Article (approx. 1250 words)

See Also: From the Harvard Law School Library Update

We estimate there are 42,000 volumes and 40 million pages to scan. We’re scanning at a rate of 400,000–600,000 pages per week, and thanks to work done in the pilot and proof-of-concept phases, we hit the 10-million-page milestone on September 17. Soon we will also be able to count the number of cases as individual digital objects as they are extracted from the scans. Our external vendor began sending us extracted and processed cases on October 1, and we can’t wait to see what the final number will be. We welcome librarian-tourists, so if you’d like to look at our scanning operation and learn more about it, please contact Steve Chapman.

See Also: Formal News Release From Harvard Law Library

About Gary Price

Gary Price (gprice@mediasourceinc.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006 to 2009 he was Director of Online Information Services at Ask.com, and he is currently a contributing editor at Search Engine Land.