May 22, 2022

New: HathiTrust Research Center Semi-Annual Report (October 1, 2011 – March 31, 2012)

Direct to Complete Report

From the Introduction:

The HathiTrust Research Center (HTRC) had a productive six months as it works out core issues in Phase I of its development effort. In terms of milestones, we are looking forward to and planning for a public demonstration of functionality, tentatively scheduled for June 2012 in accordance with the MOU between HathiTrust and HTRC. Phase II, in which HTRC is operational, is scheduled to begin on 01 Jan 2013.

In a striking accomplishment, HTRC is delighted to report that three legal agreements guiding the Center have been completed at the University level. The MOU between HathiTrust and the HathiTrust Research Center has been signed at IU and UIUC and is now with the University of Michigan. The MOU between IU and UIUC has been fully executed. As for the Google Agreement, UIUC and IU have each separately entered into an agreement with Google on the same terms. These agreements have been signed at the University level and are with Google.

The remainder of the report includes discussions about:

  • Technical Accomplishments (API, Sandbox, OCR Detection Study, etc.)
  • Outreach (Google Digital Humanities Awards Recipient Interview Report, HTRC Web Site, etc.)

John Unsworth commissioned a study of recipients of the Google Digital Humanities Award over the period 2010 – 2011. The study, Google Digital Humanities Awards Recipient Interview Report, interviewed recipients of the Google awards to determine what difficulties they encountered when working with the Google corpus. Recurring themes were weak metadata and poor OCR.

  • Initiatives
  • Governance

See Also: HathiTrust Research Center: Deliverables and Timeline

See Also: HTRC Facts and Links

The HTRC is a collaborative research center launched jointly by Indiana University and the University of Illinois, along with the HathiTrust Digital Library. It aims to help researchers meet the technical challenges of working with massive amounts of digital text by developing cutting-edge software tools and cyberinfrastructure that enable advanced computational access to the growing digital record of human knowledge.


About Gary Price

Gary Price is a librarian, writer, consultant, and frequent conference speaker based in the Washington D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006-2009 he was Director of Online Information Services at, and is currently a contributing editor at Search Engine Land.


