May 18, 2022

Columbia University Libraries: “Learning From the Crowd: The CULHebrewmss Twitter Bot”

From Columbia University Libraries:

In 2018, we decided to partner with a developer named Russel Neiss to create an automated Twitter account that randomly selects and posts images from the Hebrew manuscript collection on the Internet Archive. In doing so, we have not only made the manuscripts available to an audience that includes people who could not or would not step into RBML, but we have also learned a tremendous amount from scholars, students, and sharp-eyed Twitter scrollers.

Aside from small corrections (a language error here, a typographical error there, an error in metadata in a third place) of human error in cataloging, we have also gained so much more knowledge about the materials we care for and provide access to. Catalogers don’t get to spend a lot of time reading an entire manuscript or book; their job is usually to provide clear (and often, due to time constraints, basic) information so a user can see if an item might be useful for a particular research topic. Putting the images with basic metadata on the web, however, invites everyone to come and read what they would like, and often yields new and interesting discoveries. We are very grateful when researchers share these findings, as we can share them in turn with future scholars and students!


So what have we learned from this experience? Oh, so much. But the most important lesson has been how critical it is to provide open data of and open access to our manuscripts. Posting high-quality images on the Internet Archive was only the first step. Creating a space where Internet users could “happen” across the materials has yielded exponentially more data for us. Data which, of course, we’ll be putting back out for people to read, study, and ultimately (we hope) send back to us significantly enhanced so we can continue the cycle!

Learn More, Read the Complete Post, View Images (approx. 1200 words)

Direct to Columbia Hebrew Manuscripts (@CULHebrewMss) on Twitter

About Gary Price

Gary Price is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006 to 2009 he was a Director of Online Information Services, and he is currently a contributing editor at Search Engine Land.