August 24, 2021 by Gary Price

Cool! The Open Library Introduces the Open Book Genome Project

From The Open Library Blog:

In 1990, an international research group called the Human Genome Project (HGP) began sequencing the human genome to definitively uncover “nature’s complete genetic blueprint for building a human being”. The result, completed in 2003, was a compelling answer to the question, “what is a human?”

Nine years later, Will Glaser & Tim Westergren drew inspiration from HGP and launched a similar effort called the Music Genome Project, using trained experts to classify and label music according to a taxonomy of characteristics, like genre and tempo. This system became the engine which powers song recommendations for Pandora Radio.

Circa 2003, Aaron Stanton, Matt Monroe, Sidian Jones, and Dan Bowen adapted the idea of Pandora to books, creating a book recommendation service called BookLamp. Under the hood, they devised a Book Genome Project which combined computers and crowds to “identify, track, measure, and study the multitude of features that make up a book”.

Sadly, the project did not release their data, was acquired by Apple in 2014, and subsequently discontinued. But they left an exciting treasure map for others to follow.

[Clip]

Over the last several months, we’ve been talking to communities, conducting research, and building experiments to contribute to a non-profit adaptation of these ideas called the Open Book Genome Project (OBGP).

[Clip]

OBGP hopes to achieve these things by employing a two-pronged approach, which readers may continue to learn about in the following two blog posts:

  1. The Sequencer – a community-engineered bot which reads millions of Internet Archive books and extracts key insights for public consumption.
  2. Community Reviews – a new crowd-sourced book tagging system which empowers readers to collaboratively classify & share structured reviews of books.

Learn More, Read the Complete Post (approx. 730 words)

Learn More About the Open Book Genome Project

GSoC (Google Summer of Code) 2021: Making Books Lendable with the Open Book Genome Project

The purpose of the Open Book Genome Project is to create “A Literary Fingerprint for Every Book” using the Internet Archive’s 5 million book digital library. A book’s fingerprint currently consists of 1gram (single word) and 2gram (two word) term frequency, Flesch–Kincaid readability level, referenced URLs, and ISBNs found within the book.
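To make the idea of a fingerprint concrete, here is a minimal Python sketch that computes the same kinds of features from a plain-text string. It is not the OBGP Sequencer's code: the function names (`fingerprint`, `extract_isbns`, `is_valid_isbn`), the tokenizer, the syllable heuristic, and the regular expressions are all assumptions made for illustration. Only the Flesch–Kincaid grade formula and the ISBN checksum rules are standard.

```python
import re
from collections import Counter

WORD_RE = re.compile(r"[a-z]+(?:'[a-z]+)?")
URL_RE = re.compile(r"https?://[^\s<>\"']+")
# Loose candidate pattern; real ISBNs are confirmed by checksum below.
ISBN_CANDIDATE_RE = re.compile(r"\b\d[\d\- ]{8,15}[\dXx]\b")


def is_valid_isbn(code):
    """Check an ISBN-10 or ISBN-13 (separators already removed) by checksum."""
    if len(code) == 10 and code[:9].isdigit() and (code[9].isdigit() or code[9] in "Xx"):
        last = 10 if code[9] in "Xx" else int(code[9])
        total = sum((10 - i) * int(c) for i, c in enumerate(code[:9])) + last
        return total % 11 == 0
    if len(code) == 13 and code.isdigit():
        total = sum(int(c) * (3 if i % 2 else 1) for i, c in enumerate(code))
        return total % 10 == 0
    return False


def extract_isbns(text):
    """Return checksum-valid ISBNs found in text, with hyphens and spaces removed."""
    found = []
    for match in ISBN_CANDIDATE_RE.findall(text):
        code = re.sub(r"[\- ]", "", match)
        if is_valid_isbn(code):
            found.append(code)
    return found


def count_syllables(word):
    """Rough heuristic: one syllable per run of vowels, minimum one."""
    return max(1, len(re.findall(r"[aeiouy]+", word)))


def fingerprint(text):
    """Build a small feature dictionary for one book's text."""
    urls = [u.rstrip(".,;:!?)") for u in URL_RE.findall(text)]
    isbns = extract_isbns(text)

    # Remove URLs so their fragments do not pollute the word counts.
    prose = URL_RE.sub(" ", text)
    words = WORD_RE.findall(prose.lower())
    sentences = [s for s in re.split(r"[.!?]+", prose) if s.strip()]

    unigrams = Counter(words)
    # Bigrams are taken over the whole word stream, so they may span sentences.
    bigrams = Counter(zip(words, words[1:]))

    n_words = max(1, len(words))
    n_sentences = max(1, len(sentences))
    n_syllables = sum(count_syllables(w) for w in words)
    # Flesch-Kincaid grade level formula.
    grade = 0.39 * (n_words / n_sentences) + 11.8 * (n_syllables / n_words) - 15.59

    return {
        "unigrams": unigrams,
        "bigrams": bigrams,
        "flesch_kincaid_grade": float(grade),
        "urls": urls,
        "isbns": isbns,
    }


if __name__ == "__main__":
    sample = ("The cat sat. The cat ran to https://openlibrary.org today. "
              "ISBN 978-0-306-40615-7.")
    result = fingerprint(sample)
    print(result["unigrams"].most_common(3))
    print(result["urls"], result["isbns"])
```

The ISBN candidate pattern is deliberately simple and can miss an ISBN that directly follows another number, so a production pipeline would need a more careful scanner. The syllable count is likewise approximate, which makes the readability score an estimate.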

Try it out!

Anyone can try running the OBGP Sequencer on an Internet Archive open access book using the new OBGP Sequencer™ Google Colab Notebook. This interactive notebook runs directly within the browser, no installation required. If you have any questions, please email us.

If you are interested in seeing the source code or contributing, check out the GitHub. If this project sounds fascinating to you and you’d like to learn more or keep the project going, please talk to us!

Learn More, Read the Complete Post (approx. 1550 words)

ICYMI

See Also: The Open Library Project Introduces Community Reviews

Filed under: Data Files, Digital Collections, Interactive Tools, Libraries, News

About Gary Price

Gary Price (gprice@gmail.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington D.C. metro area. He earned his MLIS degree from Wayne State University in Detroit. Price has won several awards including the SLA Innovations in Technology Award and Alumnus of the Year from the Wayne St. University Library and Information Science Program. From 2006-2009 he was Director of Online Information Services at Ask.com.

© 2026 Library Journal. All rights reserved.