October 20, 2021 by Gary Price

Report: “New Algorithm Searches Historic Documents To Identify Noteworthy People”

From the University at Buffalo:

Old newspapers provide a window into our past, and a new algorithm co-developed by a School of Management researcher is helping turn those historic documents into useful, searchable data.

Published in Decision Support Systems, the algorithm can find and rank people’s names in order of importance from the results produced by optical character recognition (OCR), the computerized method of converting scanned documents into text that is often messy.

“It’s a known fact that when OCR software is run, very often the text gets garbled,” says Haimonti Dutta, assistant professor of management science and systems. “With old newspapers, books and magazines, problems can arise from poor ink quality, crumpled or torn paper, or even unusual page layouts the software isn’t expecting.”

To develop the algorithm, researchers partnered with the New York Public Library (NYPL) and analyzed more than 14,000 articles from New York City newspaper The Sun published during November and December 1894. The NYPL has scanned more than 200,000 newspaper pages as part of Chronicling America, an initiative of the National Endowment for the Humanities and the Library of Congress that is working to develop an online, searchable database of historical newspapers from 1777 to 1963.

[Clip]

Dutta says their process has wide-reaching implications for discovering important people throughout history.

“We recently used this technique on African American literature from the Civil War to learn more about the important people during the era of slavery,” Dutta says. “Going forward, we’ll be expanding the technique to examine relationships between people and build out the social networks of the past.”

Learn More, Read the Complete Article

Filed under: Data Files, Journal Articles, Libraries, Management and Leadership, News, Public Libraries

About Gary Price

Gary Price (gprice@gmail.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C. metro area. He earned his MLIS degree from Wayne State University in Detroit. Price has won several awards, including the SLA Innovations in Technology Award and Alumnus of the Year from the Wayne State University Library and Information Science Program. From 2006 to 2009 he was Director of Online Information Services at Ask.com.

© 2026 Library Journal. All rights reserved.

