SUBSCRIBE
SUBSCRIBE
EXPLORE +
  • About infoDOCKET
  • Academic Libraries on LJ
  • Research on LJ
  • News on LJ
  • Skip to primary navigation
  • Skip to main content
  • Skip to primary sidebar
  • Libraries
    • Academic Libraries
    • Government Libraries
    • National Libraries
    • Public Libraries
  • Companies (Publishers/Vendors)
    • EBSCO
    • Elsevier
    • Ex Libris
    • Frontiers
    • Gale
    • PLOS
    • Scholastic
  • New Resources
    • Dashboards
    • Data Files
    • Digital Collections
    • Digital Preservation
    • Interactive Tools
    • Maps
    • Other
    • Podcasts
    • Productivity
  • New Research
    • Conference Presentations
    • Journal Articles
    • Lecture
    • New Issue
    • Reports
  • Topics
    • Archives & Special Collections
    • Associations & Organizations
    • Awards
    • Funding
    • Interviews
    • Jobs
    • Management & Leadership
    • News
    • Patrons & Users
    • Preservation
    • Profiles
    • Publishing
    • Roundup
    • Scholarly Communications
      • Open Access

September 23, 2011 by Gary Price

Tech Article: "Toward free and searchable historical census images"

September 23, 2011 by Gary Price

Title: “Toward free and searchable historical census images”

By:
Kenton McHenry, Luigi Marini, Mayank Kejriwal, Rob Kooper
National Center for Supercomputing Applications
University of Illinois at Urbana-Champaign
Urbana, IL

and

Peter Bajcsy

National Institute of Standards and Technology

Gaithersburg, MD

From the SPIE Newsroom:

In summary, our hybrid automation/crowd-sourcing approach aims to provide search capabilities over the image-based census data, potentially from the day the images are released. However, general difficulties in automating handwriting recognition will limit its accuracy. Incorporation of passive and active crowd-sourcing elements will improve the accuracy of our systems over time. We are currently working on a number of challenges, including further pre-processing of form cells to remove noise. Our next important stage will be to build an index of the ∼7 billion form cells, which is crucial for efficient access. However, of the word-spotting techniques we tested, the best results use a non-linear comparison that does not lend itself to indexing. We are currently investigating alternative methods that are indexable, as well as using high-performance computing resources to perform a one-time, large pre-processing step to hierarchically cluster the data (requiring 4.9×1019 comparisons). Finally, we will investigate how best to associate the passively crowd-sourced transcriptions with the results based on user behavior.

Filed under: Data Files, News

SHARE:

DigitizationU.S. Census

About Gary Price

Gary Price (gprice@gmail.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington D.C. metro area. He earned his MLIS degree from Wayne State University in Detroit. Price has won several awards including the SLA Innovations in Technology Award and Alumnus of the Year from the Wayne St. University Library and Information Science Program. From 2006-2009 he was Director of Online Information Services at Ask.com.

ADVERTISEMENT

Archives

Job Zone

ADVERTISEMENT

Related Infodocket Posts

ADVERTISEMENT

FOLLOW US ON X

Tweets by infoDOCKET

ADVERTISEMENT

This coverage is free for all visitors. Your support makes this possible.

This coverage is free for all visitors. Your support makes this possible.

Primary Sidebar

  • News
  • Reviews+
  • Technology
  • Programs+
  • Design
  • Leadership
  • People
  • COVID-19
  • Advocacy
  • Opinion
  • INFOdocket
  • Job Zone

Reviews+

  • Booklists
  • Prepub Alert
  • Book Pulse
  • Media
  • Readers' Advisory
  • Self-Published Books
  • Review Submissions
  • Review for LJ

Awards

  • Library of the Year
  • Librarian of the Year
  • Movers & Shakers 2022
  • Paralibrarian of the Year
  • Best Small Library
  • Marketer of the Year
  • All Awards Guidelines
  • Community Impact Prize

Resources

  • LJ Index/Star Libraries
  • Research
  • White Papers / Case Studies

Events & PD

  • Online Courses
  • In-Person Events
  • Virtual Events
  • Webcasts
  • About Us
  • Contact Us
  • Advertise
  • Subscribe
  • Media Inquiries
  • Newsletter Sign Up
  • Submit Features/News
  • Data Privacy
  • Terms of Use
  • Terms of Sale
  • FAQs
  • Careers at MSI


© 2026 Library Journal. All rights reserved.


© 2022 Library Journal. All rights reserved.