December 12, 2017

A Google for Handwriting: Handwritten Text Recognition (HTR) Being Used to Digitize Cultural Heritage Materials in Sweden

From Uppsala University in Sweden:

When the university library digitises printed books from heritage collections, it uses software that converts the pages to digital text, known as Optical Character Recognition (OCR). The software interprets the printed information and makes it searchable. With handwriting, HTR technology – handwritten text recognition – is used instead. It is the development of this technology which is creating something of a race among researchers worldwide.

‘You want to be the first to find a program that works. If someone today had an algorithm to carry out large-scale digital searches of things like the collection of manuscripts in the Vatican Library, it would be worth a fortune. Whilst the market value is enormous, so is the scale of the task’, says Anders Brun, project manager at the Department of Information Technology.

[Clip]

The core of the work is all about text decoding, achieving a method via which the computer tries to interpret the digital image of the text. The researchers are trying to avoid text interpretation because handwritten text can look very different depending on who was holding the pen. Instead, they want to teach the computer to interpret the material.

‘Using expert knowledge, we try to give the computer the right answer for a small portion of the material and then automate this’, says Fredrik Wahlberg.

Read the Complete Article

Visit/Use the ALVIN Database/Portal of Cultural Heritage Materials Discussed in the Article ||| Learn More About Alvin

About Gary Price

Gary Price (gprice@mediasourceinc.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006 to 2009, he was Director of Online Information Services at Ask.com, and he is currently a contributing editor at Search Engine Land.
