Mass Digitization: Aptara Digitizing 730,000 Pages of Content for SAGE’s New eBook Platform
Aptara has announced the award of a 730,000-page digitization project from SAGE. Within eight months, Aptara will convert a significant collection of legacy print titles into XML for posting to SAGE’s eBook platform.
Converting to XML, a flexible digital format, lets SAGE repurpose its content at any time for any other delivery platform or device. Working from a mix of print and PDF source files, Aptara is coding and generating the XML against a customer-defined DTD specification. All print files must first be converted to a digital format using a double-key-and-compare methodology in combination with OCR (optical character recognition) to ensure that the digital text exactly replicates the print original.
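The double-key-and-compare step can be illustrated with a short sketch. The announcement does not describe Aptara's actual tooling, so everything below is an assumption made for illustration: the function name find_discrepancies and the two sample strings are hypothetical, and Python's standard difflib stands in for whatever comparison software is really used. The idea is that two operators key the same page independently, and any span where their transcriptions disagree is flagged for review.

```python
from difflib import SequenceMatcher


def find_discrepancies(first_keying: str, second_keying: str):
    """Return (offset, first_text, second_text) for each span where the keyings differ.

    Hypothetical helper: the offset is the character position in the first keying.
    """
    # autojunk=False turns off the popularity heuristic so long pages are compared exactly.
    matcher = SequenceMatcher(None, first_keying, second_keying, autojunk=False)
    discrepancies = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            discrepancies.append((i1, first_keying[i1:i2], second_keying[j1:j2]))
    return discrepancies


# Hypothetical sample: two operators keying the same printed line.
operator_a = "Converting to XML gives publishers flexibility."
operator_b = "Converting to XML gives publishers flexibilty."

for offset, a_text, b_text in find_discrepancies(operator_a, operator_b):
    print(f"offset {offset}: {a_text!r} vs {b_text!r}")
```

Under the same assumptions, OCR output could be passed in as one of the two inputs, or compared against each keying in turn, so that a span is accepted automatically only where the independent sources agree.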
John Shaw, Executive Director of Publishing Technologies at SAGE, says that Aptara is currently digitizing more than 3,000 pages a day.

About Gary Price
Gary Price (gprice@gmail.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C. metro area. He earned his MLIS degree from Wayne State University in Detroit. Price has won several awards, including the SLA Innovations in Technology Award and Alumnus of the Year from the Wayne State University Library and Information Science Program. From 2006 to 2009 he was Director of Online Information Services at Ask.com.