Aptara announces the award of a 730,000-page digitization project from SAGE. Within eight months, Aptara will transform a significant collection of legacy print titles into XML format for posting to SAGE’s eBook platform.
Converting to the flexible digital format of XML gives SAGE the ability to easily repurpose its content at any time, for any other delivery platform or device. Working from a mix of print and PDF source files, Aptara is using a customer-defined DTD specification for coding and generating the XML. All print files must first be converted to a digital format using a double-key-and-compare methodology in combination with OCR (optical character recognition) to ensure an exact replica of the printed text.
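To illustrate the double-key-and-compare idea in general terms: the same page is keyed independently by two operators, and any place where the two transcriptions disagree is flagged for review. The sketch below is a minimal illustration only, not Aptara's actual tooling; the function name `compare_keyings` and the sample strings are hypothetical, and how OCR output is combined with the keyed text is not described here, so it is left out.

```python
from difflib import SequenceMatcher


def compare_keyings(first: str, second: str) -> list[tuple[str, str, str]]:
    """Return word-level disagreements between two independent transcriptions.

    Each result is (kind, words_from_first, words_from_second), where kind is
    one of difflib's opcode tags: "replace", "delete", or "insert".
    """
    a, b = first.split(), second.split()
    # autojunk=False keeps difflib from ignoring frequent words on long pages.
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    mismatches = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            mismatches.append((tag, " ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return mismatches


# Hypothetical sample: two operators key the same line of a printed page.
key_one = "The quick brown fox jumps over the lazy dog"
key_two = "The quick brown fax jumps over the lazy dog"

for kind, left, right in compare_keyings(key_one, key_two):
    print(f"{kind}: first keying has '{left}', second keying has '{right}'")
```

An empty result means the two keyings agree word for word; each flagged difference would then be resolved against the print source.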
John Shaw, Executive Director of Publishing Technologies at SAGE, reports that Aptara is currently digitizing more than 3,000 pages a day.