September 25, 2018

CLIR Releases New Report on the “Complexity and Challenges of Preserving Email” (Report from the Task Force on Technical Approaches for Email Archives)

From the Council on Library and Information Resources (CLIR):

Email is an increasingly important part of the historical record, yet it is particularly difficult to preserve, putting future access to this vast resource at risk. The Future of Email Archives looks at what makes email archiving so complex and describes emerging strategies to meet the challenge.

The report presents the findings of a yearlong investigation of the Task Force on Technical Approaches for Email Archives, sponsored by The Andrew W. Mellon Foundation and the Digital Preservation Coalition. The 19-member task force, comprising representatives from higher education, government, and industry, was co-chaired by Christopher Prom, of the University of Illinois at Urbana-Champaign, and Kate Murray, of the Library of Congress.

Addressing the challenges will require commitment and engagement from a wide variety of stakeholders. The task force proposes a series of short- and long-term actions for community development and advocacy, as well as for tool support, testing, and development.

The report is intended for the archival community, digital preservation professionals, technologists and software developers, commercial vendors, historians and scholars, institutional administrators, and funding agencies and foundations.

From the Report’s Executive Summary

While email archiving is still an emerging practice, this report demonstrates that archives are beginning to gain ground in approaching this most complex of problems. Some choose a simple ingest-and-store preservation approach, with no expectation of immediate usability. Others use emulation, allowing researchers to interact with email in its native environment. The most popular approach migrates and normalizes email to standards-based targets. Each of these approaches, which are not exclusive to one another and can be used in combination, has advantages and disadvantages. What they all share is intricacy. Email preservation is doable, but not yet done by enough archives to achieve our shared community goal to preserve correspondence, as we did for the paper-based archives that have facilitated untold historical insights.

If we wish to change that, interoperability is key. Just as the protocols that define the email environment are heavily standardized to facilitate interoperability across the diverse landscape of email, so too must the tools to preserve email be able to interact with one another across the lifecycle. A core set of tools, both commercial and open source, are in use within the cultural heritage community. In some cases, they need a little boost, especially to ensure more accurate and intuitive search, retrieval, and—when appropriate—removal and redaction when working with large corpora of email data. With additional investment, application programming interfaces (APIs) and other automated processes can help us link tools together, enabling more seamless workflows.

Direct to Full Text Report
132 pages; PDF.

Direct to Task Force on Technical Approaches for Email Archives

See Also: Introductory Blog Post by Co-Authors of Report

Presentation by Co-Authors of Report at Spring 2018 CNI Membership Meeting

Video: "Email Archives: Issues, Tools, and Gaps" (CNI Video Channel on Vimeo)

About Gary Price

Gary Price (gprice@mediasourceinc.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C., metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006 to 2009 he was Director of Online Information Services at Ask.com, and he is currently a contributing editor at Search Engine Land.
