November 26, 2020

From The National Science Foundation: A Glimpse of the Archives of the Future

From a National Science Foundation “Discovery” Article:

How does an archivist understand the relationship among billions of documents or search for a single record in a sea of data? With the proliferation of digital records, the task of the archivist has grown more complex. This problem is especially acute for the National Archives and Records Administration (NARA), the government agency responsible for managing and preserving the nation’s historical records.

At the end of President George W. Bush’s administration in 2009, NARA received roughly 35 times the amount of data as previously received from the administration of President Bill Clinton, which itself was many times that of the previous administration. With the federal government increasingly using social media, cloud computing and other technologies to contribute to open government, this trend is not likely to decline. By 2014, NARA is expecting to accumulate more than 35 petabytes (quadrillions of bytes) of data in the form of electronic records.

“The National Archives is a unique national institution that responds to requirements for preservation, access and the continued use of government records,” said Robert Chadduck, acting director for the National Archives Center for Advanced Systems and Technologies.

To find innovative and scalable solutions to large-scale electronic records collections, Chadduck turned to the Texas Advanced Computing Center (TACC), a National Science Foundation- (NSF) funded center for advanced computing research, to draw on the expertise of TACC’s digital archivist, Maria Esteva, and data analysis expert, Weijia Xu.

[Clip]

Archivists spend a significant amount of time determining the organization, contents and characteristics of collections so they can describe them for public access purposes. “This process involves a set of standard practices and years of experience from the archivist side,” said Xu. “To accomplish this task in large-scale digital collections, we are developing technologies that combine computing power with domain expertise.”

Read the Complete Article

Several Additional Treemaps (As Seen at the Beginning of the Article) Are Available on This Page

About Gary Price

Gary Price (gprice@mediasourceinc.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006 to 2009, he was Director of Online Information Services at Ask.com, and he is currently a contributing editor at Search Engine Land.
