May 20, 2022

Journal Article: “From Archive to Analysis: Accessing Web Archives At Scale Through A Cloud-Based Interface”

The article linked below was recently published in the International Journal of Digital Humanities.


From Archive to Analysis: Accessing Web Archives At Scale Through A Cloud-Based Interface


Nick Ruest
York University

Samantha Fritz
University of Waterloo

Ryan Deschamps
University of Waterloo

Jimmy Lin
University of Waterloo

Ian Milligan
University of Waterloo


International Journal of Digital Humanities (2021)
DOI: 10.1007/s42803-020-00029-6


This paper introduces the Archives Unleashed Cloud, a web-based interface for working with web archives at scale. Current access paradigms, largely driven by the scope and scale of web archives, generally involve using the command line and writing code. This access gap means that subject-matter experts, as opposed to developers and programmers, have few options to directly work with web archives beyond the page-by-page paradigm of the Wayback Machine.


Drawing on first-hand research and analysis of how scholars use web archives, we present the interface design and underpinning architecture of the Archives Unleashed Cloud. We also discuss the sustainability implications of providing a cloud-based service for researchers to analyze their collections at scale.

Direct to Full Text Article

Direct to PDF Version
20 pages; PDF.

About Gary Price

Gary Price is a librarian, writer, consultant, and frequent conference speaker based in the Washington D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006 to 2009 he was a Director of Online Information Services, and he is currently a contributing editor at Search Engine Land.