May 21, 2022

Web Archiving: “Stories From the Past Web” (Preprint)

The following article (preprint) was recently shared on arXiv.


Stories From the Past Web


Yasmin AlNoamany
University of California, Berkeley

Michele C. Weigle
Old Dominion University

Michael L. Nelson
Old Dominion University


via arXiv


Archiving Web pages into themed collections is a method for ensuring these resources are available for posterity.

Services such as Archive-It exist to allow institutions to develop, curate, and preserve collections of Web resources. Understanding the contents and boundaries of these archived collections is a challenge for most people, resulting in the paradox that the larger the collection, the harder it is to understand. Meanwhile, as the sheer volume of data grows on the Web, “storytelling” is becoming a popular technique in social media for selecting Web resources to support a particular narrative or “story”. There are multiple stories that can be generated from an archived collection with different perspectives about the collection. For example, a user may want to see a story that is composed of the key events from a specific Web site, a story that is composed of the key events of the story regardless of the sources, or how a specific event at a specific point in time was covered by different Web sites.

In this paper, we provide different case studies for possible types of stories that can be extracted from a collection. We also provide the definitions and models of these types of stories.

Direct to Full Text Article
15 pages; PDF.

About Gary Price

Gary Price is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006-2009 he was Director of Online Information Services at, and is currently a contributing editor at Search Engine Land.