October 31, 2020

Report: “The Wayback Machine and Cloudflare Want to Backstop the Web”

From an Internet Archive Blog Post by Mark Graham:

Cloudflare and the Internet Archive are now working together to help make the web more reliable. Websites that enable Cloudflare’s Always Online service will now have their content automatically archived, and if by chance the original host is not available to Cloudflare, then the Internet Archive will step in to make sure the pages get through to users.

Cloudflare has become core infrastructure for the Web, and we are glad we can be helpful in making a more reliable web for everyone.

[Clip]

We archive URLs that are identified via a variety of different methods, such as “crawling” from lists of millions of sites, as submitted by users via the Wayback Machine’s “Save Page Now” feature, added to Wikipedia articles, referenced in Tweets, and based on a number of other “signals” and sources, such as multiple feeds of “news” stories.

An additional source of URLs we will preserve now originates from customers of Cloudflare’s Always Online service. As new URLs are added to sites that use that service they are submitted for archiving to the Wayback Machine. In some cases this will be the first time a URL will be seen by our system and result in a “First Archive” event.

Read the Complete Blog Post

More From WIRED:

“We’d just like to make the web more reliable,” [Brewster] Kahle says. “We want a robust infrastructure out there and we can be part of it, but we’re not all of it. We want multiple participants to be working together in all different ways. We would not be a very good content distribution network and maybe Cloudflare wouldn’t necessarily be the best archive of the web.”

[Clip]

The Wayback Machine’s [Mark] Graham emphasizes, though, that ultimately any collaboration or project must serve the Internet Archive’s core mission. “We’re always on the hunt for more ways we can do a better job of archiving more of the public web,” he says. “This is another source of web resources for us to preserve and make available—hopefully forever, certainly for our lifetimes. As long as we’re around we’re going to keep this thing up.”

Read the Complete Article (approx. 640 words)

About Gary Price

Gary Price (gprice@mediasourceinc.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006 to 2009 he was Director of Online Information Services at Ask.com, and he is currently a contributing editor at Search Engine Land.
