January 17, 2022

Milestones: The Wayback Machine Passes 400 Billion Archived Web Pages + A Tip For Wayback Users

Congratulations to Brewster Kahle and the entire Internet Archive team on reaching this milestone, and on developing and maintaining the Wayback Machine, an ESSENTIAL research resource.

From the Internet Archive Blog:

The Wayback Machine, a digital archive of the World Wide Web, has reached a landmark with 400 billion webpages indexed. This makes it possible to surf the web as it looked anytime from late 1996 up until a few hours ago.

The remainder of the blog post includes some screen caps of web sites from 1996.

Quick Tip

Did you know that you can now have any openly accessible web page or PDF (except those blocked by robots.txt) crawled and indexed by the Wayback Machine on demand?

1. To add content to the Wayback Machine, simply head to: http://archive.org/web/

2. Look for the box labeled “Save Page Now.”

3. Finally, paste in the URL of the page or PDF you want crawled and indexed.

In a few seconds, the Wayback Machine will produce a direct URL to the archived material (with crawl date and time), either in your browser’s location bar or in a pop-up box. If the page can’t be crawled, you’ll be told that as well.
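For readers who would rather script this than use the form, here is a minimal Python sketch of the same idea. It rests on two assumptions that are not stated in the post: that Save Page Now can be reached by prefixing a page address with https://web.archive.org/save/, and that archived addresses carry a 14-digit crawl timestamp (YYYYMMDDhhmmss, treated here as UTC) right after /web/. The function names build_save_url and parse_snapshot_url are made up for illustration, and the sketch only builds and reads URLs; it does not contact the Internet Archive.

```python
import re
from datetime import datetime, timezone
from urllib.parse import quote

# Assumed Save Page Now prefix (not given in the post itself).
SAVE_PREFIX = "https://web.archive.org/save/"

# Assumed shape of an archived address: /web/<14-digit timestamp>/<original URL>.
# An optional suffix such as "id_" may follow the timestamp.
SNAPSHOT_RE = re.compile(r"^https?://web\.archive\.org/web/(\d{14})[a-z_]*/(.+)$")


def build_save_url(page_url: str) -> str:
    """Return the address to visit to request a crawl of page_url."""
    # Percent-encode characters such as spaces; leave URL punctuation alone.
    return SAVE_PREFIX + quote(page_url, safe=":/?&=#%")


def parse_snapshot_url(snapshot_url: str) -> tuple[datetime, str]:
    """Split an archived address into (crawl time, original URL)."""
    match = SNAPSHOT_RE.match(snapshot_url)
    if match is None:
        raise ValueError(f"not a recognized snapshot URL: {snapshot_url}")
    crawl_time = datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
    # Treating the timestamp as UTC is an assumption of this sketch.
    return crawl_time.replace(tzinfo=timezone.utc), match.group(2)


if __name__ == "__main__":
    print(build_save_url("http://example.com/report.pdf"))
    when, original = parse_snapshot_url(
        "https://web.archive.org/web/20140509123456/http://example.com/"
    )
    print(when.isoformat(), original)
```

Opening the address returned by build_save_url in a browser should have the same effect as pasting the URL into the “Save Page Now” box, if the assumption about the prefix holds. parse_snapshot_url pulls the crawl date and time back out of the direct URL you receive.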

Don’t Forget

The Internet Archive’s Archive-It service works with government, schools, libraries, non-profits, and others archiving web material. In many cases, these archives are publicly accessible and, unlike the Wayback Machine, can be keyword searched. Browse and bookmark these collections here.

About Gary Price

Gary Price (gprice@mediasourceinc.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006 to 2009 he was Director of Online Information Services at Ask.com, and he is currently a contributing editor at Search Engine Land.