Nice! Milestones: Wayback Machine Now Home to 240,000,000,000 URLs and Improved Currency
Super exciting news from an essential resource.
Not only is the Wayback database larger, but it’s also impressive and useful to see how current the index of pages has become. Wayback used to lag 6, 9, 12, or more months before a newly archived page could be accessed.
From a Blog Post by Brewster Kahle:
[Yesterday] we updated the Wayback Machine with much more data and some code improvements. Now we cover from late 1996 to December 9, 2012 so you can surf the web as it was up until a month ago. Also, we have gone from having 150,000,000,000 URLs to having 240,000,000,000 URLs, a total of about 5 petabytes of data.
Brewster also shared some usage stats:
This database is queried over 1,000 times a second by over 500,000 people a day helping make archive.org the 250th most popular website.
Kahle also pointed out that a small amount of data is temporarily missing from the updated version of Wayback but available via another interface.
The updated version does have at least one known issue – there is a small amount of older content missing from the index, and it will take us another month or two to sort out that problem. In the meantime, you can still visit the previous version of the Wayback with that content.
The Wayback Machine is not keyword searchable, but topic- or organization-focused collections from “Archive-It”, a fee-based service from the IA, ARE keyword searchable. As of today, more than 1,800 archives can be accessed online.
About Gary Price
Gary Price (firstname.lastname@example.org) is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C. metro area. He earned his MLIS degree from Wayne State University in Detroit. Price has won several awards, including the SLA Innovations in Technology Award and Alumnus of the Year from the Wayne St. University Library and Information Science Program. From 2006-2009 he was Director of Online Information Services at Ask.com. Gary is also the co-founder of infoDJ, an innovation research consultancy supporting corporate product and business model teams with just-in-time fact and insight finding.