December 13, 2018

Internet Archive/Wayback Machine Has Now “Rescued” More Than 9 Million Broken Links on Wikipedia

From an Internet Archive Blog Post by Mark Graham:

As part of the Internet Archive’s aim to build a better Web, we have been working to make the Web more reliable — and are pleased to announce that 9 million formerly broken links on Wikipedia now work because they go to archived versions in the Wayback Machine.

For more than 5 years, the Internet Archive has been archiving nearly every URL referenced in close to 300 wikipedia sites as soon as those links are added or changed. At the rate of about 20 million URLs/week.

[Clip]

To date we have successfully used IABot to edit and “fix” the URLs of nearly 6 million external references that would have otherwise returned a 404. In addition, members of the Wikipedia community have fixed more than 3 million links individually. Now more than 9 million URLs, on 22 Wikipedia sites, point to archived resources from the Wayback Machine, and other web archive providers.

One way to measure the real-world benefit of this work is by counting the number of click-throughs from Wikipedia to the Wayback Machine. During a recent 10 day period, the Wikimedia Foundation measured external link click-throughs and found that, by far, the most popular external destination was the Wayback Machine. Three-times the next most popular site, books.google.com In real numbers, on average, more than 25,000 clicks/day were made from the English Wikipedia to the Wayback Machine.

[Clip]

What is next?

We will expand our efforts to check and edit more Wikipedia sites and increase the speed which we scan those sites and fix broken links.

We will improve our processes to archive externally referenced resources by taking advantage of the Wikimedia Foundation’s new “EventStreams” web service.

We will explore how we might expand our link checking and fixing efforts to other media and formats, including more web pages, digital books and academic papers.

We will investigate and experiment with methods to support authors and editors use of archived resources (e.g. using Wayback Machine links in place of live-web links).

We will continue to work with the Wikimedia Foundation, and the Wikipedia communities world-wide, to advance tools and services to promote and support the use of persistently available and reliable links to externally referenced resources.

Learn More: Charts and More Info in the Complete IA Blog Post by Mark Graham

Gary Price About Gary Price

Gary Price (gprice@mediasourceinc.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006-2009 he was Director of Online Information Services at Ask.com, and is currently a contributing editor at Search Engine Land.

Share