July 4, 2020

Web Archives: “These Crusaders Want To Preserve ‘Human Culture’ Online. Their Latest Target: Yahoo Groups”

From The Washington Post:

The help page entry looked like a routine update: “Understand what’s changing in Yahoo Groups.”

But the understated announcement from Yahoo had big implications for fans of its popular forums that once boasted more than 100 million users. Nobody would be able to post anything new on the site as of Oct. 28. And everything accumulated over nearly two decades on Yahoo Groups would be “permanently removed” on Dec. 14, the company said.

[Clip]

The protests and pleas for more time were just starting when Jason Scott took to Twitter to register his utter lack of surprise over the fate of Yahoo’s sprawling chitchat of neighborhoods, businesses, addicts in recovery and birdwatchers.

The team of volunteers Scott founded — “rogue archivists, programmers, writers and loudmouths dedicated to saving our digital heritage” — has spent a decade hopping from one online obliteration to the next, capturing whatever they can in a public repository called the Wayback Machine. The Archive Team, as his group is known, keeps a “Deathwatch” of websites in various stages of shutdown (“Likely to Die,” “Dying,” “Dead as a Doornail”); Yahoo discards feature prominently.

[Clip]

Mark Graham, director of the Wayback Machine, says the tool comprises nearly 400 billion Web pages and accounts for about half of the nonprofit Internet Archive’s 60 petabytes of stored content. That is a lot of data. A petabyte, equal to more than a million gigabytes, is sometimes equated to 10 million filing cabinets of text.

But in the grand scheme of the Internet, it is also small. Yahoo Groups could clock in at several petabytes, Graham guesses — though he compares the act of estimation to walking into a library you can’t see the end of and trying to guess how many words it contains.

Read the Complete Article (approx. 1500 words)

About Gary Price

Gary Price (gprice@mediasourceinc.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C., metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006 to 2009 he was Director of Online Information Services at Ask.com, and he is currently a contributing editor at Search Engine Land.
