From the Associated Press:
Capturing the unruly, ever-changing Internet is like trying to pin down a raging river.
But the British Library is going to try.
For centuries the library has kept a copy of every book, pamphlet, magazine and newspaper published in Britain. Starting Saturday, it will also be bound to record every British website, e-book, online newsletter and blog in a bid to preserve the nation’s “digital memory.”
“Stuff out there on the Web is ephemeral,” said Lucie Burgess, the library’s head of content strategy. “The average life of a web page is only 75 days, because websites change, the contents get taken down.”
Like reference collections around the world, the British Library has been attempting to archive the Web for years in a piecemeal way and has collected about 10,000 sites. Until now, though, it has had to get permission from website owners before taking a snapshot of their pages.
That began to change with a law passed in 2003, but it has taken a decade of legislative and technological preparation for the library to be ready to begin a vast trawl of all sites ending with the suffix .uk.
An automated web harvester will scan and record 4.8 million sites, a total of 1 billion web pages. Most will be captured once a year, but hundreds of thousands of fast-changing sites such as those of newspapers and magazines will be archived as often as once a day.
The library plans to make the content publicly available by the end of this year.
Read the Complete Article
See Also: Web Archiving (via The British Library)
Includes info about legal deposit of UK online publications.
See Also: UK Web Archive (Search)