January 24, 2022

News from the UK: “Web Archive Project to Explore Pandemic Misinformation”

From the National Library of Scotland:

A partnership led by the National Library of Scotland has secured £230,958 [approx. US$319,000] in funding from the Wellcome Trust to archive and explore online resources about health information and the Covid-19 pandemic.

Titled ‘The Archive of Tomorrow: Health Information and Misinformation in the UK Web Archive’, the project will examine how we archive websites and other online information about health.

Joseph Marshall, the Library’s Associate Director of Collections Management, said:

‘The Covid-19 pandemic has contributed to a global crisis of information vs misinformation which has played out mostly online. Government and medical websites have changed on a daily basis as new information emerges, and there has been a massive proliferation of opining on social media and other online publications about Coronavirus.

‘Health advice, data and scientific evidence have been contested, revised, used and misused with dramatic and sometimes tragic consequences, and yet the digital record of this is fragile and difficult to access. How easy will it be in a few years’ time to source the tweets, blogs and news stories from the past 18 months and will we be able to make sense of it all? These are the questions we’ll be asking.’

Alongside the National Library of Scotland, project partners include Cambridge University Library, Edinburgh University Library and Bodleian Libraries, Oxford, with key roles based at all institutions that will form a network of expertise and investigation. The British Library will play a key supporting role in the project.

Daryl Green, Head of Special Collections and Deputy Head of Centre for Research Collections (CRC) at the University of Edinburgh, says of the project:

‘The Covid-19 pandemic has created not only a health crisis but also a crisis of trust in sources of information. The CRC has well-established experience preserving collections to ensure their integrity and authenticity. As more information sources move online, we continue to commit to the same values and standards. This project provides an opportunity to explore best practice in preserving information published on the web and how to support different research approaches.’

The UK Web Archive is a partnership of UK legal deposit libraries. Legal deposit libraries are entitled by law to collect anything published in the UK. The UK Web Archive collects and preserves UK-related web content, including large-scale automated capture, curated collections, and webpages nominated by a range of partners and stakeholders. The partnership has to date archived billions of webpages.

The ‘Archive of Tomorrow’ project will preserve 10,000 sites relating to health — both official and unofficial — and use this collection to make web archives more accessible for researchers and members of the public. Even if a contested website or webpage has been deleted, it’s possible it can still be archived through this project, so it can be included in research on the proliferation of misinformation.

Joseph Marshall adds:

‘Libraries and archives have always striven to collect the stories of our times, and this is more important than ever when information is literally a matter of life and death. We will ensure a wide representation of diverse and otherwise un-collected sources. And we will tackle some thorny questions including how we can ethically capture and describe misinformation and fake news for posterity. It’s our hope that a project like this will help us make sense of events of the past 18 months, and ultimately improve our ability to interrogate factual information and misinformation in the future.’

The 14-month ‘Archive of Tomorrow’ pilot project will start in December 2021 and will involve a dedicated project team. Specific aims of the project are to curate a new collection of websites within the UK Web Archive under the theme of ‘Health and Misinformation’; use the collection to explore options for metadata, computational analysis, ethics and rights issues; build a research network across a range of disciplines; and make recommendations to make web archives more representative, inclusive and open for health research.

The project network already includes researchers from different disciplines from the Internet Archive, Digital Preservation Coalition, National Archives, National Records of Scotland and the Alan Turing Institute. However, the National Library of Scotland is actively seeking to grow this network.

Learn More, Read the Complete Announcement

About Gary Price

Gary Price (gprice@mediasourceinc.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006 to 2009 he was Director of Online Information Services at Ask.com, and he is currently a contributing editor at Search Engine Land.