November 2, 2019 by Gary Price

Video: “Perpetual Access Machines: Archiving Web-Published Scholarship at Scale” (A FORCE2019 Conference Presentation)

The video embedded below was recorded at the FORCE2019 conference in Edinburgh, Scotland on October 16, 2019. Presentation slides are also available and linked below.

Title

Perpetual Access Machines: Archiving Web-Published Scholarship at Scale

Presenters

Jefferson Bailey
Director, Web Archiving & Data Services, Internet Archive

Bryan Newbold
Open Data Engineer, Web Archiving & Data Services, Internet Archive

Slides from the presentation are also available.

From the Conference Website

In 2018, the Internet Archive undertook a large-scale project to build as complete a collection as possible of scholarly outputs published on the web, as well as to improve the discoverability and accessibility of scholarly works archived as part of these global web harvests. This project involved a number of areas of work: targeted archiving of known OA publications (especially at-risk “long tail” publications); extraction and augmentation of bibliographic metadata and full text; integration and preservation of related identifier, registry, and aggregation services and datastores; partnerships with affiliated initiatives and joint service developments; and creation of new tools and machine learning approaches for identifying archived scholarly work in existing born-digital and web collections. The project also identified and archived associated research outputs such as blogs, datasets, code repositories and other secondary research objects. The beta API and public interface – code-named “fatcat” – can be found at https://fatcat.wiki/.

Project leads will talk about the project’s current status and upcoming work, focusing on content acquisition, indexing, discoverability, the role of machine learning, service provisioning, and their collaborative work with libraries, publishers, and non-profits. Conceptually, the project demonstrates that the scalability and technologies of “archiving the web” can facilitate automated ingest, enrichment, and dissemination strategies for a variety of web-published primary and secondary scholarly record types that have traditionally been collected via more custom and manual workflows. The project’s strategic goal is to provide open infrastructure for the perpetual discoverability of and access to archived scholarship.

See Also: Video Playlist: All FORCE2019 Presentations

See Also: 2019 FORCE2019 Presentation Slides

See Also: FORCE2019 Conference Program

Filed under: Companies (Publishers/Vendors), Data Files, Libraries, News, Open Access, Preservation, Video Recordings

About Gary Price

Gary Price (gprice@gmail.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington D.C. metro area. He earned his MLIS degree from Wayne State University in Detroit. Price has won several awards including the SLA Innovations in Technology Award and Alumnus of the Year from the Wayne State University Library and Information Science Program. From 2006 to 2009, he was Director of Online Information Services at Ask.com.

© 2026 Library Journal. All rights reserved.

