June 17, 2025 by Gary Price

Report: Are AI Bots Knocking Cultural Heritage Offline?

From the GLAM-E Lab / Engelberg Center on Innovation Law & Policy, NYU School of Law:

In late 2024, isolated accounts began to emerge from individual online cultural heritage collections. Those stories described servers and collections straining – and sometimes breaking – under the load of swarming bots. The bots were reportedly scraping all of the data from collections to build datasets to train AI models.

Did these reports reflect the experience of most online collections? Were they outliers? Or early warning signs?

The GLAM-E Lab surveyed dozens of GLAM (Gallery, Library, Archive, and Museum) institutions to begin to answer those questions. This report, published in June of 2025, documents how institutions are straining under swarms of scraping bots, and how things may get worse before they get better.

[Clip]

In brief, we found:

  • Bots are widespread, although not universal. Of 43 respondents, 39 had experienced a recent increase in traffic. Twenty-seven of the 39 respondents experiencing an increase in traffic attributed it to AI training data bots, with an additional seven believing that bots could be contributing to the traffic.
  • This increase in traffic has been hard to anticipate because few respondents were actively tracking bot traffic prior to the bots triggering a crisis in their collection. Many respondents did not realize they were experiencing a growth in bot traffic until the traffic reached the point where it overwhelmed the service and knocked online collections offline.
  • Some respondents have been seeing an increase in bot traffic since 2021, while others did not experience their first spike until 2025.
  • Some bots clearly identify themselves, while others take a range of measures to hide their source.
  • When bots come, they tend to swarm for relatively brief periods of time. The frequency of these swarms may be increasing.
  • Robots.txt is not currently an effective way to prevent bots from overwhelming collections.
  • Respondents are deploying a range of home-grown and third-party firewall-based countermeasures to try to screen out bots based on IP address, geography, domain, and user agent string. Some of these efforts appear to be effective, although few are confident that they will be sustainable in the long term.
  • Respondents are reluctant to take more aggressive steps to move collections behind things like login screens for a variety of reasons, including concerns about how effective those measures will be in the medium term, that implementing those changes can have negative impacts on welcome users, and whether login-based restrictions run counter to their larger goal of making the collections easily available online.
  • Respondents worry that swarms of AI training data bots will create an environment of unsustainably escalating costs for providing online access to collections.
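To make the firewall-style countermeasures in the list above concrete, here is a minimal Python sketch of screening a request by IP address and user agent string. It is not taken from the report: the function name `should_block`, the blocked network, and the bot names are all hypothetical placeholders. It also assumes the bot identifies itself honestly, which the findings above suggest is not always the case.

```python
import ipaddress

# Hypothetical blocklists for illustration only; neither comes from the report.
# 203.0.113.0/24 is a range reserved for documentation examples.
BLOCKED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]
# Made-up user agent fragments, stored in lowercase for matching.
BLOCKED_AGENT_SUBSTRINGS = ["examplebot", "sample-scraper"]


def should_block(client_ip: str, user_agent: str) -> bool:
    """Return True if the request matches either blocklist."""
    addr = ipaddress.ip_address(client_ip)
    # Block by source network first.
    if any(addr in network for network in BLOCKED_NETWORKS):
        return True
    # Then block by a case-insensitive user agent substring match.
    agent = user_agent.lower()
    return any(token in agent for token in BLOCKED_AGENT_SUBSTRINGS)


if __name__ == "__main__":
    print(should_block("203.0.113.7", "Mozilla/5.0"))
    print(should_block("198.51.100.1", "ExampleBot/1.0"))
    print(should_block("198.51.100.1", "Mozilla/5.0"))
```

A static list like this has to be maintained by hand, and a bot that rotates addresses or spoofs a browser user agent passes straight through it. That is consistent with respondents' doubts about whether such measures are sustainable in the long term.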

Direct to Full Text Report (by Michael Weinberg; 37 pages; PDF)

Media Coverage

  • AI Scraping Bots Are Breaking Open Libraries, Archives, and Museums (via 404 Media; by Emanuel Maiberg)

“I’m confident in saying that this problem is widespread, and there are a lot of people and institutions who are worried about it and trying to think about what it means for the sustainability of these resources,” the author of the report, Michael Weinberg, told me. “A lot of people have invested a lot of time not only in making these resources available online, but building the community around institutions that do it. And this is a moment where that community feels collectively under threat and isn’t sure what the process is for solving the problem.”

Direct to Complete Article

Filed under: Archives and Special Collections, Data Files, Libraries, News, Patrons and Users, Reports

About Gary Price

Gary Price (gprice@gmail.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C. metro area. He earned his MLIS degree from Wayne State University in Detroit. Price has won several awards, including the SLA Innovations in Technology Award and Alumnus of the Year from the Wayne State University Library and Information Science Program. From 2006 to 2009 he was Director of Online Information Services at Ask.com.


© 2026 Library Journal. All rights reserved.

