December 1, 2014 by Gary Price

Digital Preservation: The Library of Congress Seeks Information About Web Harvesting, Formal RFI Published

On November 26, the Library of Congress published an RFI (request for information) on web harvesting.
From the Synopsis:

The Library of Congress, Office of Strategic Initiatives (OSI) is seeking information from potential contractors about how best to design a requirement related to saving and reviewing information from the Internet. The Library is seeking information, e.g., current/existing commercial solutions, design solutions, etc., on how to best meet this web harvesting requirement.
This RFI is to determine if potential offerors can meet the Library’s technical and production requirements for harvesting web content and to receive feedback on pricing models and reasonable quality assurance. The Library is actively seeking suggested solutions and alternatives that will meet our requirements.
From the RFI:
Many of the activities of the digital lifecycle for harvested web content occur at the Library of Congress, including seed URL nomination, permissions gathering, scoping and preparation of a seed list, quality review, and public access to researchers. The Library’s web harvesting curator tools and infrastructure have been developed for the inputs and outputs of open source tools (Heritrix for harvesting, and Wayback Machine for access). The potential requirements described here are to support the Library’s large-scale, ongoing harvesting efforts, plus storage for the life of any potential contract, indexing for access, restricted access to the content for processing by Library staff, and transfer to the Library for long-term storage.
Although the following provides a general description of the Library’s potential requirements, the Library is actively seeking suggested alternatives to the requirements discussed below, where appropriate.

Direct to Complete RFI (22 pages; PDF)
Full text is also embedded below.

From the Library of Congress: RFI Web Harvesting


See Also: Web Archiving In the United States: A 2013 Survey
October 2013. 25 pages; PDF.
Published by the National Digital Stewardship Alliance.

Filed under: Digital Preservation, Libraries, News, Preservation

About Gary Price

Gary Price (gprice@gmail.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington D.C. metro area. He earned his MLIS degree from Wayne State University in Detroit. Price has won several awards, including the SLA Innovations in Technology Award and Alumnus of the Year from the Wayne St. University Library and Information Science Program. From 2006 to 2009 he was Director of Online Information Services at Ask.com. Gary is also the co-founder of infoDJ, an innovation research consultancy supporting corporate product and business model teams with just-in-time fact and insight finding.

© 2023 Library Journal. All rights reserved.

