
April 13, 2011 by Gary Price

Web Archiving: Cornell Selects Archive-It to Capture and Preserve 8 Million University Web Pages

From The Cornell Daily Sun:

Internet Archive will begin preserving Cornell’s online content starting this month after the University signed a contract with the Internet archiving company in March.

Internet Archive will create an archive of Cornell’s entire web space — approximately eight million documents — by capturing HTML coding, images, PDFs and links to external pages, according to Dean Krafft, Cornell library chief technology strategist, who is overseeing the project.

Cornell workers are beginning to use Internet Archive’s “Archive-It” function to make test scans, or “crawls,” of Cornell’s Internet domain, Krafft said. A complete crawl of the Cornell domain will occur two to three times a year, with the first one scheduled to take place within the next month, he said.

[Clip]

Cornell previously partnered with Archive-It in 2009 to provide nearly 80,000 free online books to the public, according to a press release by Cornell Libraries.

[Clip]

Kristine Hanna, Internet Archive’s director of archiving services, said that about 90 university libraries use the Archive-It service to collect and archive digital content.

Cornell’s archived web pages will be available publicly on Archive-It.org, giving people access to information that may no longer be available as a result of updates or removal of pages, Earle said.

We’ve been, and continue to be, major fans of the Archive-It service and the Internet Archive. Here are two reasons why.

1. As the article points out, the Cornell collection will be available on the web along with those from many other organizations (not only higher ed).

2. Unlike the Wayback Machine (also an essential tool from the Internet Archive), Archive-It collections are full-text searchable. Nice!

See Also: New: Japan Earthquake 2011 Web Archive
From the Internet Archive/Archive-It

Filed under: Associations and Organizations, Libraries, Resources

Tags: Archive-It, Cornell University, Digitized Archives & Libraries, Internet Archive, Web Archiving

About Gary Price

Gary Price (gprice@mediasourceinc.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006-2009 he was Director of Online Information Services at Ask.com, and is currently a contributing editor at Search Engine Land.

© 2022 Library Journal. All rights reserved.

