January 4, 2013 by Gary Price

The Library of Congress Posts Update and Releases Report About What’s Going On With Their Twitter Archive

Update: Digital preservation expert and LOCKSS founder Dr. David Rosenthal offers some analysis of the amount of data the archive contains. Hat Tip: @lorcand
—
The Library of Congress is out with a blog post and white paper (embedded below) that provide info about the complete archive of tweets that Twitter donated to the Library of Congress.
The donation was first announced on April 15, 2010 in blog posts by LC and Twitter.
Since then, LC has remained very quiet about how the Twitter archive might be used and whether it would be available to the public, either online or in person at LC.
While LC officials did make comments from time to time, almost no new details emerged, although we asked…a lot. We never understood (and still don’t) why LC has been so tight-lipped about this project.
One thing we did learn was that a Boulder, CO company named Gnip was working with LC to build the archive. By the way, Gnip also provides exclusive (fee-based) access to every publicly available tweet back to 2006.

Today’s Update

Today, almost 1,000 days after the donation was first announced, LC’s Director of Communications, Gayle Osterberg, has written a blog post with an update about LC’s Twitter archive.

Key Points from the Blog Post

  • Archive of tweets from 2006-2010 now complete.
  • The full archive now contains about 170 billion tweets.
  • “The volume of tweets the Library receives each day has grown from 140 million beginning in February 2011 to nearly half a billion tweets each day as of October 2012.”
  • LC’s focus is now, “addressing the significant technology challenges to making the archive accessible to researchers in a comprehensive, useful way.”
  • Getting this done is a priority for LC.
  • LC has received more than 400 requests from researchers to use the archive.

It’s good to learn some new details about how the project is going.
However, the post and report lack specifics about:

  • Access to the archive (Who will be able to access? How will the process work?)
  • A preliminary/tentative timeline for when this access might become available. Later this year? Next year?
  • Details about the technology that will be used to search and organize tweets.
  • We did learn when the project launched that the Computational Approaches to Digital Stewardship partnership between Stanford and LC might be involved. Are they? Were they?
  • Why LC has been so quiet about how the project was developing.

The Washington Post has a story about the Twitter archive that includes several interesting details (not included in the LC document) that help answer some of the questions listed above. The article includes several quotes from Deputy Librarian of Congress Robert Dizard that make it sound like access for researchers will not be available anytime soon.
See: “Library of Congress has archive of tweets, but no plan for its public display.”

On the Data LC Has Now Archived

“It’s pretty raw,” [Deputy Librarian of Congress Robert] Dizard said. “You often hear a reference to Twitter as a fire hose, that constant stream of tweets going around the world. What we have here is a large and growing lake. What we need is the technology that allows us to both understand and make useful that lake of information.”

On Access

For now, giving researchers access to the archive remains cost-prohibitive for the cash-strapped library, which has spent tens of thousands of dollars on the project so far, Dizard says.
“We know from the testing we’ve done with even small parts of the data that we are not going to be able to, on our own, provide really useful access at a cost that is reasonable for us,” Dizard said. “For even just the 2006 to 2010 [portion of the] archive, which is about 21 billion tweets, just to do one search could take 24 hours using our existing servers.”

Future Plans

The eventual plan is to make the collection available only within the Library of Congress reading rooms. Requiring an in-person visit to search a database of material that originated online may seem incongruous, but Dizard says it’s a condition of the deal with Twitter, which gifted the archive, so that the library won’t be “competing with the commercial sector.”

Finally, here’s the complete white paper that LC made available today. The section titled “The Library of Congress Agreement with Twitter” includes details that had not been made public until now, although we asked LC several times back when the archive project was first announced.
Update on the Twitter Archive At The Library of Congress

Filed under: Data Files, Digital Preservation, Journal Articles, Libraries, News, Preservation

Archives, lj, social media, Twitter

About Gary Price

Gary Price (gprice@mediasourceinc.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006-2009 he was Director of Online Information Services at Ask.com, and is currently a contributing editor at Search Engine Land.

© 2022 Library Journal. All rights reserved.

