SUBSCRIBE
SUBSCRIBE
EXPLORE +
  • About infoDOCKET
  • Academic Libraries on LJ
  • Research on LJ
  • News on LJ
  • Skip to primary navigation
  • Skip to main content
  • Skip to primary sidebar
  • Libraries
    • Academic Libraries
    • Government Libraries
    • National Libraries
    • Public Libraries
  • Companies (Publishers/Vendors)
    • EBSCO
    • Elsevier
    • Ex Libris
    • Frontiers
    • Gale
    • PLOS
    • Scholastic
  • New Resources
    • Dashboards
    • Data Files
    • Digital Collections
    • Digital Preservation
    • Interactive Tools
    • Maps
    • Other
    • Podcasts
    • Productivity
  • New Research
    • Conference Presentations
    • Journal Articles
    • Lecture
    • New Issue
    • Reports
  • Topics
    • Archives & Special Collections
    • Associations & Organizations
    • Awards
    • Funding
    • Interviews
    • Jobs
    • Management & Leadership
    • News
    • Patrons & Users
    • Preservation
    • Profiles
    • Publishing
    • Roundup
    • Scholarly Communications
      • Open Access

September 19, 2012 by Gary Price

Wikimedia/Wikipedia Begins Providing Access to Anonymized Search Log Files

September 19, 2012 by Gary Price

UPDATE (9/20): Wikipedia Releases Search Data To Public But Pulls It After Privacy Concerns (via Search Engine Land)
From a Wikimedia Blog Post by Diederik van Liere:

I am very happy to announce the availability of anonymous search log files for Wikipedia and its sister projects, as of today. Collecting data about search queries is important for at least three reasons:
+ it provides valuable feedback to our editor community, who can use it to detect topics of interest that are currently insufficiently covered.
+ we can improve our search index by benchmarking improvements against real queries.
+ we give outside researchers the opportunity to discover gems in the data.
[Clip]
Every day from today, we will publish the search queries for the previous day at: http://dumps.wikimedia.org/other/search/ (we expect to have a 3 month rolling window of search data available).
Each line in the log files is tab separated and it contains the following fields:
Server hostname
Timestamp (UTC)
Wikimedia project
URL encoded search query
Total number of results
Lucene score of best match
Interwiki result
Namespace (coded as integer)
Namespace (human-readable)
Title of best matching article
[Clip]
We are making this data available under a CC0 license: this means that you can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. But we do appreciate it if you cite us when you use this data source for your research, experimentation or product development.

Read the Complete Blog Post

Coverage

Wikimedia releases anonymous search log files for Wikipedia (via The Next Web)

Filed under: Data Files, News

SHARE:

About Gary Price

Gary Price (gprice@gmail.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington D.C. metro area. He earned his MLIS degree from Wayne State University in Detroit. Price has won several awards including the SLA Innovations in Technology Award and Alumnus of the Year from the Wayne St. University Library and Information Science Program. From 2006-2009 he was Director of Online Information Services at Ask.com.

ADVERTISEMENT

Archives

Job Zone

ADVERTISEMENT

Related Infodocket Posts

ADVERTISEMENT

FOLLOW US ON X

Tweets by infoDOCKET

ADVERTISEMENT

This coverage is free for all visitors. Your support makes this possible.

This coverage is free for all visitors. Your support makes this possible.

Primary Sidebar

  • News
  • Reviews+
  • Technology
  • Programs+
  • Design
  • Leadership
  • People
  • COVID-19
  • Advocacy
  • Opinion
  • INFOdocket
  • Job Zone

Reviews+

  • Booklists
  • Prepub Alert
  • Book Pulse
  • Media
  • Readers' Advisory
  • Self-Published Books
  • Review Submissions
  • Review for LJ

Awards

  • Library of the Year
  • Librarian of the Year
  • Movers & Shakers 2022
  • Paralibrarian of the Year
  • Best Small Library
  • Marketer of the Year
  • All Awards Guidelines
  • Community Impact Prize

Resources

  • LJ Index/Star Libraries
  • Research
  • White Papers / Case Studies

Events & PD

  • Online Courses
  • In-Person Events
  • Virtual Events
  • Webcasts
  • About Us
  • Contact Us
  • Advertise
  • Subscribe
  • Media Inquiries
  • Newsletter Sign Up
  • Submit Features/News
  • Data Privacy
  • Terms of Use
  • Terms of Sale
  • FAQs
  • Careers at MSI


© 2026 Library Journal. All rights reserved.


© 2022 Library Journal. All rights reserved.