
November 3, 2016 by Gary Price

SherlockNet: British Library Flickr Commons Image Dataset UPDATED With More Tags and Thousands of Captions

From a British Library “Digital Scholarship” Blog Post:

We have some exciting updates regarding SherlockNet, our ongoing effort to use machine learning techniques to radically improve the discoverability of the British Library Flickr Commons image dataset.
Over the past 2 months we’ve been working on expanding and refining the set of tags assigned to each image. Initially, we set out simply to assign the images to one of 11 categories, which worked surprisingly well with less than a 20% error rate. But we realized that people usually search from a much larger set of words, and we spent a lot of time thinking about how we would assign more descriptive tags to each image.
[Clip]
For the past few weeks we’ve been working on the incorporation of ~20 million tags and related images and uploading them onto our website. Luckily Amazon Web Services provides comprehensive computing resources to take care of storing and transferring our data into databases to be queried by the front-end.
To make searching easier, we’ve also added functionality to automatically include synonyms in your search. For example, you can type in “lady”, click on Synonym Search, and it adds “gentlewoman”, “ma’am”, “madam”, “noblewoman”, and “peeress” to your search as well. This is particularly useful in a tag-based indexing approach such as the one we are using.
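The synonym expansion described above can be sketched in a few lines of Python. This is only an illustration, not SherlockNet's actual implementation: the post does not say where its synonyms come from, so the `SYNONYMS` table and the `expand_query` function are hypothetical names, and the table is hard-coded with the one example given in the post.

```python
# Hypothetical synonym table. The post does not say where SherlockNet's
# synonyms come from, so this holds only the example it quotes.
SYNONYMS = {
    "lady": ["gentlewoman", "ma'am", "madam", "noblewoman", "peeress"],
}


def expand_query(terms, synonyms=SYNONYMS):
    """Return the search terms plus any known synonyms.

    Terms are lower-cased, order is preserved, and duplicates are dropped.
    """
    expanded = []
    for term in terms:
        key = term.lower()
        # The original term comes first, followed by its synonyms (if any).
        for candidate in [key, *synonyms.get(key, [])]:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


if __name__ == "__main__":
    print(expand_query(["lady"]))
```

Each expanded term would then be matched against the image tags, so a search for "lady" could also return images tagged only with "madam".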
As our data gets uploaded over the next few days, you should begin to see our generated tags and related images show up on the website. You can click on each image to view it in more detail, or on each tag to re-query the website for that particular tag. This way users can easily browse relevant images or tags to find what they are interested in.
[Clip]
We will also be working on adding more advanced search capabilities via wrapper calls to the Flickr API. Proposed functionality will include logical AND and NOT operators, as well as better filtering by machine tags.
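To make the proposed AND and NOT operators concrete, here is a minimal in-memory sketch in Python. It does not call the Flickr API and does not reflect how SherlockNet plans to implement the feature. The `filter_by_tags` function, its `require` and `exclude` parameters, and the sample image IDs and tags are all hypothetical.

```python
def filter_by_tags(images, require=(), exclude=()):
    """Return IDs of images that carry every tag in `require` (logical AND)
    and none of the tags in `exclude` (logical NOT).

    `images` maps an image ID to an iterable of tag strings.
    """
    required = set(require)
    excluded = set(exclude)
    matches = []
    for image_id, tags in images.items():
        tag_set = set(tags)
        # AND: all required tags present. NOT: no excluded tag present.
        if required <= tag_set and not (excluded & tag_set):
            matches.append(image_id)
    return matches


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = {
        "img-1": ["map", "river", "city"],
        "img-2": ["map", "mountain"],
        "img-3": ["portrait", "lady"],
    }
    print(filter_by_tags(sample, require=["map"], exclude=["mountain"]))
```

With the sample data, requiring "map" and excluding "mountain" keeps only the first image.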
As mentioned in our previous post, we have been experimenting with techniques to automatically caption images with relevant natural language captions. Since an Artificial Intelligence (AI) is responsible for recognising, understanding, and learning proper language models for captions, we expected the task to be far harder than that of tagging, and although the final results we obtained may not be ready for production-level archival purposes, we hope our work can help spark further research in this field.

Learn MUCH More About the Update to SherlockNet
Direct to SherlockNet Public Search Interface
See Also: SherlockNet: tagging and captioning the British Library’s Flickr images (August 22, 2016)
See Also: Research Paper: “SherlockNet: Exploring 400 Years of Western Book Illustrations With Convolutional Neural Networks”

Filed under: Data Files, Journal Articles, Libraries, News, Patrons and Users

About Gary Price

Gary Price (gprice@mediasourceinc.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006-2009 he was Director of Online Information Services at Ask.com, and is currently a contributing editor at Search Engine Land.


© 2022 Library Journal. All rights reserved.

