May 18, 2022

Open Data: Basic Metadata for the Museum of Modern Art (MOMA) Collection Now Public Domain With CC0 License

From a Blog Post by MOMA’s Fiona Romeo:

MoMA accessioned the Creative Commons License Symbol into its collection in March 2015 and it’s now on display in our design galleries as part of the exhibition This Is for Everyone: Design Experiments for the Common Good.


It therefore feels important that we just flipped our own default and shared data for more than 125,000 works from MoMA’s collection on GitHub using Creative Commons Zero (CC0).


MoMA’s open data is primarily intended to be useful to scholars, so it was important to make each version citable. Arfon Smith (@arfon), co-founder of the Zooniverse and a former collaborator of mine, is now leading GitHub’s engagement with the academic community. He shared this useful guide to producing citable code on GitHub using Zenodo. A Digital Object Identifier (DOI) is automatically created for every MoMA data release, and the data is also archived to the cloud infrastructure used by CERN’s Large Hadron Collider.

Learn More: Read the Complete Blog Post

Additional Info About the MOMA Data (via Github)

MoMA is committed to helping everyone understand, enjoy, and use our collection. The Museum’s website features almost 60,000 artworks from nearly 10,000 artists. This research dataset contains more than 120,000 records, representing all of the works that have been accessioned into MoMA’s collection and cataloged in our database. It includes basic metadata for each work, including title, artist, date made, medium, dimensions, and date acquired by the Museum. Some of these records have incomplete information and are noted as “not Curator Approved.”

At this time, the data is available in CSV format, encoded in UTF-8. While UTF-8 is the standard for multilingual character encodings, it is not correctly interpreted by Excel on a Mac. Users of Excel on a Mac can convert the UTF-8 to UTF-16 so the file can be imported correctly.

Hat Tip and Thanks: Matt R. Weaver

About Gary Price

Gary Price ( is a librarian, writer, consultant, and frequent conference speaker based in the Washington D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006-2009 he was Director of Online Information Services at, and is currently a contributing editor at Search Engine Land.