Note: After we use Google Dataset Search for a few weeks (it’s launching today), we plan to share our thoughts in an update to this post. Stay tuned.
From the Google Blog:
…Dataset Search lets you find datasets wherever they’re hosted, whether it’s a publisher’s site, a digital library, or an author’s personal web page.
To create Dataset Search, we developed guidelines for dataset providers to describe their data in a way that Google (and other search engines) can better understand the content of their pages. These guidelines include salient information about datasets: who created the dataset, when it was published, how the data was collected, what the terms are for using the data, etc.
We then collect and link this information, analyze where different versions of the same dataset might be, and find publications that may be describing or discussing the dataset.
Our approach is based on an open standard for describing this information (schema.org) and anybody who publishes data can describe their dataset this way.
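As an illustration (ours, not part of Google's announcement), here is a minimal sketch of what such a description might look like for a publisher. It builds a schema.org `Dataset` record as JSON-LD and wraps it in a script tag for embedding in a dataset's landing page. The property names (`creator`, `datePublished`, `license`, `measurementTechnique`) come from the schema.org vocabulary. The helper functions, the dataset, the organization, and every URL are hypothetical placeholders, and Google's own guidelines may ask for more or different fields.

```python
import json


def build_dataset_jsonld(name, description, creator, date_published,
                         license_url, collection_method, url):
    """Return a schema.org Dataset description as a plain dict (hypothetical helper)."""
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        # Who created the dataset
        "creator": {"@type": "Organization", "name": creator},
        # When it was published (ISO 8601 date)
        "datePublished": date_published,
        # Terms for using the data
        "license": license_url,
        # How the data was collected
        "measurementTechnique": collection_method,
    }


def to_script_tag(metadata):
    """Wrap the metadata in a JSON-LD script tag for embedding in an HTML page."""
    body = json.dumps(metadata, indent=2)
    return '<script type="application/ld+json">\n' + body + '\n</script>'


# All values below are made up for illustration.
example = build_dataset_jsonld(
    name="Public Library Branch Visits",
    description="Monthly visit counts per library branch.",
    creator="Example Library Consortium",
    date_published="2018-09-05",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    collection_method="Door counters at each branch entrance",
    url="https://example.org/datasets/library-visits",
)

snippet = to_script_tag(example)
print(snippet)
```

The printed snippet would be pasted into the HTML of the page that hosts the dataset, which is where a crawler that understands schema.org markup could pick it up.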
In this new release, you can find references to most datasets in environmental and social sciences, as well as data from other disciplines including government data and data provided by news organizations, such as ProPublica.
Dataset Search works in multiple languages with support for additional languages coming soon. Simply enter what you are looking for and we will help guide you to the published dataset on the repository provider’s site.
This launch is one of a series of initiatives to bring datasets more prominently into our products. We recently made it easier to discover tabular data in Search, which uses this same metadata along with the linked tabular data to provide answers to queries directly in search results.
Example Search: Libraries
Direct to Google Dataset Search