May 23, 2022

Scholarly Publishing: CrossRef Set to Launch a Text and Data Mining Service Named Prospect in Early 2014

In the new issue of CrossRef Quarterly, CrossRef’s Executive Director, Ed Pentz, writes:

CrossRef’s text and data mining service was just approved by the board at its November meeting. [Clip] The service is scheduled to roll out in early 2014. There are two main aspects of the service:

1) a common text and data mining (TDM) API that enables researchers to request full text content from publisher sites in a standard way and

2) a terms and conditions library for those publishers who want to ask for researchers to agree to additional TDM terms. The terms and conditions library isn’t required for OA content or where publishers allow TDM by researchers at subscribing institutions.
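As a rough illustration of what requesting full text "in a standard way" could look like from a researcher's side, here is a minimal Python sketch. It assumes a metadata record in which each full-text link carries a label saying what it is intended for. The field names (`link`, `URL`, `content-type`, `intended-application`), the helper `text_mining_links`, and the sample record are assumptions made for this example; none of them are described in the announcement above, and the DOI and URLs are placeholders.

```python
from typing import Any, Dict, List


def text_mining_links(work: Dict[str, Any]) -> List[str]:
    """Return the full-text URLs in one metadata record that are flagged for text mining."""
    links = work.get("link", [])
    return [
        entry["URL"]
        for entry in links
        if entry.get("intended-application") == "text-mining" and "URL" in entry
    ]


# Hypothetical record: the DOI and URLs are placeholders, not real content.
sample_work = {
    "DOI": "10.5555/example",
    "link": [
        {
            "URL": "https://publisher.example/fulltext/example.xml",
            "content-type": "application/xml",
            "intended-application": "text-mining",
        },
        {
            "URL": "https://publisher.example/article/example",
            "content-type": "text/html",
            "intended-application": "similarity-checking",
        },
    ],
}

print(text_mining_links(sample_work))
```

The point of a common representation like this is that the same few lines would work for any participating publisher, with no publisher-specific scraping code. Whether a given URL can actually be fetched would still depend on the access and terms described in the second aspect of the service.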

Pentz also points out that more information is now available on this page, including an overview and resources for publishers and researchers.

For Publishers

For Researchers

Why Prospect?

From the Overview:

  • Researchers are increasingly interested in text and data mining (TDM) published scholarly content. This poses technical and logistical problems for scholarly researchers and publishers alike.
  • All parties would benefit from support of standard APIs and data representations in order to enable TDM across both open access and subscription-based publishers.
  • Researchers find it impractical to negotiate multiple bilateral agreements with subscription-based publishers in order to get authorisation to TDM subscribed content.
  • Subscription-based publishers find it impractical to negotiate multiple bilateral agreements with researchers and institutions in order to authorise TDM of subscribed content.

See Also: Full Text of CrossRef Quarterly (December 2013)
The issue includes the latest CrossRef Dashboard with a number of statistics for the first three quarters of 2013. It includes this summary:

The number of queries (references sent to CrossRef to find the DOI to create permanent links to the content) for first three quarters of 2013 were at 684 million and went down by 15% from the same period last year. The number of references submitted that matched a DOI in the CrossRef System increased to 309 million from 285 (in the same period of 2012). Total deposits increased, with current increasing by 34% and backfile slightly decreasing. The number of members and the amount of content in the system continues to grow at a steady rate.

About Gary Price

Gary Price is a librarian, writer, consultant, and frequent conference speaker based in the Washington D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006 to 2009 he was Director of Online Information Services at, and is currently a contributing editor at Search Engine Land.