May 17, 2022

Say Hello to allofPLOS: Public Library of Science Announces Unrestricted, No Conditions Text and Data Mining Access to Entire PLOS Corpus

From an Official PLOS Blog Post by Sheryl Denker:

A study posted on bioRxiv found that text mining full articles gave significantly better information than mining abstracts only, as expected. However, the authors of this study described challenges in the way content was presented and in the need to obtain copyright permissions. In addition to content availability and license status, support for early adopters and training for future practitioners are also cited as barriers to broad use of TDM for research purposes.

Source: Slogan on PLOS Text and Data Mining Page (Nov. 28, 2017)

The foundational value of CC BY licensing for TDM is that no additional permissions or documentation are required. Open Access facilitates TDM:

  • not on a case-by-case basis, but for all people, in all places, and at all times
  • without lengthy legal agreements or restrictions
  • by providing unrestricted reuse, remix and mining rights

With more than 200,000 fully Open Access research articles available for content mining, PLOS can help advance the discussion and application of content mining through real-world experiences.

Through our API we provide article text and metadata in a single XML file, formatted according to the Journal Article Tag Suite (JATS), the National Information Standards Organization (NISO) standard tag suite for archiving and exchanging journal article content.
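To give a feel for what working with JATS looks like, here is a minimal Python sketch that pulls the DOI, title, abstract, and body paragraphs out of a JATS-style document using only the standard library. The XML below is a hypothetical, heavily trimmed sample written for illustration, not a real PLOS article, and the helper names (text_of, parse_jats) are made up for this example. Real files carry many more elements, so treat this as a starting point rather than a complete parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed JATS-style document (not a real PLOS article).
SAMPLE_JATS = """<article>
  <front>
    <article-meta>
      <article-id pub-id-type="doi">10.0000/example.0000001</article-id>
      <title-group>
        <article-title>An <italic>example</italic> article</article-title>
      </title-group>
      <abstract><p>Short summary of the work.</p></abstract>
    </article-meta>
  </front>
  <body>
    <sec><title>Introduction</title><p>First paragraph of the body.</p></sec>
    <sec><title>Results</title><p>Second paragraph, with <bold>markup</bold>.</p></sec>
  </body>
</article>"""


def text_of(element):
    """Return all text inside an element, flattening inline markup
    such as <italic> and collapsing runs of whitespace."""
    if element is None:
        return ""
    return " ".join("".join(element.itertext()).split())


def parse_jats(xml_string):
    """Extract DOI, title, abstract, and body paragraphs from a JATS string.

    Assumes the document has a <front>/<article-meta> block, as in the sample.
    """
    root = ET.fromstring(xml_string)
    meta = root.find("./front/article-meta")
    return {
        "doi": text_of(meta.find("article-id[@pub-id-type='doi']")),
        "title": text_of(meta.find("title-group/article-title")),
        "abstract": text_of(meta.find("abstract")),
        # Every <p> anywhere under <body>, in document order.
        "body": [text_of(p) for p in root.findall("./body//p")],
    }


record = parse_jats(SAMPLE_JATS)
print(record["doi"])
print(record["title"])
print(len(record["body"]), "body paragraphs")
```

Keeping the abstract and the body paragraphs in separate fields makes it easy to compare abstract-only mining against full-text mining, which is the distinction the bioRxiv study above draws.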

Learn More, Read the Complete Blog Post

Direct to allofPLOS Web Site ||| Download Complete Corpus (4.5 GB via Google Drive)

About Gary Price

Gary Price is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006-2009 he was Director of Online Information Services at, and is currently a contributing editor at Search Engine Land.