A study posted on bioRxiv found that text mining full articles yielded significantly better information than mining abstracts alone, as expected. However, the authors of this study described challenges in the way content was presented and in the need to obtain copyright permissions. Beyond content availability and license status, support for early adopters and training for future practitioners are also cited as barriers to broad use of TDM for research purposes.
The foundational value of CC BY licensing for TDM is that no additional permissions or documentation are required. Open Access facilitates TDM:
- not on a case-by-case basis, but for all people, in all places, and at all times
- without lengthy legal agreements or restrictions
- by providing unrestricted reuse, remix and mining rights
With more than 200,000 fully Open Access research articles available for content mining, PLOS can help advance the discussion and application of content mining through real-world experiences.
Through our API we provide article text and metadata in a single XML file that follows the Journal Article Tag Suite (JATS), the National Information Standards Organization (NISO) standard tag suite for archiving and exchanging journal article content.
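Because text and metadata arrive together in one JATS file, a mining script can pull both from the same parse. The following is a minimal sketch, not PLOS's own tooling: the `SAMPLE_JATS` string is a hypothetical, heavily trimmed document with a placeholder DOI, and the helper names `element_text` and `extract_fields` are assumptions made for illustration. It uses only Python's standard library and common JATS element names (`article-id`, `article-title`, `abstract`, `body`).

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed JATS-style document for illustration only.
# The DOI is a placeholder, not a real article identifier.
SAMPLE_JATS = """<article>
  <front>
    <article-meta>
      <article-id pub-id-type="doi">10.0000/example.0000001</article-id>
      <title-group>
        <article-title>An Example Article</article-title>
      </title-group>
      <abstract><p>Gene A regulates growth.</p></abstract>
    </article-meta>
  </front>
  <body>
    <sec><title>Results</title>
      <p>Gene A regulates growth. Gene B interacts with <italic>Gene A</italic>.</p>
    </sec>
  </body>
</article>"""


def element_text(elem):
    """Return all text inside elem, including nested inline tags,
    with runs of whitespace collapsed to single spaces."""
    if elem is None:
        return ""
    return " ".join("".join(elem.itertext()).split())


def extract_fields(xml_string):
    """Pull a few metadata fields and the full body text from a JATS string."""
    root = ET.fromstring(xml_string)
    return {
        "doi": element_text(root.find(".//article-id[@pub-id-type='doi']")),
        "title": element_text(root.find(".//article-title")),
        "abstract": element_text(root.find(".//abstract")),
        "body": element_text(root.find("body")),
    }


fields = extract_fields(SAMPLE_JATS)

# Toy comparison of abstract-only versus full-text mining:
# count mentions of a term in each part of the article.
abstract_hits = fields["abstract"].count("Gene")
body_hits = fields["body"].count("Gene")
print(fields["doi"], fields["title"])
print("abstract mentions:", abstract_hits, "| body mentions:", body_hits)
```

Real JATS files are far richer than this sample (namespaces, references, figures, supplementary material), so a production pipeline would need more careful handling. The sketch only illustrates why a single structured file makes it straightforward to compare what the abstract says with what the full text contains.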
Say Hello to allofPLOS: Public Library of Science Announces Unrestricted, No Conditions Text and Data Mining Access to Entire PLOS Corpus
Filed on November 28, 2017