January 20, 2022

New Report from JISC: "Value and Benefits of Text Mining"

JISC has published a new report. Its focus is the UK (JISC is a UK organization), but we think it will be of interest and value to anyone interested in the topic, wherever they work or live.

From a News Release/Summary:

A new JISC report shows that text mining – a complex and innovative method of searching and analysing data – has huge potential benefits for the UK economy and knowledge base, but its use is being held back by copyright law and other barriers.

Sir Mark Walport, the director of the Wellcome Trust, said at a related event last night: “This is a complete no-brainer. This is scholarly research funded from the public purse, largely from taxpayer and philanthropic organisations. The taxpayer has the right to have maximum benefit extracted and that will only happen if there is maximum access to it.”

Text mining draws on data analysis techniques such as natural language processing and information extraction to find new knowledge and meaningful patterns within large collections.


The report identifies a number of barriers that we need to overcome to make best use of text mining tools in the future. Firstly, text mining is a complex technical process that requires skilled staff; secondly it requires unrestricted access to information sources; thirdly copyright can be a barrier.

The report authors conclude that more work needs to be undertaken to raise awareness of the potential benefits and value of text mining to UK further and higher education.

From the Introduction of the Report:

The global research community generates over 1.5 million new scholarly articles per annum. As the recent Hargreaves report into ‘Digital Opportunity: A Review of Intellectual Property and Growth’ highlighted, text mining and analytics of this scholarly literature and other digitised text affords a real opportunity to support innovation and the development of new knowledge. However, current UK copyright laws are restricting this use of text mining. To remedy this, Hargreaves proposes an exception to support text mining and analytics for non-commercial research.

In order to be ‘mined’, text must be accessed, copied, analysed, annotated and related to existing information and understanding. Even if the user has access rights to the material, making annotated copies can be illegal under current copyright law without the permission of the copyright holder.
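To make the steps named above concrete (access, copy, analyse, annotate, relate), here is a minimal Python sketch. It is our illustration, not something taken from the report: the two-document corpus, the `KNOWN_TERMS` lookup table, and the function names `tokenize`, `annotate`, and `mine` are all hypothetical, and real text mining systems use far richer language processing than this word matching.

```python
import re
from collections import Counter

# Hypothetical two-document corpus standing in for "accessed" full text.
CORPUS = {
    "doc1": "Aspirin reduces fever. Aspirin may also reduce inflammation.",
    "doc2": "Ibuprofen reduces inflammation and fever.",
}

# Hypothetical "existing information": terms already known and their category.
KNOWN_TERMS = {
    "aspirin": "drug",
    "ibuprofen": "drug",
    "fever": "symptom",
    "inflammation": "symptom",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def annotate(doc_id, text):
    """Return (doc_id, term, category) for each known term found in the text."""
    return [
        (doc_id, tok, KNOWN_TERMS[tok])
        for tok in tokenize(text)
        if tok in KNOWN_TERMS
    ]


def mine(corpus):
    """Copy the corpus, annotate it, and count drug/symptom co-occurrences."""
    working_copy = dict(corpus)  # the copying step that copyright law touches
    annotations = []
    pairs = Counter()
    for doc_id, text in working_copy.items():
        found = annotate(doc_id, text)
        annotations.extend(found)
        drugs = {term for _, term, cat in found if cat == "drug"}
        symptoms = {term for _, term, cat in found if cat == "symptom"}
        # Relate the annotations: which drugs are mentioned with which symptoms?
        for drug in drugs:
            for symptom in symptoms:
                pairs[(drug, symptom)] += 1
    return annotations, pairs


annotations, pairs = mine(CORPUS)
for (drug, symptom), count in sorted(pairs.items()):
    print(drug, symptom, count)
```

Even in this toy version, the program has to hold a working copy of the text and produce an annotated derivative of it, which is the step the report says can require the copyright holder's permission.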

Direct to Full Text Report (HTML) ||| PDF Version (32 pages; PDF)

About Gary Price

Gary Price (gprice@mediasourceinc.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006 to 2009 he was Director of Online Information Services at Ask.com, and he is currently a contributing editor at Search Engine Land.