May 27, 2022

New Report: “What to Keep: A Jisc Research Data Study”

From an Introductory Blog Post by Caroline Ingram (via JISC):

To keep or not to keep? That is the question posed in a new report produced by Neil Beagrie, What to Keep: A Jisc research data study. With growing volumes and diversity of research data, the issue of what to keep has been growing in significance.

Who is the Study For?

The research carried out in this small-scale study has produced insights that will be of value across university research and research management teams, as well as for funders and publishers. The report outlines use cases for retention of data, as well as suggestions for improvement.

The conclusion is that it is essential to consider not only what data to keep and why, but also where to keep it, how to keep it, and for how long.


The study makes 10 recommendations alongside potential implementations in an attempt to move a substantial problem forward. Seven case studies have also been prepared to illustrate the approaches and rationale for what to keep in various scenarios.

Key Findings (via Executive Summary)

• The two major use cases and drivers for what to keep are Research Integrity and Reproducibility (availability of the data supporting the findings in research); and the Potential for Reuse (availability of data for sharing with other users)

• These two major use cases for keeping data may influence differently what is kept, for how long, where, and how it is kept and made accessible. Although these use cases can overlap, it is important to recognise that they are distinct and that they may even require distinct data types to be kept to support them (see Case study 8.4: Research Integrity and Reproducibility, for which the raw data and pipeline are needed; and Reuse, for which only the end result/structure is useful)

• Not all research data is the same: it is highly varied in terms of data level; data type; and origin. In addition, not all disciplines are in the same place or have identical needs. Different disciplines are at different evolutionary stages and will not necessarily have identical use cases for what to keep or derive the same value from them

• Whilst some broad generic principles can apply to what to keep, there will always be specific disciplinary and sub-disciplinary differences

• In the interviews and case studies there was interest in evolving disciplinary norms especially where these are currently not well defined; in what is transferable in terms of effective practice between disciplines; and in harmonisation of funder requirements where relevant

• Research grant terms and other legal requirements (e.g. for clinical trials data) can specify a minimum term for which research data must be kept, and at a basic level that sets one simple retention criterion. However, as these dates begin to expire, an increasing number of datasets will need review, and potentially more complex appraisal decisions will have to be made on whether they are retained

• The comparison of appraisal and selection criteria from What to Keep and existing checklist criteria suggests that a broad consensus has emerged around the key high-level generic criteria that are useful for what to keep. These broadly-agreed generic criteria are now being applied in multiple domains. This suggests there are examples of effective practice in these checklists that can be promoted to others

Read the Complete Blog Post

Direct to Full Text Report: What to Keep: A Jisc Research Data Study
64 pages; PDF.

About Gary Price

Gary Price is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006-2009 he was Director of Online Information Services at, and is currently a contributing editor at Search Engine Land.