May 29, 2022

Data Preservation: White Paper Urges New Approaches to Assure Access to Scientific Data

From the University of Michigan:

A newly released white paper calls for new approaches for preserving scientific data and sustainable funding of domain repositories—data archives with ties to specific scientific communities.

“Sustaining Domain Repositories for Digital Data: A White Paper” is the result of a meeting last summer that brought together representatives of 22 data repositories serving the social, natural and physical sciences. The meeting at the University of Michigan was organized by the Inter-university Consortium for Political and Social Research, part of the U-M Institute for Social Research.

Domain repositories accelerate intellectual discovery by facilitating data reuse and reproducibility. They leverage in-depth subject knowledge as well as expertise in data curation to make data accessible and meaningful to specific scientific communities.

However, domain repositories face an uncertain financial future in the United States, as funding remains unpredictable and inadequate, says ICPSR Director George Alter. Unlike our European competitors who support data archiving as necessary scientific infrastructure, the US does not assure the long-term viability of data archives, he says.


Five recommendations are offered to encourage data stewardship and support sustainable repositories:

  • Commit to sustaining institutions that assure the long-term preservation and viability of research data
  • Promote cooperation among funding agencies, universities, domain repositories, journals and other stakeholders
  • Support the human and organizational infrastructure for data stewardship as well as the hardware
  • Establish review criteria appropriate for data repositories
  • Provide incentives to principal investigators to archive data

Two Paragraphs From the Report

On the Workforce

There is a shortage of qualified people to work at repositories, particularly in developing data management systems, defining data models, and programming. Repositories need a balance between people who understand curation, the technology, and the science. Moreover, there is a lot of workforce turnover, so substantial effort is going into training and retraining; once people are trained they become desirable in the market and there is competition for them. This leads to a lack of continuity and a loss of institutional knowledge.

In terms of funding, there is a mismatch between budget levels and the salaries required to retain good people.

[Emphasis Ours] There is also a mismatch between job classifications like archivist and librarian, and the actual demands of the data world. The implication is that repositories cannot operate effectively on soft money; there needs to be an underlying sustainable funding base.

On Institutional Repositories 

Institutional repositories (IRs) can be beneficial in that they are permanent; libraries have invested in them and they are established institutions, so the data they hold can be expected to be supported long-term. They can also get data that would otherwise be lost. IRs sometimes collaborate with research teams and are a source for gathering information during active creation. The tremendous diversity of data held in IRs means that metadata is not as rich as in a domain repository, thus compromising discoverability and interoperability, and most IRs lack the capacity to migrate data to new formats as software changes. Data curation experience, expertise, and capacity at IRs are also limited.

Direct to Full Text White Paper (16 pages; PDF)

About Gary Price

Gary Price is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006 to 2009 he was a Director of Online Information Services, and he is currently a contributing editor at Search Engine Land.