The Conundrum of Sharing Research Data (Preprint)
University of California, Los Angeles
Journal of the American Society for Information Science and Technology
63(6): 1059-1078, 2012
New methods and instrumentation are producing an unprecedented deluge of research data. The volume of data, combined with new means of distribution and mining, has excited funding agencies, policy makers, and the general public with promises of discovery and innovation. Many stakeholders now expect data to be released, yet sharing is common in only a few fields such as astronomy and genomics. In other fields, some researchers share their data routinely, others never share their data, and most appear willing to share some of their data some of the time. Data sharing is thus a conundrum – “an intricate and difficult problem.” Research data take many forms, are collected for many purposes using many approaches, and are often difficult to interpret once removed from their initial context. An analysis of data types and practices is illustrated with examples from the sciences, social sciences, and humanities. Four rationales for sharing data are examined: (1) to reproduce or to verify research, (2) to make results of publicly funded research available to the public, (3) to enable others to ask new questions of extant data, and (4) to advance the state of research and innovation. These rationales differ by the arguments for sharing, by beneficiaries, and by the motivations and incentives of the many stakeholders involved. The challenges are to understand which data might be shared, by whom, with whom, under what conditions, why, and to what effects. Answers to these questions will inform data policy and practice.