New Journal Article: “Assessing Research Data Deposits and Usage Statistics within IDEALS”
The following article was just published online by the Journal of eScience Librarianship.
Assessing Research Data Deposits and Usage Statistics within IDEALS
Christie A. Wiley
University of Illinois at Urbana-Champaign
Journal of eScience Librarianship
Objectives: This study follows up on previous work that began examining data deposited in an institutional repository. The work here extends the earlier study by answering the following lines of research questions: (1) What is the file composition of datasets ingested into the (institution blinded for review) campus repository? Are datasets more likely to be single-file or multiple-file items? (2) What is the usage data associated with these datasets? Which items are most popular?
Methods: The dataset records collected in this study were identified by filtering item types categorized as “data” or “dataset” using the advanced search function in IDEALS. Returned search results were collected in an Excel spreadsheet to include data such as the Handle identifier, date ingested, file formats, composition code, and the download count from the item’s statistics report. The Handle identifier represents the dataset record’s persistent identifier. Composition represents codes that categorize items as single or multiple file deposits. Date available represents the date the dataset record was published in the campus repository. Download statistics were collected via a website link for each dataset record and indicates the number of times the dataset record has been downloaded. Once the data was collected, it was used to evaluate datasets deposited into IDEALS.
Results: A total of 522 datasets were identified for analysis covering the period between January 2007 and August 2016. This study revealed two influxes occurring during the period of 2008-2009 and in 2014. During the first timeframe a large number of PDFs were deposited by the Illinois Department of Agriculture. Whereas, Microsoft Excel files were deposited in 2014 by the Rare Books and Manuscript Library. Single-file datasets clearly dominate the deposits in the campus repository. The total download count for all datasets was 139,663 and the average downloads per month per file across all datasets averaged 3.2.
Conclusion: Academic librarians, repository managers, and research data services staff can use the results presented here to anticipate the nature of research data that may be deposited within institutional repositories. With increased awareness, content recruitment, and improvements, IRs can provide a viable cyberinfrastructure for researchers to deposit data, but much can be learned from the data already deposited. Awareness of trends can help librarians facilitate discussions with researchers about research data deposits as well as better tailor their services to address short-term and long-term research needs.
Direct to Full Text Article (13 pages; PDF)
About Gary Price
Gary Price (firstname.lastname@example.org) is a librarian, writer, consultant, and frequent conference speaker based in the Washington D.C. metro area. He earned his MLIS degree from Wayne State University in Detroit. Price has won several awards including the SLA Innovations in Technology Award and Alumnus of the Year from the Wayne St. University Library and Information Science Program. From 2006-2009 he was Director of Online Information Services at Ask.com. Gary is also the co-founder of infoDJ an innovation research consultancy supporting corporate product and business model teams with just-in-time fact and insight finding.