Author
Listed:
- Lisa R Johnston
- Renata Curty
- Susan M Braxton
- Jake Carlson
- Hannah Hadley
- Sophia Lafferty-Hess
- Hoa Luong
- Jonathan L Petters
- Wendy A Kozlowski
Abstract
Data curators play an important role in assessing data quality and take actions that may ultimately lead to better, more valuable data products. This study explores the curation practices of data curators working within US-based data repositories. We performed a survey in January 2021 to benchmark the levels of curation performed by repositories and assess the perceived value and impact of curation on the data sharing process. Our analysis included 95 responses from 59 unique data repositories. Respondents were primarily professionals working within repositories and examined curation performed within a repository setting. A majority (72.6%) of respondents reported that “data-level” curation was performed by their repository, and around half reported that their repository took steps to ensure interoperability and reproducibility of its datasets. The curation actions most frequently reported include checking for duplicate files, reviewing documentation, reviewing metadata, minting persistent identifiers, and checking for corrupt/broken files. The most “value-add” curation action across generalist, institutional, and disciplinary repository respondents was related to reviewing and enhancing documentation. Respondents reported high perceived impact of curation by their repositories on specific data sharing outcomes including usability, findability, understandability, and accessibility of deposited datasets; respondents associated with disciplinary repositories tended to perceive higher impact on most outcomes. Most survey participants strongly agreed that data curation by the repository adds value to the data sharing process and that it outweighs the effort and cost. We found some differences between institutional and disciplinary repositories, both in the reported frequency of specific curation actions and in the perceived impact of data curation.
Interestingly, we also found variation in the perceptions of those working within the same repository regarding the level and frequency of curation actions performed, which exemplifies the complexity of repository curation work. Our results suggest that data curation may be better understood in terms of specific curation actions and outcomes than in terms of broadly defined curation levels, and that more research is needed to understand the resource implications of performing these activities. We share these results to provide a more nuanced view of curation and of how curation impacts the broader data lifecycle and data sharing behaviors.
Suggested Citation
Lisa R Johnston & Renata Curty & Susan M Braxton & Jake Carlson & Hannah Hadley & Sophia Lafferty-Hess & Hoa Luong & Jonathan L Petters & Wendy A Kozlowski, 2024.
"Understanding the value of curation: A survey of US data repository curation practices and perceptions,"
PLOS ONE, Public Library of Science, vol. 19(6), pages 1-23, June.
Handle:
RePEc:plo:pone00:0301171
DOI: 10.1371/journal.pone.0301171