
Daily life in the Open Biologist’s second job, as a Data Curator

Author

Listed:
  • Livia C.T. Scorza

    (The University of Edinburgh)

  • Tomasz Zieliński

    (The University of Edinburgh)

  • Irina Kalita

    (The University of Edinburgh)

  • Alessia Lepore

    (The University of Edinburgh, LOB - Laboratoire d'Optique et Biosciences - X - École polytechnique - IP Paris - Institut Polytechnique de Paris - INSERM - Institut National de la Santé et de la Recherche Médicale - CNRS - Centre National de la Recherche Scientifique)

  • Meriem El Karoui

    (The University of Edinburgh, LBPA - Laboratoire de biologie et pharmacologie appliquée - Université Paris-Saclay - CNRS - Centre National de la Recherche Scientifique - ENS Paris Saclay - Ecole Normale Supérieure Paris-Saclay)

  • Andrew J Millar

    (The University of Edinburgh)

Abstract

Background: Data reusability is the driving force of the research data life cycle. However, implementing strategies to generate reusable data, from data creation through to sharing, is still a significant challenge. Even when datasets supporting a study are publicly shared, the outputs are often incomplete and/or not reusable. The FAIR (Findable, Accessible, Interoperable, Reusable) principles were published as general guidance to promote data reusability in research, but their practical implementation in research groups is still falling behind. In biology, the lack of standard practices for a large diversity of data types, data storage and preservation issues, and researchers' lack of familiarity with the principles are some of the main factors impeding FAIR data. Past literature describes biological curation from the perspective of data resources that aggregate data, often from publications.

Methods: Our team works alongside data-generating, experimental researchers, so our perspective aligns with publication authors rather than aggregators. We detail the processes for organizing datasets for publication, showcasing practical examples from data curation to data sharing. We also recommend strategies, tools and web resources to maximize data reusability while maintaining research productivity.

Conclusion: We propose a simple approach to address research data management challenges for experimentalists, designed to promote FAIR data sharing. This strategy not only simplifies data management, but also enhances data visibility, recognition and impact, ultimately benefiting the entire scientific community.

Suggested Citation

  • Livia C.T. Scorza & Tomasz Zieliński & Irina Kalita & Alessia Lepore & Meriem El Karoui & Andrew J Millar, 2024. "Daily life in the Open Biologist’s second job, as a Data Curator," Post-Print hal-05252424, HAL.
  • Handle: RePEc:hal:journl:hal-05252424
    DOI: 10.12688/wellcomeopenres.22899.1
    Note: View the original document on HAL open archive server: https://cnrs.hal.science/hal-05252424v1

    Download full text from publisher

    File URL: https://cnrs.hal.science/hal-05252424v1/document
    Download Restriction: no



