Packaging Data Analytical Work Reproducibly Using R (and Friends)

Authors

  • Ben Marwick
  • Carl Boettiger
  • Lincoln Mullen

Abstract

Computers are a central tool in the research process, enabling complex and large-scale data analysis. As computer-based research has increased in complexity, so have the challenges of ensuring that this research is reproducible. To address this challenge, we review the concept of the research compendium as a solution for providing a standard and easily recognizable way for organizing the digital materials of a research project to enable other researchers to inspect, reproduce, and extend the research. We investigate how the structure and tooling of software packages of the R programming language are being used to produce research compendia in a variety of disciplines. We also describe how software engineering tools and services are being used by researchers to streamline working with research compendia. Using real-world examples, we show how researchers can improve the reproducibility of their work using research compendia based on R packages and related tools.

Suggested Citation

  • Ben Marwick & Carl Boettiger & Lincoln Mullen, 2018. "Packaging Data Analytical Work Reproducibly Using R (and Friends)," The American Statistician, Taylor & Francis Journals, vol. 72(1), pages 80-88, January.
  • Handle: RePEc:taf:amstat:v:72:y:2018:i:1:p:80-88
    DOI: 10.1080/00031305.2017.1375986

    Download full text from publisher

    File URL: http://hdl.handle.net/10.1080/00031305.2017.1375986
    Download Restriction: Access to full text is restricted to subscribers.


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Andreoli-Versbach, Patrick & Mueller-Langer, Frank, 2014. "Open access to data: An ideal professed but not practised," Research Policy, Elsevier, vol. 43(9), pages 1621-1633.
    2. Mark J. McCabe & Frank Mueller-Langer, 2019. "Does Data Disclosure Increase Citations? Empirical Evidence from a Natural Experiment in Leading Economics Journals," JRC Working Papers on Digital Economy 2019-02, Joint Research Centre.
    3. Oliver Braganza, 2020. "A simple model suggesting economically rational sample-size choice drives irreproducibility," PLOS ONE, Public Library of Science, vol. 15(3), pages 1-19, March.
    4. Bernhard Voelkl & Lucile Vogt & Emily S Sena & Hanno Würbel, 2018. "Reproducibility of preclinical animal research improves with heterogeneity of study samples," PLOS Biology, Public Library of Science, vol. 16(2), pages 1-13, February.
    5. Fecher, Benedikt & Fräßdorf, Mathis & Hebing, Marcel & Wagner, Gert G., 2017. "Replikationen, Reputation und gute wissenschaftliche Praxis," EconStor Open Access Articles and Book Chapters, ZBW - Leibniz Information Centre for Economics, vol. 68(2-3), pages 154-158.
    6. Erin C McKiernan, 2017. "Imagining the “open” university: Sharing scholarship to improve research and education," PLOS Biology, Public Library of Science, vol. 15(10), pages 1-25, October.
    7. Giovanni Baiocchi, 2007. "Reproducible research in computational economics: guidelines, integrated approaches, and open source software," Computational Economics, Springer;Society for Computational Economics, vol. 30(1), pages 19-40, August.
    8. Michaela Strinzel & Josh Brown & Wolfgang Kaltenbrunner & Sarah Rijcke & Michael Hill, 2021. "Ten ways to improve academic CVs for fairer research assessment," Palgrave Communications, Palgrave Macmillan, vol. 8(1), pages 1-4, December.
    9. Hussinger, Katrin & Pellens, Maikel, 2019. "Guilt by association: How scientific misconduct harms prior collaborators," Research Policy, Elsevier, vol. 48(2), pages 516-530.
    10. Piers Steel & Sjoerd Beugelsdijk & Herman Aguinis, 2021. "The anatomy of an award-winning meta-analysis: Recommendations for authors, reviewers, and readers of meta-analytic reviews," Journal of International Business Studies, Palgrave Macmillan;Academy of International Business, vol. 52(1), pages 23-44, February.
    11. Charles Ayoubi & Boris Thurm, 2023. "Knowledge diffusion and morality: Why do we freely share valuable information with Strangers?," Journal of Economics & Management Strategy, Wiley Blackwell, vol. 32(1), pages 75-99, January.
    12. Garret Christensen & Allan Dafoe & Edward Miguel & Don A Moore & Andrew K Rose, 2019. "A study of the impact of data sharing on article citations using journal policies as a natural experiment," PLOS ONE, Public Library of Science, vol. 14(12), pages 1-13, December.
    13. Hensel, Przemysław G., 2019. "Supporting replication research in management journals: Qualitative analysis of editorials published between 1970 and 2015," European Management Journal, Elsevier, vol. 37(1), pages 45-57.
    14. Severinsen, A. & Myrland, Ø., 2022. "ShinyRBase: Near real-time energy saving models using reactive programming," Applied Energy, Elsevier, vol. 325(C).
    15. Shane Timmons & Terence J. McElvaney & Peter D. Lunn, 2019. "An experiment for regulatory policy on broadband speed advertising," Journal of Behavioral Economics for Policy, Society for the Advancement of Behavioral Economics (SABE), vol. 3(2), pages 17-24, December.
    16. Benedikt Fecher & Sascha Friesike & Marcel Hebing, 2014. "What Drives Academic Data Sharing?," SOEPpapers on Multidisciplinary Panel Data Research 655, DIW Berlin, The German Socio-Economic Panel (SOEP).
    17. Albert J. Menkveld & Anna Dreber & Félix Holzmeister & Juergen Huber & Magnus Johannesson & Michael Kirchler & Sebastian Neusüss & Michael Razen & Utz Weitzel & Gunther Capelle-Blancard, 2021. "Non-Standard Errors," Documents de travail du Centre d'Economie de la Sorbonne 21033, Université Panthéon-Sorbonne (Paris 1), Centre d'Economie de la Sorbonne.
      • Menkveld, Albert J. & Dreber, Anna & Holzmeister, Felix & Huber, Juergen & Johannesson, Magnus & Kirchler, Michael & Neusüss, Sebastian & Razen, Michael & Weitzel, Utz & Abad-Díaz, David & Abudy, Mena, 2021. "Non-Standard Errors," Working Papers 2021:17, Lund University, Department of Economics.
      • Albert J. Menkveld & Anna Dreber & Felix Holzmeister & Juergen Huber & Magnus Johannesson & Michael Kirchler & Sebastian Neussüs & Michael Razen & Utz Weitzel & Christian Brownlees & Javier Gil-Bazo, 2021. "Non-Standard Errors," Working Papers 1303, Barcelona School of Economics.
      • Albert J. Menkveld & Anna Dreber & Felix Holzmeister & Juergen Huber & Magnus Johannesson & Michael Kirchler & Sebastian Neusüss & Michael Razen & Utz Weitzel & David Abad-Díaz & Menachem Abudy & To, 2021. "Non-Standard Errors," Working Paper Series, Social and Economic Sciences 2021-11, Faculty of Social and Economic Sciences, Karl-Franzens-University Graz.
      • Menkveld, Albert J. & Dreber, Anna & Holzmeister, Felix & Huber, Jürgen & Johannesson, Magnus & Kirchler, Michael & Neusüss, Sebastian & Razen, Michael & Weitzel, Utz, 2021. "Non-standard errors," IWH Discussion Papers 11/2021, Halle Institute for Economic Research (IWH).
      • Albert J. Menkveld & Anna Dreber & Felix Holzmeister & Juergen Huber & Magnus Johannesson & Michael Kirchler & Sebastian Neussüs & Michael Razen & Utz Weitzel & Christian T. Brownlees & Javier Gil-Baz, 2021. "Non-standard errors," Economics Working Papers 1807, Department of Economics and Business, Universitat Pompeu Fabra.
      • Albert J. et al. Menkveld, 2021. "Non-Standard Errors," CESifo Working Paper Series 9453, CESifo.
      • Albert J Menkveld & Anna Dreber & Felix Holzmeister & Juergen Huber & Magnus Johannesson & Michael Kirchler & Sebastian Neusüss & Michael Razen & Utz Weitzel & Gunther Capelle-Blancard & David Abad-Dí, 2021. "Non-Standard Errors," Post-Print halshs-03500882, HAL.
      • Francesco Franzoni & Roxana Mihet & Markus Leippold & Per Ostberg & Olivier Scaillet & Norman Schürhoff & Oksana Bashchenko & Nicola Mano & Michele Pelli, 2022. "Non-Standard Errors," Swiss Finance Institute Research Paper Series 22-09, Swiss Finance Institute.
      • Albert J. Menkveld & Anna Dreber & Felix Holzmeister & Juergen Huber & Magnus Johannesson & Michael Kirchler & Sebastian Neusüss & Michael Razen & Utz Weitzel & Edwin Baidoo & Michael Frömmel & et al, 2021. "Non-Standard Errors," Working Papers of Faculty of Economics and Business Administration, Ghent University, Belgium 21/1032, Ghent University, Faculty of Economics and Business Administration.
      • Menkveld, Albert J. & Dreber, Anna & Holzmeister, Felix & Huber, Juergen & Johannesson, Magnus & Hasse, Jean-Baptiste & e.a.,, 2023. "Non-Standard Errors," LIDAM Reprints LFIN 2023002, Université catholique de Louvain, Louvain Finance (LFIN).
      • Moinas, Sophie & Declerck, Fany & Menkveld, Albert J. & Dreber, Anna, 2023. "Non-Standard Errors," TSE Working Papers 23-1451, Toulouse School of Economics (TSE).
      • Menkveld, A. & Dreber, A. & Holzmeister, F. & Huber, J. & Johannesson, M. & Kirchler, M. & Neusüss, S. & Razen, M. & Neusüss, S. & Neusüss, S., 2021. "Non-Standard Errors," Cambridge Working Papers in Economics 2182, Faculty of Economics, University of Cambridge.
      • Menkveld, Albert J. & Dreber, Anna & Holzmeister, Felix & Huber, Jürgen & Johannesson, Magnus & Kirchler, Michael & Neusüss, Sebastian & Razen, Michael & Weitzel, Utz, 2021. "Non-standard errors," SAFE Working Paper Series 327, Leibniz Institute for Financial Research SAFE.
      • Albert J. Menkveld & Anna Dreber & Felix Holzmeister & Jürgen Huber & Magnus Johannesson & Michael Kirchler & Sebastian Neusüss & Michael Razen & Utz Weitzel & David Abad-Díaz & Menachem Abudy & Tobi, 2021. "Non-Standard Errors," Working Papers 2021-31, Faculty of Economics and Statistics, Universität Innsbruck.
      • Ferrara, Gerardo & Jurkatis, Simon, 2021. "Non-standard errors," Bank of England working papers 955, Bank of England.
      • Ciril Bosch-Rosa & Bernhard Kassner, 2023. "Non-Standard Errors," Rationality and Competition Discussion Paper Series 385, CRC TRR 190 Rationality and Competition.
      • Albert J Menkveld & Anna Dreber & Felix Holzmeister & Juergen Huber & Magnus Johannesson & Michael Kirchler & Sebastian Neusüss & Michael Razen & Utz Weitzel & Gunther Capelle-Blancard & David Abad-Dí, 2021. "Non-Standard Errors," Université Paris1 Panthéon-Sorbonne (Post-Print and Working Papers) halshs-03500882, HAL.
      • Menkveld, A. & Dreber, A. & Holzmeister, F. & Huber, J. & Johannesson, M. & Kirchler, M. & Neusüss, S. & Razen, M. & Neusüss, S. & Neusüss, S., 2021. "Non-Standard Errors," Janeway Institute Working Papers 2112, Faculty of Economics, University of Cambridge.
      • Wolff, Christian & Menkveld, Albert J. & Dreber, Anna & Holzmeister, Felix & Huber, Juergen & Johannesson, Magnus & Kirchler, Michael & Neusüess, Sebastian & Razen, Michael & Weitzel, Utz, 2021. "Non-Standard Errors," CEPR Discussion Papers 16751, C.E.P.R. Discussion Papers.
    18. Colin F. Camerer & Anna Dreber & Felix Holzmeister & Teck-Hua Ho & Jürgen Huber & Magnus Johannesson & Michael Kirchler & Gideon Nave & Brian A. Nosek & Thomas Pfeiffer & Adam Altmejd & Nick Buttrick , 2018. "Evaluating the replicability of social science experiments in Nature and Science between 2010 and 2015," Nature Human Behaviour, Nature, vol. 2(9), pages 637-644, September.
    19. Wu, Lingfei & Kittur, Aniket & Youn, Hyejin & Milojević, Staša & Leahey, Erin & Fiore, Stephen M. & Ahn, Yong-Yeol, 2022. "Metrics and mechanisms: Measuring the unmeasurable in the science of science," Journal of Informetrics, Elsevier, vol. 16(2).
    20. Javier Martínez-Vega & David Rodríguez-Rodríguez, 2022. "Protected Area Effectiveness in the Scientific Literature: A Decade-Long Bibliometric Analysis," Land, MDPI, vol. 11(6), pages 1-14, June.
