
Outlier Removal and the Relation with Reporting Errors and Quality of Psychological Research

Authors

Listed:
  • Marjan Bakker
  • Jelte M Wicherts

Abstract

Background: The removal of outliers to acquire a significant result is a questionable research practice that appears to be commonly used in psychology. In this study, we investigated whether the removal of outliers in psychology papers is related to weaker evidence (against the null hypothesis of no effect), a higher prevalence of reporting errors, and smaller sample sizes in these papers compared to papers in the same journals that did not report the exclusion of outliers from the analyses.

Methods and Findings: We retrieved a total of 2667 statistical results of null hypothesis significance tests from 153 articles in main psychology journals, and compared results from articles in which outliers were removed (N = 92) with results from articles that reported no exclusion of outliers (N = 61). We preregistered our hypotheses and methods and analyzed the data at the level of articles. Results show no significant difference between the two types of articles in median p value, sample sizes, or prevalence of all reporting errors, large reporting errors, and reporting errors that concerned statistical significance. However, we did find a discrepancy between the reported degrees of freedom of t tests and the reported sample size in 41% of articles that did not report removal of any data values. This suggests a common failure to report data exclusions (or missingness) in psychological articles.

Conclusions: We failed to find that the removal of outliers from the analysis in psychological articles was related to weaker evidence (against the null hypothesis of no effect), sample size, or the prevalence of errors. However, our control sample might be contaminated due to nondisclosure of excluded values in articles that did not report exclusion of outliers. Results therefore highlight the importance of more transparent reporting of statistical analyses.
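The two checks described in the abstract, recomputing a p value from a reported test statistic and comparing the degrees of freedom of a t test against the reported sample size, can be sketched in a few lines. The following is a minimal illustration only, assuming an independent-samples t test (df = N - 2) and a hypothetical rounding tolerance; it is not the authors' actual analysis code.

from scipy import stats

def df_matches_sample_size(reported_n, reported_df):
    """For an independent-samples t test, the degrees of freedom should equal
    N - 2; a smaller reported df hints at unreported exclusions or missing data."""
    return reported_df == reported_n - 2

def p_value_consistent(t_value, df, reported_p, tol=0.01):
    """Recompute the two-sided p value from the reported t statistic and df,
    and flag a reporting error when it differs from the reported p by more than tol."""
    recomputed_p = 2 * stats.t.sf(abs(t_value), df)
    return abs(recomputed_p - reported_p) <= tol

# Hypothetical reported result: t(28) = 2.10, p = .04, from an article with N = 30
print(df_matches_sample_size(reported_n=30, reported_df=28))     # True: 28 == 30 - 2
print(p_value_consistent(t_value=2.10, df=28, reported_p=0.04))  # True: recomputed p is about .045

For a paired or one-sample t test the expected degrees of freedom would instead be N - 1, so the same comparison would use that rule.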

Suggested Citation

  • Marjan Bakker & Jelte M Wicherts, 2014. "Outlier Removal and the Relation with Reporting Errors and Quality of Psychological Research," PLOS ONE, Public Library of Science, vol. 9(7), pages 1-9, July.
  • Handle: RePEc:plo:pone00:0103360
    DOI: 10.1371/journal.pone.0103360

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0103360
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0103360&type=printable
    Download Restriction: no


    Citations



    Cited by:

    1. David Giofrè & Geoff Cumming & Luca Fresc & Ingrid Boedker & Patrizio Tressoldi, 2017. "The influence of journal submission guidelines on authors' reporting of statistics and use of open research practices," PLOS ONE, Public Library of Science, vol. 12(4), pages 1-15, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Nicola Milia & Alessandra Congiu & Paolo Anagnostou & Francesco Montinaro & Marco Capocasa & Emanuele Sanna & Giovanni Destro Bisol, 2012. "Mine, Yours, Ours? Sharing Data on Human Genetic Variation," PLOS ONE, Public Library of Science, vol. 7(6), pages 1-8, June.
    2. Michal Krawczyk & Ernesto Reuben, 2012. "(Un)Available upon Request: Field Experiment on Researchers' Willingness to Share Supplementary Materials," Natural Field Experiments 00689, The Field Experiments Website.
    3. Jillian C Wallis & Elizabeth Rolando & Christine L Borgman, 2013. "If We Share Data, Will Anyone Use Them? Data Sharing and Reuse in the Long Tail of Science and Technology," PLOS ONE, Public Library of Science, vol. 8(7), pages 1-17, July.
    4. Andrew F Magee & Michael R May & Brian R Moore, 2014. "The Dawn of Open Access to Phylogenetic Data," PLOS ONE, Public Library of Science, vol. 9(10), pages 1-10, October.
    5. Stefan Stieglitz & Konstantin Wilms & Milad Mirbabaie & Lennart Hofeditz & Bela Brenger & Ania López & Stephanie Rehwald, 2020. "When are researchers willing to share their data? – Impacts of values and uncertainty on open data in academia," PLOS ONE, Public Library of Science, vol. 15(7), pages 1-20, July.
    6. Bryan T Drew & Romina Gazis & Patricia Cabezas & Kristen S Swithers & Jiabin Deng & Roseana Rodriguez & Laura A Katz & Keith A Crandall & David S Hibbett & Douglas E Soltis, 2013. "Lost Branches on the Tree of Life," PLOS Biology, Public Library of Science, vol. 11(9), pages 1-5, September.
    7. Kurt Lewis, 2009. "The Two-Period Rational Inattention Model: Accelerations and Analyses," Computational Economics, Springer;Society for Computational Economics, vol. 33(1), pages 79-97, February.
    8. Massing, Till & Puente-Ajovín, Miguel & Ramos, Arturo, 2020. "On the parametric description of log-growth rates of cities’ sizes of four European countries and the USA," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 551(C).
    9. Mueller-Langer, Frank & Andreoli-Versbach, Patrick, 2018. "Open access to research data: Strategic delay and the ambiguous welfare effects of mandatory data disclosure," Information Economics and Policy, Elsevier, vol. 42(C), pages 20-34.
    10. Jakusch, Sven Thorsten, 2017. "On the applicability of maximum likelihood methods: From experimental to financial data," SAFE Working Paper Series 148, Leibniz Institute for Financial Research SAFE.
    11. Robert A. Moffitt, 2011. "Report of the Editor: American Economic Review (with Appendix by Philip J. Glandon)," American Economic Review, American Economic Association, vol. 101(3), pages 684-693, May.
    12. Friberg, Richard & Romahn, André, 2015. "Divestiture requirements as a tool for competition policy: A case from the Swedish beer market," International Journal of Industrial Organization, Elsevier, vol. 42(C), pages 1-18.
    13. Vanessa V Sochat & Cameron J Prybol & Gregory M Kurtzer, 2017. "Enhancing reproducibility in scientific computing: Metrics and registry for Singularity containers," PLOS ONE, Public Library of Science, vol. 12(11), pages 1-24, November.
    14. Puente-Ajovin, Miguel & Ramos, Arturo, 2015. "An improvement over the normal distribution for log-growth rates of city sizes: Empirical evidence for France, Germany, Italy and Spain," MPRA Paper 67471, University Library of Munich, Germany.
    15. Ana María Ibáñez Londoño & Juan Carlos Muñoz Mora & Philip Verwimp, 2013. "Abandoning Coffee under the Threat of Violence and the Presence of Illicit Crops. Evidence from Colombia," Documentos CEDE 011465, Universidad de los Andes - CEDE.
    16. Andreoli-Versbach, Patrick & Mueller-Langer, Frank, 2014. "Open access to data: An ideal professed but not practised," Research Policy, Elsevier, vol. 43(9), pages 1621-1633.
    17. Ramos, Arturo, 2019. "Addenda to “Are the log-growth rates of city sizes distributed normally? Empirical evidence for the USA [Empir. Econ. (2017) 53:1109-1123]”," MPRA Paper 93032, University Library of Munich, Germany.
    18. Zeileis, Achim, 2006. "Implementing a class of structural change tests: An econometric computing approach," Computational Statistics & Data Analysis, Elsevier, vol. 50(11), pages 2987-3008, July.
    19. Mark J. McCabe & Frank Mueller-Langer, 2019. "Does Data Disclosure Increase Citations? Empirical Evidence from a Natural Experiment in Leading Economics Journals," JRC Working Papers on Digital Economy 2019-02, Joint Research Centre (Seville site).
    20. Kleiber Christian & Zeileis Achim, 2010. "The Grunfeld Data at 50," German Economic Review, De Gruyter, vol. 11(4), pages 404-417, December.


    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:plo:pone00:0103360. See general information about how to correct material in RePEc.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: (plosone). General contact details of provider: https://journals.plos.org/plosone/ .

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a reference but did not link an item in RePEc to it, you can help with this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service hosted by the Research Division of the Federal Reserve Bank of St. Louis . RePEc uses bibliographic data supplied by the respective publishers.