Author
Listed:
- Ryan Liu
- Steven Jecmen
- Vincent Conitzer
- Fei Fang
- Nihar B Shah
Abstract
Objective: Peer review frequently follows a process in which reviewers first provide initial reviews, authors respond to these reviews, and reviewers then update their reviews based on the authors' response. There is mixed evidence regarding whether this process is useful, including frequent anecdotal complaints that reviewers insufficiently update their scores. In this study, we investigate whether reviewers anchor to their original scores when updating their reviews, which is a potential explanation for the lack of updates in reviewer scores.
Design: We design a novel randomized controlled trial to test whether reviewers exhibit anchoring. In the experimental condition, participants initially see a flawed version of a paper that is corrected after they submit their initial review; in the control condition, participants see only the corrected version. We take various measures to ensure that, in the absence of anchoring, reviewers in the experimental group should revise their scores to be identically distributed to the scores from the control group. Furthermore, we construct the reviewed paper to maximize the difference between the flawed and corrected versions, and we employ deception to hide the true purpose of the experiment.
Results: Our randomized controlled trial includes 108 researchers as participants. First, we find that our intervention successfully created a difference in perceived paper quality between the flawed and corrected versions: using a permutation test with the Mann-Whitney U statistic, we find that the experimental group's initial scores are lower than the control group's scores in both the Evaluation category (Vargha-Delaney A = 0.64, p = 0.0096) and the Overall score (A = 0.59, p = 0.058). Next, we test for anchoring by comparing the experimental group's revised scores with the control group's scores. We find no significant evidence of anchoring in either the Overall score (A = 0.50, p = 0.61) or the Evaluation category (A = 0.49, p = 0.61). Here, the Mann-Whitney U statistic counts the cross-group pairwise comparisons in which the value from the specified group is greater, and the Vargha-Delaney A is this count normalized to lie in [0, 1].
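The statistics named in the abstract can be illustrated with a short computation. The sketch below is not the authors' analysis code; the function names, the toy score vectors, and the one-sided test direction are illustrative assumptions. It computes the Vargha-Delaney A as the tie-corrected fraction of cross-group pairs in which the first group's value is greater (i.e., the Mann-Whitney U divided by the number of pairs) and a p-value from a label-permutation test, as the abstract describes.

```python
import numpy as np

def vargha_delaney_A(x, y):
    """Tie-corrected fraction of cross-group pairs (x_i, y_j) with x_i > y_j.
    This equals the Mann-Whitney U statistic for x divided by the number of
    pairs, so it lies in [0, 1]; A = 0.5 means neither group tends to be larger."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    greater = np.sum(x[:, None] > y[None, :])
    ties = np.sum(x[:, None] == y[None, :])
    return (greater + 0.5 * ties) / (x.size * y.size)

def permutation_test(x, y, n_perm=10_000, seed=0):
    """One-sided permutation p-value: the fraction of label shuffles that
    yield an A at least as large as the observed A. The sidedness here is
    illustrative and should match the study's directional hypothesis."""
    rng = np.random.default_rng(seed)
    observed = vargha_delaney_A(x, y)
    pooled = np.concatenate([np.asarray(x, float), np.asarray(y, float)])
    n = len(x)
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        if vargha_delaney_A(perm[:n], perm[n:]) >= observed:
            hits += 1
    return observed, hits / n_perm

# Hypothetical scores on a review scale; not data from the study.
control = [6, 5, 7, 6, 4, 5, 6, 7]
experimental = [4, 5, 5, 3, 4, 6, 5, 4]
A, p = permutation_test(control, experimental)
print(f"Vargha-Delaney A = {A:.2f}, one-sided permutation p = {p:.3f}")
```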
Suggested Citation
Ryan Liu & Steven Jecmen & Vincent Conitzer & Fei Fang & Nihar B Shah, 2024.
"Testing for reviewer anchoring in peer review: A randomized controlled trial,"
PLOS ONE, Public Library of Science, vol. 19(11), pages 1-19, November.
Handle: RePEc:plo:pone00:0301111
DOI: 10.1371/journal.pone.0301111