Printed from https://ideas.repec.org/p/ajk/ajkdps/384.html

When Algorithms Rate Performance: Do Large Language Models Replicate Human Evaluation Biases?

Authors

  • Rainer Michael Rilke

    (WHU - Otto Beisheim School of Management)

  • Dirk Sliwka

    (University of Cologne)

Abstract

A large body of research across management, psychology, accounting, and economics shows that subjective performance evaluations are systematically biased: ratings cluster near the midpoint of scales and are often excessively lenient. As organizations increasingly adopt large language models (LLMs) for evaluative tasks, little is known about how these systems perform when assessing human performance. We document that, in the absence of clear objective standards and when individuals are rated independently, LLMs reproduce the familiar patterns of human raters. However, LLMs generate greater dispersion and accuracy when evaluating multiple individuals simultaneously. With noisy but objective performance signals, LLMs provide substantially more accurate evaluations than human raters, as they (i) are less subject to biases arising from concern for the evaluated employee and (ii) make fewer mistakes in information processing, closely approximating rational Bayesian benchmarks.

Suggested Citation

  • Rainer Michael Rilke & Dirk Sliwka, 2026. "When Algorithms Rate Performance: Do Large Language Models Replicate Human Evaluation Biases?," ECONtribute Discussion Papers Series 384, University of Bonn and University of Cologne, Germany.
  • Handle: RePEc:ajk:ajkdps:384

    Download full text from publisher

    File URL: https://www.econtribute.de/RePEc/ajk/ajkdps/ECONtribute_384_2026.pdf
    File Function: First version, 2026
    Download Restriction: no

    More about this item

    JEL classification:

    • J24 - Labor and Demographic Economics - - Demand and Supply of Labor - - - Human Capital; Skills; Occupational Choice; Labor Productivity
    • J28 - Labor and Demographic Economics - - Demand and Supply of Labor - - - Safety; Job Satisfaction; Related Public Policy
    • M12 - Business Administration and Business Economics; Marketing; Accounting; Personnel Economics - - Business Administration - - - Personnel Management; Executives; Executive Compensation
    • M53 - Business Administration and Business Economics; Marketing; Accounting; Personnel Economics - - Personnel Economics - - - Training
