
Measures of Agreement with Multiple Raters: Fréchet Variances and Inference

Author

Listed:
  • Jonas Moss

    (BI Norwegian Business School)

Abstract

Most measures of agreement are chance-corrected. They differ in three dimensions: their definition of chance agreement, their choice of disagreement function, and how they handle multiple raters. Chance agreement is usually defined in a pairwise manner, following either Cohen’s kappa or Fleiss’s kappa. The disagreement function is usually a nominal, quadratic, or absolute value function. But how to handle multiple raters is contentious, with the main contenders being Fleiss’s kappa, Conger’s kappa, and Hubert’s kappa, the variant of Fleiss’s kappa where agreement is said to occur only if every rater agrees. More generally, multi-rater agreement coefficients can be defined in a g-wise way, where the disagreement weighting function uses g raters instead of two. This paper contains two main contributions. (a) We propose using Fréchet variances to handle the case of multiple raters. The Fréchet variances are intuitive disagreement measures and turn out to generalize the nominal, quadratic, and absolute value functions to the case of more than two raters. (b) We derive the limit theory of g-wise weighted agreement coefficients, with chance agreement of the Cohen-type or Fleiss-type, for the case where every item is rated by the same number of raters. Trying out three confidence interval constructions, we end up recommending calculating confidence intervals using the arcsine transform or the Fisher transform.
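
As a rough illustration of contribution (a), the sketch below computes a Fréchet-type variance for one item rated by several raters, under three distance choices. It assumes the Fréchet variance is the smallest average distance from the ratings to a single central point, so that the quadratic case is the population variance (centre: mean), the absolute case is the mean absolute deviation (centre: median), and the nominal case is one minus the modal share (centre: mode). The function names are hypothetical, and the paper's exact definitions and normalizations may differ.

```python
from collections import Counter
from statistics import mean, median


def frechet_variance_quadratic(ratings):
    """Average squared deviation from the mean (assumed quadratic case)."""
    centre = mean(ratings)
    return mean([(x - centre) ** 2 for x in ratings])


def frechet_variance_absolute(ratings):
    """Average absolute deviation from the median (assumed absolute case)."""
    centre = median(ratings)
    return mean([abs(x - centre) for x in ratings])


def frechet_variance_nominal(ratings):
    """Share of ratings that differ from the modal category (assumed nominal case)."""
    modal_count = Counter(ratings).most_common(1)[0][1]
    return 1 - modal_count / len(ratings)


# One item rated by three raters on a numeric scale and on a nominal scale.
numeric_ratings = [1, 2, 3]
nominal_ratings = ["a", "a", "b"]

quadratic = frechet_variance_quadratic(numeric_ratings)
absolute = frechet_variance_absolute(numeric_ratings)
nominal = frechet_variance_nominal(nominal_ratings)
```

Under these assumed definitions, two raters with ratings a and b give (a - b)^2 / 4, |a - b| / 2, and 1/2 when a and b differ (0 otherwise). Each is proportional to the familiar pairwise quadratic, absolute value, and nominal disagreement function, which is the sense in which such variances extend those functions beyond two raters.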

Suggested Citation

  • Jonas Moss, 2024. "Measures of Agreement with Multiple Raters: Fréchet Variances and Inference," Psychometrika, Springer;The Psychometric Society, vol. 89(2), pages 517-541, June.
  • Handle: RePEc:spr:psycho:v:89:y:2024:i:2:d:10.1007_s11336-023-09945-2
    DOI: 10.1007/s11336-023-09945-2


    References listed on IDEAS

    1. Jonas Moss, 2023. "Measuring Agreement Using Guessing Models and Knowledge Coefficients," Psychometrika, Springer;The Psychometric Society, vol. 88(3), pages 1002-1025, September.
    2. Christof Schuster & David Smith, 2005. "Dispersion-weighted kappa: An integrative framework for metric and nominal scale agreement coefficients," Psychometrika, Springer;The Psychometric Society, vol. 70(1), pages 135-146, March.
    3. Paromita Dubey & Hans-Georg Müller, 2019. "Fréchet analysis of variance for random objects," Biometrika, Biometrika Trust, vol. 106(4), pages 803-821.
    4. Bruce Cooil & Roland Rust, 1994. "Reliability and expected loss: A unifying principle," Psychometrika, Springer;The Psychometric Society, vol. 59(2), pages 203-216, June.
    5. Josep L. Carrasco & Lluís Jover, 2003. "Estimating the Generalized Concordance Correlation Coefficient through Variance Components," Biometrics, The International Biometric Society, vol. 59(4), pages 849-858, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. A. Martín Andrés & M. Álvarez Hernández, 2025. "Estimators of various kappa coefficients based on the unbiased estimator of the expected index of agreements," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 19(1), pages 177-207, March.
    2. Helenowski Irene B & Vonesh Edward F & Demirtas Hakan & Rademaker Alfred W & Ananthanarayanan Vijayalakshmi & Gann Peter H & Jovanovic Borko D, 2011. "Defining Reproducibility Statistics as a Function of the Spatial Covariance Structures in Biomarker Studies," The International Journal of Biostatistics, De Gruyter, vol. 7(1), pages 1-21, January.
    3. Arthur Pewsey & Eduardo García-Portugués, 2021. "Rejoinder on: Recent advances in directional statistics," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 30(1), pages 76-82, March.
    4. Bulté, Matthieu & Sørensen, Helle, 2024. "Medoid splits for efficient random forests in metric spaces," Computational Statistics & Data Analysis, Elsevier, vol. 198(C).
    5. Janice L. Scealy, 2021. "Comments on: Recent advances in directional statistics," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 30(1), pages 68-70, March.
    6. Balakrishnan, Narayanaswamy & Ristić, Miroslav M., 2016. "Multivariate families of gamma-generated distributions with finite or infinite support above or below the diagonal," Journal of Multivariate Analysis, Elsevier, vol. 143(C), pages 194-207.
    7. Jason Wittenberg, 2013. "How similar are they? rethinking electoral congruence," Quality & Quantity: International Journal of Methodology, Springer, vol. 47(3), pages 1687-1701, April.
    8. Geòrgia Escaramís & Josep L. Carrasco & Carlos Ascaso, 2008. "Detection of Significant Disease Risks Using a Spatial Conditional Autoregressive Model," Biometrics, The International Biometric Society, vol. 64(4), pages 1043-1053, December.
    9. Tsai, Miao-Yu, 2015. "Comparison of concordance correlation coefficient via variance components, generalized estimating equations and weighted approaches with model selection," Computational Statistics & Data Analysis, Elsevier, vol. 82(C), pages 47-58.
    10. Erkan Cer, 2019. "The Instruction of Writing Strategies: The Effect of the Metacognitive Strategy on the Writing Skills of Pupils in Secondary Education," SAGE Open, vol. 9(2), pages 21582440198, April.
    11. Chen, Chia-Cheng & Barnhart, Huiman X., 2008. "Comparison of ICC and CCC for assessing agreement for data without and with replications," Computational Statistics & Data Analysis, Elsevier, vol. 53(2), pages 554-564, December.
    12. Tsai, Miao-Yu & Lin, Chao-Chun, 2018. "Concordance correlation coefficients estimated by variance components for longitudinal normal and Poisson data," Computational Statistics & Data Analysis, Elsevier, vol. 121(C), pages 57-70.
    13. Matthijs Warrens, 2014. "Corrected Zegers-ten Berge Coefficients Are Special Cases of Cohen’s Weighted Kappa," Journal of Classification, Springer;The Classification Society, vol. 31(2), pages 179-193, July.
    14. Bruce Cooil & Roland Rust, 1995. "General estimators for the reliability of qualitative data," Psychometrika, Springer;The Psychometric Society, vol. 60(2), pages 199-220, June.
    15. Anton Oleinik, 2024. "A Bayesian index of association: comparison with other measures and performance," Quality & Quantity: International Journal of Methodology, Springer, vol. 58(1), pages 277-305, February.
    16. Adam Duhachek & Anne T. Coughlan & Dawn Iacobucci, 2005. "Results on the Standard Error of the Coefficient Alpha Index of Reliability," Marketing Science, INFORMS, vol. 24(2), pages 294-301, July.
    17. Josep L. Carrasco, 2010. "A Generalized Concordance Correlation Coefficient Based on the Variance Components Generalized Linear Mixed Models for Overdispersed Count Data," Biometrics, The International Biometric Society, vol. 66(3), pages 897-904, September.
    18. Alexandra Raadt & Matthijs J. Warrens & Roel J. Bosker & Henk A. L. Kiers, 2021. "A Comparison of Reliability Coefficients for Ordinal Rating Scales," Journal of Classification, Springer;The Classification Society, vol. 38(3), pages 519-543, October.

    More about this item

    Keywords

    agreement; inter-rater reliability; AC1; Cohen kappa;
