Assessing Inter-rater Reliability With Heterogeneous Variance Components Models: Flexible Approach Accounting for Contextual Variables

Author

Listed:
  • Patrícia Martinková

    (Institute of Computer Science of the Czech Academy of Sciences, Charles University)

  • František Bartoš

    (Institute of Computer Science of the Czech Academy of Sciences, University of Amsterdam)

  • Marek Brabec

    (Institute of Computer Science of the Czech Academy of Sciences)

Abstract

Inter-rater reliability (IRR), which is a prerequisite of high-quality ratings and assessments, may be affected by contextual variables, such as the rater’s or ratee’s gender, major, or experience. Identification of such heterogeneity sources in IRR is important for the implementation of policies with the potential to decrease measurement error and to increase IRR by focusing on the most relevant subgroups. In this study, we propose a flexible approach for assessing IRR in cases of heterogeneity due to covariates by directly modeling differences in variance components. We use Bayes factors (BFs) to select the best performing model, and we suggest using Bayesian model averaging as an alternative approach for obtaining IRR and variance component estimates, allowing us to account for model uncertainty. We use inclusion BFs considering the whole model space to provide evidence for or against differences in variance components due to covariates. The proposed method is compared with other Bayesian and frequentist approaches in a simulation study, and we demonstrate its superiority in some situations. Finally, we provide real data examples from grant proposal peer review, demonstrating the usefulness of this method and its flexibility in the generalization of more complex designs.
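The quantities named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the simplest single-rater setting, in which IRR is the share of total variance attributable to ratees, and it assumes that log marginal likelihoods for a set of candidate models are already available from some fitting procedure. All function names and all numbers are hypothetical.

```python
import math

def irr(var_ratee, var_residual):
    """Single-rater IRR: share of total variance attributable to ratees (assumed definition)."""
    return var_ratee / (var_ratee + var_residual)

def posterior_model_probs(log_marginal_liks, prior_probs):
    """Posterior model probabilities from log marginal likelihoods and prior model probabilities."""
    log_w = [l + math.log(p) for l, p in zip(log_marginal_liks, prior_probs)]
    m = max(log_w)  # subtract the maximum for numerical stability
    w = [math.exp(x - m) for x in log_w]
    total = sum(w)
    return [x / total for x in w]

def inclusion_bf(post_probs, prior_probs, includes):
    """Inclusion BF: posterior inclusion odds divided by prior inclusion odds."""
    post_in = sum(p for p, inc in zip(post_probs, includes) if inc)
    prior_in = sum(p for p, inc in zip(prior_probs, includes) if inc)
    return (post_in / (1 - post_in)) / (prior_in / (1 - prior_in))

def model_average(estimates, weights):
    """Model-averaged point estimate: per-model estimates weighted by posterior model probability."""
    return sum(e * w for e, w in zip(estimates, weights))

# Hypothetical model space for one binary covariate:
# M0: no differences; M1: ratee variance differs by group;
# M2: residual variance differs by group; M3: both differ.
log_ml = [-512.4, -509.8, -511.9, -510.1]   # made-up log marginal likelihoods
prior = [0.25, 0.25, 0.25, 0.25]            # equal prior model probabilities
post = posterior_model_probs(log_ml, prior)

# Evidence for a group difference in ratee variance (present in M1 and M3).
bf_ratee = inclusion_bf(post, prior, [False, True, False, True])

# Made-up per-model (ratee variance, residual variance) estimates for one subgroup.
per_model_irr = [irr(0.9, 1.1), irr(1.3, 1.1), irr(0.9, 0.8), irr(1.3, 0.8)]
irr_bma = model_average(per_model_irr, post)

print(post, bf_ratee, irr_bma)
```

An inclusion BF above 1 favors models that contain the difference in the given variance component, and a value below 1 favors models without it. Weighting per-model estimates by posterior model probability gives a point estimate only; full Bayesian model averaging mixes the per-model posterior distributions.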

Suggested Citation

  • Patrícia Martinková & František Bartoš & Marek Brabec, 2023. "Assessing Inter-rater Reliability With Heterogeneous Variance Components Models: Flexible Approach Accounting for Contextual Variables," Journal of Educational and Behavioral Statistics, vol. 48(3), pages 349-383, June.
  • Handle: RePEc:sae:jedbes:v:48:y:2023:i:3:p:349-383
    DOI: 10.3102/10769986221150517

    Download full text from publisher

    File URL: https://journals.sagepub.com/doi/10.3102/10769986221150517
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Zhou, W. & O’Neill, E. & Moncaster, A. & Reiner, D. & Guthrie, P., 2019. "Applying Bayesian Model Averaging to Characterise Urban Residential Stock Turnover Dynamics," Cambridge Working Papers in Economics 1986, Faculty of Economics, University of Cambridge.
    2. Roland Brown & Yingling Fan & Kirti Das & Julian Wolfson, 2021. "Iterated multisource exchangeability models for individualized inference with an application to mobile sensor data," Biometrics, The International Biometric Society, vol. 77(2), pages 401-412, June.
    3. He, Ni & Yongqiao, Wang & Tao, Jiang & Zhaoyu, Chen, 2022. "Self-Adaptive bagging approach to credit rating," Technological Forecasting and Social Change, Elsevier, vol. 175(C).
    4. Patrícia Martinková & Dan Goldhaber & Elena Erosheva, 2018. "Disparities in ratings of internal and external applicants: A case for model-based inter-rater reliability," PLOS ONE, Public Library of Science, vol. 13(10), pages 1-17, October.
    5. Abdul Salam & Marco Grzegorczyk, 2023. "Model averaging for sparse seemingly unrelated regression using Bayesian networks among the errors," Computational Statistics, Springer, vol. 38(2), pages 779-808, June.
    6. Emanuel Kopp, 2018. "Determinants of U.S. Business Investment," IMF Working Papers 2018/139, International Monetary Fund.
    7. Liao, Jun & Zou, Guohua, 2020. "Corrected Mallows criterion for model averaging," Computational Statistics & Data Analysis, Elsevier, vol. 144(C).
    8. Mark F. J. Steel, 2020. "Model Averaging and Its Use in Economics," Journal of Economic Literature, American Economic Association, vol. 58(3), pages 644-719, September.
    9. Goldhaber, Dan & Grout, Cyrus & Wolff, Malcolm & Martinková, Patrícia, 2021. "Evidence on the Dimensionality and Reliability of Professional References’ Ratings of Teacher Applicants," Economics of Education Review, Elsevier, vol. 83(C).
    10. David G Pina & Darko Hren & Ana Marušić, 2015. "Peer Review Evaluation Process of Marie Curie Actions under EU’s Seventh Framework Programme for Research," PLOS ONE, Public Library of Science, vol. 10(6), pages 1-15, June.
    11. Lutz Bornmann, 2015. "Interrater reliability and convergent validity of F1000Prime peer review," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 66(12), pages 2415-2426, December.
    12. Díaz, Juan D. & Hansen, Erwin & Cabrera, Gabriel, 2021. "Economic drivers of commodity volatility: The case of copper," Resources Policy, Elsevier, vol. 73(C).
    13. Karmelavičius, Jaunius & Mikaliūnaitė-Jouvanceau, Ieva & Petrokaitė, Austėja, 2022. "Housing and credit misalignments in a two-market disequilibrium framework," ESRB Working Paper Series 135, European Systemic Risk Board.
    14. Mihai MUTASCU & Nicolae-Bogdan IANC & ALBERT LESSOUA, 2021. "Public debt and inequality in Sub-Saharan Africa: the case of EMCCA and WAEMU countries," LEO Working Papers / DR LEO 2909, Orleans Economics Laboratory / Laboratoire d'Economie d'Orleans (LEO), University of Orleans.
    15. Mathyn Vervaart & Eline Aas & Karl P. Claxton & Mark Strong & Nicky J. Welton & Torbjørn Wisløff & Anna Heath, 2023. "General-Purpose Methods for Simulating Survival Data for Expected Value of Sample Information Calculations," Medical Decision Making, vol. 43(5), pages 595-609, July.
    16. Ngandu Balekelayi & Solomon Tesfamariam, 2020. "Geoadditive Quantile Regression Model for Sewer Pipes Deterioration Using Boosting Optimization Algorithm," Sustainability, MDPI, vol. 12(20), pages 1-24, October.
    17. Marcin Błażejowski & Jacek Kwiatkowski & Paweł Kufel, 2020. "BACE and BMA Variable Selection and Forecasting for UK Money Demand and Inflation with Gretl," Econometrics, MDPI, vol. 8(2), pages 1-29, May.
    18. Dimitris Korobilis & Kenichi Shimizu, 2022. "Bayesian Approaches to Shrinkage and Sparse Estimation," Foundations and Trends(R) in Econometrics, now publishers, vol. 11(4), pages 230-354, June.
    19. Huihang Liu & Xinyu Zhang, 2023. "Frequentist model averaging for undirected Gaussian graphical models," Biometrics, The International Biometric Society, vol. 79(3), pages 2050-2062, September.
    20. Francisco Alonso & Sergio A. Useche & Eliseo Valle & Cristina Esteban & Javier Gene-Morales, 2021. "Could Road Safety Education (RSE) Help Parents Protect Children? Examining Their Driving Crashes with Children on Board," IJERPH, MDPI, vol. 18(7), pages 1-13, March.
