
Analyses of Model Fit and Robustness. A New Look at the PISA Scaling Model Underlying Ranking of Countries According to Reading Literacy

Author

Listed:
  • Svend Kreiner
  • Karl Christensen

Abstract

This paper addresses methodological issues concerning the scaling model used in the international comparison of student attainment in the Programme for International Student Assessment (PISA), specifically whether PISA’s ranking of countries is confounded by model misfit and differential item functioning (DIF). To determine this, we reanalyzed the publicly accessible data on reading skills from the 2006 PISA survey. We also examined whether the ranking of countries is robust to the errors of the scaling model. This was done by studying invariance across subscales, and by comparing ranks based on the scaling model with ranks based on models that take some of the flaws of PISA’s scaling model into account. Our analyses provide strong evidence of misfit of the PISA scaling model and very strong evidence of DIF. These findings do not support the claims that the country rankings reported by PISA are robust.

Copyright The Psychometric Society 2014
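As a rough illustration of what checking rank invariance across subscales can look like, the sketch below ranks countries on two disjoint item subsets and reports the largest rank change. It is a deliberately simplified stand-in: it uses mean proportion correct rather than the Rasch-type scaling model analyzed in the paper, and the country labels, item names, and values are hypothetical, not PISA data.

```python
from statistics import mean


def rank_countries(scores, items):
    """Rank countries (1 = best) by mean proportion correct on `items`."""
    means = {c: mean(s[i] for i in items) for c, s in scores.items()}
    ordered = sorted(means, key=means.get, reverse=True)
    return {c: pos + 1 for pos, c in enumerate(ordered)}


def max_rank_shift(rank_a, rank_b):
    """Largest absolute change in rank for any country between two rankings."""
    return max(abs(rank_a[c] - rank_b[c]) for c in rank_a)


# Hypothetical proportions correct per item (illustrative only, not PISA data).
scores = {
    "A": {"item1": 0.80, "item2": 0.60, "item3": 0.50, "item4": 0.70},
    "B": {"item1": 0.70, "item2": 0.75, "item3": 0.65, "item4": 0.50},
    "C": {"item1": 0.60, "item2": 0.50, "item3": 0.70, "item4": 0.60},
}

# Two disjoint "subscales" built from the same item pool.
ranks_first = rank_countries(scores, ["item1", "item2"])
ranks_second = rank_countries(scores, ["item3", "item4"])
shift = max_rank_shift(ranks_first, ranks_second)

print(ranks_first)   # ranking on the first subscale
print(ranks_second)  # ranking on the second subscale
print(shift)         # largest rank change between the two
```

If a ranking were invariant across subscales, the two dictionaries would agree and the shift would be zero. In this toy data the order changes with the item subset, which is the kind of sensitivity the abstract describes.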

Suggested Citation

  • Svend Kreiner & Karl Christensen, 2014. "Analyses of Model Fit and Robustness. A New Look at the PISA Scaling Model Underlying Ranking of Countries According to Reading Literacy," Psychometrika, Springer;The Psychometric Society, vol. 79(2), pages 210-231, April.
  • Handle: RePEc:spr:psycho:v:79:y:2014:i:2:p:210-231
    DOI: 10.1007/s11336-013-9347-z

    Download full text from publisher

    File URL: http://hdl.handle.net/10.1007/s11336-013-9347-z
    Download Restriction: Access to full text is restricted to subscribers.

    References listed on IDEAS

    1. Paul Rosenbaum, 1989. "Criterion-related construct validity," Psychometrika, Springer;The Psychometric Society, vol. 54(4), pages 625-633, September.
    2. Ray Adams & Alla Berezner & Maciej Jakubowski, 2010. "Analysis of PISA 2006 Preferred Items Ranking Using the Percent-Correct Method," OECD Education Working Papers 46, OECD Publishing.
    3. Giorgina Brown & John Micklewright & Sylke V. Schnepf & Robert Waldmann, 2007. "International surveys of educational achievement: how robust are the findings?," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 170(3), pages 623-646, July.
    4. Ivo Molenaar, 1983. "Some improved diagnostics for failure of the Rasch model," Psychometrika, Springer;The Psychometric Society, vol. 48(1), pages 49-72, March.
    5. Hendrikus Kelderman, 1984. "Loglinear Rasch model tests," Psychometrika, Springer;The Psychometric Society, vol. 49(2), pages 223-245, June.
    6. Erling Andersen, 1973. "A goodness of fit test for the Rasch model," Psychometrika, Springer;The Psychometric Society, vol. 38(1), pages 123-140, March.
    7. Henk Kelderman, 1989. "Item bias detection using loglinear IRT," Psychometrika, Springer;The Psychometric Society, vol. 54(4), pages 681-697, September.

    Citations


    Cited by:

    1. Krammer, Georg, 2019. "The Andersen likelihood ratio test with a random split criterion lacks power," OSF Preprints gu8sq, Center for Open Science.
    2. Daniele, Vittorio, 2021. "Socioeconomic inequality and regional disparities in educational achievement: The role of relative poverty," Intelligence, Elsevier, vol. 84(C).
    3. Maarten Marsman & Gunter Maris & Timo Bechger & Cees Glas, 2016. "What can we learn from Plausible Values?," Psychometrika, Springer;The Psychometric Society, vol. 81(2), pages 274-289, June.
    4. Giulia Grisolia & Umberto Lucia & Marco Filippo Torchio, 2022. "Sustainable Development and Workers Ability: Considerations on the Education Index in the Human Development Index," Sustainability, MDPI, vol. 14(14), pages 1-18, July.
    5. Engzell, Per, 2017. "What Do Books in the Home Proxy For? A Cautionary Tale," Working Paper Series 1/2016, Stockholm University, Swedish Institute for Social Research.
    6. Robert J. Zwitser & S. Sjoerd F. Glaser & Gunter Maris, 2017. "Monitoring Countries in a Changing World: A New Look at DIF in International Surveys," Psychometrika, Springer;The Psychometric Society, vol. 82(1), pages 210-232, March.
    7. Laura Zieger & John Jerrim & Jake Anders & Nikki Shure, 2020. "Conditioning: How background variables can influence PISA scores," CEPEO Working Paper Series 20-09, UCL Centre for Education Policy and Equalising Opportunities, revised Apr 2020.
    8. Curt Hagquist & Raili Välimaa & Nina Simonsen & Sakari Suominen, 2017. "Differential Item Functioning in Trend Analyses of Adolescent Mental Health – Illustrative Examples Using HBSC-Data from Finland," Child Indicators Research, Springer;The International Society of Child Indicators (ISCI), vol. 10(3), pages 673-691, September.
    9. Alexandra Valéria Sándor, 2020. "Motivations and Self-Perceived Career Prospects of Undergraduate Sociology Students," European Journal of Social Sciences Education and Research Articles, Revistia Research and Publishing, vol. 7, September.
    10. Schnepf, Sylke, 2018. "Insights into survey errors of large scale educational achievement surveys," Working Papers 2018-05, Joint Research Centre, European Commission.
    11. Liu, Ji & Steiner-Khamsi, Gita, 2020. "Human Capital Index and the hidden penalty for non-participation in ILSAs," International Journal of Educational Development, Elsevier, vol. 73(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Clemens Draxler, 2018. "Bayesian conditional inference for Rasch models," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 102(2), pages 245-262, April.
    2. Clemens Draxler, 2010. "Sample Size Determination for Rasch Model Tests," Psychometrika, Springer;The Psychometric Society, vol. 75(4), pages 708-724, December.
    3. Yuguo Chen & Dylan Small, 2005. "Exact tests for the Rasch model via sequential importance sampling," Psychometrika, Springer;The Psychometric Society, vol. 70(1), pages 11-30, March.
    4. Svend Kreiner & Karl Christensen, 2011. "Item Screening in Graphical Loglinear Rasch Models," Psychometrika, Springer;The Psychometric Society, vol. 76(2), pages 228-256, April.
    5. Clemens Draxler & Rainer Alexandrowicz, 2015. "Sample Size Determination Within the Scope of Conditional Maximum Likelihood Estimation with Special Focus on Testing the Rasch Model," Psychometrika, Springer;The Psychometric Society, vol. 80(4), pages 897-919, December.
    6. Cees Glas, 1999. "Modification indices for the 2-PL and the nominal response model," Psychometrika, Springer;The Psychometric Society, vol. 64(3), pages 273-294, September.
    7. C. Glas & Anna Dagohoy, 2007. "A Person Fit Test For IRT Models For Polytomous Items," Psychometrika, Springer;The Psychometric Society, vol. 72(2), pages 159-180, June.
    8. Betina Ristorp Andersen & Maria Birkvad Rasmussen & Karl Bang Christensen & Kirsten G Engel & Charlotte Ringsted & Ellen Løkkegaard & Martin G Tolsgaard, 2020. "Making the best of the worst: Care quality during emergency cesarean sections," PLOS ONE, Public Library of Science, vol. 15(2), pages 1-13, February.
    9. Francesco Bartolucci, 2007. "A class of multidimensional IRT models for testing unidimensionality and clustering items," Psychometrika, Springer;The Psychometric Society, vol. 72(2), pages 141-157, June.
    10. Herbert Hoijtink & Ivo Molenaar, 1994. "An item response model with single peaked item characteristic curves: The PARELLA model," Quality & Quantity: International Journal of Methodology, Springer, vol. 28(1), pages 99-116, February.
    11. Tine Nielsen, 2021. "Psychometric evaluation of the Danish language version of the field practice experiences questionnaire for students in teacher education (FPE-DK) using item analysis according to the Rasch model," PLOS ONE, Public Library of Science, vol. 16(10), pages 1-23, October.
    12. Erling Andersen, 1995. "Residualanalysis in the polytomous Rasch model," Psychometrika, Springer;The Psychometric Society, vol. 60(3), pages 375-393, September.
    13. C. Schnohr & S. Kreiner & E. Due & C. Currie & W. Boyce & F. Diderichsen, 2008. "Differential Item Functioning of a Family Affluence Scale: Validation Study on Data from HBSC 2001/02," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 89(1), pages 79-95, October.
    14. Karl Klauer, 1991. "An exact and optimal standardized person test for assessing consistency with the Rasch model," Psychometrika, Springer;The Psychometric Society, vol. 56(2), pages 213-228, June.
    15. Henk Kelderman & Carl Rijkes, 1994. "Loglinear multidimensional IRT models for polytomously scored items," Psychometrika, Springer;The Psychometric Society, vol. 59(2), pages 149-176, June.
    16. Thorsten Meiser, 1996. "Loglinear Rasch models for the analysis of stability and change," Psychometrika, Springer;The Psychometric Society, vol. 61(4), pages 629-645, December.
    17. Krammer, Georg, 2019. "The Andersen likelihood ratio test with a random split criterion lacks power," OSF Preprints gu8sq, Center for Open Science.
    18. Alberto Maydeu-Olivares & Rosa Montaño, 2013. "How Should We Assess the Fit of Rasch-Type Models? Approximating the Power of Goodness-of-Fit Statistics in Categorical Data Analysis," Psychometrika, Springer;The Psychometric Society, vol. 78(1), pages 116-133, January.
    19. Lionel WILNER, 2019. "The Dynamics of Individual Happiness," Working Papers 2019-18, Center for Research in Economics and Statistics.
    20. Francisco H. G. Ferreira & Jérémie Gignoux, 2014. "The Measurement of Educational Inequality: Achievement and Opportunity," The World Bank Economic Review, World Bank, vol. 28(2), pages 210-246.
