
Item Complexity: A Neglected Psychometric Feature of Test Items?

Authors

  • Daniel M. Bolt (University of Wisconsin, Madison)
  • Xiangyi Liao (University of Wisconsin, Madison)

Abstract

Despite its frequent consideration in test development, item complexity receives little attention in the psychometric modeling of item response data. In this address, I consider how variability in item complexity can be expected to emerge in the form of item characteristic curve (ICC) asymmetry, and how such effects may significantly influence applications of item response theory, especially those that assume interval-level properties of the latent proficiency metric and groups that vary substantially in mean proficiency. One application is the score gain deceleration phenomenon often observed in vertical scaling contexts, especially in subject areas like math or second language acquisition. It is demonstrated how the application of symmetric IRT models in the presence of complexity-induced positive ICC asymmetry can be a likely cause. A second application concerns the positive correlation between DIF and difficulty commonly seen in verbal proficiency (and other subject area) tests where problem-solving complexity is minimal and proficiency-related guessing effects are likely pronounced. Here we suggest negative ICC asymmetry as a probable cause and apply sensitivity analyses to demonstrate the ease with which such correlations disappear when allowing for negative ICC asymmetry. Unfortunately, the presence of systematic forms of ICC asymmetry is easily missed due to the considerable flexibility afforded by latent trait metrics in IRT. Speculation is provided regarding other applications for which attending to ICC asymmetry may prove useful.
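
The asymmetry mechanism the abstract refers to can be made concrete with Samejima's (2000) logistic positive exponent (LPE) family, in which the usual logistic ICC is raised to an acceleration power. The minimal Python sketch below is illustrative only; the function name and parameter values are assumptions for demonstration, not material or code from the article.

    # Minimal sketch (not the authors' code) of Samejima's (2000) logistic
    # positive exponent (LPE) item characteristic curve. Parameter values
    # are illustrative, not estimates from the article.
    import numpy as np

    def lpe_icc(theta, a=1.0, b=0.0, xi=1.0):
        """P(correct | theta) = logistic(a * (theta - b)) ** xi.

        xi > 1 gives a positively asymmetric ICC (the pattern the article
        associates with complex, conjunctively solved items); xi < 1 gives
        a negatively asymmetric ICC (guessing-prone items); xi = 1 recovers
        the symmetric two-parameter logistic curve.
        """
        return (1.0 / (1.0 + np.exp(-a * (theta - b)))) ** xi

    theta = np.linspace(-4, 4, 9)
    for xi in (0.5, 1.0, 3.0):
        print(f"xi={xi}:", np.round(lpe_icc(theta, a=1.2, b=0.0, xi=xi), 3))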

Suggested Citation

  • Daniel M. Bolt & Xiangyi Liao, 2022. "Item Complexity: A Neglected Psychometric Feature of Test Items?," Psychometrika, Springer;The Psychometric Society, vol. 87(4), pages 1195-1213, December.
  • Handle: RePEc:spr:psycho:v:87:y:2022:i:4:d:10.1007_s11336-022-09842-0
    DOI: 10.1007/s11336-022-09842-0

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11336-022-09842-0
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s11336-022-09842-0?utm_source=ideas
    LibKey link: if access is restricted and your library uses this service, LibKey will redirect you to a copy you can access through your library subscription.

    As access to this document is restricted, you may want to search for a different version of it.


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Xiangyi Liao & Daniel M. Bolt, 2021. "Item Characteristic Curve Asymmetry: A Better Way to Accommodate Slips and Guesses Than a Four-Parameter Model?," Journal of Educational and Behavioral Statistics, , vol. 46(6), pages 753-775, December.
    2. Cory Koedel & Rebecca Leatherman & Eric Parsons, 2012. "Test Measurement Error and Inference from Value-Added Models," The B.E. Journal of Economic Analysis & Policy, De Gruyter, vol. 12(1), pages 1-37, November.
    3. Seth Gershenson, 2016. "Performance Standards and Employee Effort: Evidence From Teacher Absences," Journal of Policy Analysis and Management, John Wiley & Sons, Ltd., vol. 35(3), pages 615-638, June.
    4. Sonia Bhalotra & Martin Karlsson & Therese Nilsson & Nina Schwarz, 2022. "Infant Health, Cognitive Performance, and Earnings: Evidence from Inception of the Welfare State in Sweden," The Review of Economics and Statistics, MIT Press, vol. 104(6), pages 1138-1156, November.
    5. Nirav Mehta, 2019. "Measuring quality for use in incentive schemes: The case of “shrinkage” estimators," Quantitative Economics, Econometric Society, vol. 10(4), pages 1537-1577, November.
    6. Sora Lee & Daniel M. Bolt, 2018. "Asymmetric Item Characteristic Curves and Item Complexity: Insights from Simulation and Real Data Analyses," Psychometrika, Springer;The Psychometric Society, vol. 83(2), pages 453-475, June.
    7. Peter Bickel & Steven Buyske & Huahua Chang & Zhiliang Ying, 2001. "On maximizing item information and matching difficulty with ability," Psychometrika, Springer;The Psychometric Society, vol. 66(1), pages 69-77, March.
    8. Derek C. Briggs & Ben Domingue, 2013. "The Gains From Vertical Scaling," Journal of Educational and Behavioral Statistics, , vol. 38(6), pages 551-576, December.
    9. Alexander Robitzsch, 2020. "Lp Loss Functions in Invariance Alignment and Haberman Linking with Few or Many Groups," Stats, MDPI, vol. 3(3), pages 1-38, August.
    10. Dylan Molenaar, 2015. "Heteroscedastic Latent Trait Models for Dichotomous Data," Psychometrika, Springer;The Psychometric Society, vol. 80(3), pages 625-644, September.
    11. Gadi Barlevy & Derek Neal, 2012. "Pay for Percentile," American Economic Review, American Economic Association, vol. 102(5), pages 1805-1831, August.
    12. Donald Boyd & Hamilton Lankford & Susanna Loeb & James Wyckoff, 2013. "Measuring Test Measurement Error," Journal of Educational and Behavioral Statistics, , vol. 38(6), pages 629-663, December.
    13. Brendan Houng & Moshe Justman, 2013. "Comparing Least-Squares Value-Added Analysis and Student Growth Percentile Analysis for Evaluating Student Progress and Estimating School Effects," Melbourne Institute Working Paper Series wp2013n07, Melbourne Institute of Applied Economic and Social Research, The University of Melbourne.
    14. David M. Quinn & Andrew D. Ho, 2021. "Ordinal Approaches to Decomposing Between-Group Test Score Disparities," Journal of Educational and Behavioral Statistics, , vol. 46(4), pages 466-500, August.
    15. J. R. Lockwood & Daniel F. McCaffrey, 2014. "Correcting for Test Score Measurement Error in ANCOVA Models for Estimating Treatment Effects," Journal of Educational and Behavioral Statistics, , vol. 39(1), pages 22-52, February.
    16. Timothy N. Bond & Kevin Lang, 2013. "The Evolution of the Black-White Test Score Gap in Grades K–3: The Fragility of Results," The Review of Economics and Statistics, MIT Press, vol. 95(5), pages 1468-1479, December.
    17. Cory Koedel & Mark Ehlert & Eric Parsons & Michael Podgursky, 2012. "Selecting Growth Measures for School and Teacher Evaluations," Working Papers 1210, Department of Economics, University of Missouri.
    18. Rikkert M. van der Lans & Ridwan Maulana & Michelle Helms-Lorenz & Carmen-María Fernández-García & Seyeoung Chun & Thelma de Jager & Yulia Irnidayanti & Mercedes Inda-Caro & Okhwa Lee & Thys Coetze, 2021. "Student Perceptions of Teaching Quality in Five Countries: A Partial Credit Model Approach to Assess Measurement Invariance," SAGE Open, , vol. 11(3), pages 21582440211, August.
    19. Schwerdt, Guido & West, Martin R. & Winters, Marcus A., 2017. "The effects of test-based retention on student outcomes over time: Regression discontinuity evidence from Florida," Journal of Public Economics, Elsevier, vol. 152(C), pages 154-169.
    20. González, Jorge, 2014. "SNSequate: Standard and Nonstandard Statistical Models and Methods for Test Equating," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 59(i07).

    More about this item

    Keywords

    Item complexity; Item response theory



    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.