
Item pool quality control in educational testing: change point model, compound risk, and sequential detection

Author

Listed:
  • Chen, Yunxiao
  • Lee, Yi-Hsuan
  • Li, Xiaoou

Abstract

In standardized educational testing, test items are reused in multiple test administrations. To ensure the validity of test scores, the psychometric properties of items should remain unchanged over time. In this article, we consider the sequential monitoring of test items, in particular, the detection of abrupt changes to their psychometric properties, where a change can be caused by, for example, leakage of the item or change of the corresponding curriculum. We propose a statistical framework for the detection of abrupt changes in individual items. This framework consists of (1) a multistream Bayesian change point model describing sequential changes in items, (2) a compound risk function quantifying the risk in sequential decisions, and (3) sequential decision rules that control the compound risk. Throughout the sequential decision process, the proposed decision rule balances the trade-off between two sources of errors, the false detection of prechange items, and the nondetection of postchange items. An item-specific monitoring statistic is proposed based on an item response theory model that eliminates the confounding from the examinee population which changes over time. Sequential decision rules and their theoretical properties are developed under two settings: the oracle setting where the Bayesian change point model is completely known and a more realistic setting where some parameters of the model are unknown. Simulation studies are conducted under settings that mimic real operational tests.
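As a rough illustration of the three components named in the abstract, the following Python sketch tracks, for each item, the posterior probability that a change has already occurred, and then keeps the largest set of items whose average posterior stays below a tolerance. It is a generic Shiryaev-style recursion under strong simplifying assumptions and is not the authors' procedure: each item yields one binary response per time point, the pre- and post-change success probabilities `p0` and `p1` are known, the change time has a geometric prior with rate `rho`, and the "mean posterior among kept items" criterion only stands in for the paper's compound risk. The function names are hypothetical, and the IRT-based adjustment for a changing examinee population is omitted.

```python
def update_posterior(prev, x, p0, p1, rho):
    """One Bayesian update of P(change has occurred) for a single item.

    prev : posterior probability of a change before seeing x
    x    : new binary response (1 = correct, 0 = incorrect)
    p0   : assumed P(correct) before the change
    p1   : assumed P(correct) after the change
    rho  : assumed per-step prior probability that the change happens now
    """
    # Prior probability that the change has occurred by the current step.
    prior_changed = prev + (1.0 - prev) * rho
    # Bernoulli likelihoods of x under the post- and pre-change regimes.
    like_post = p1 if x == 1 else 1.0 - p1
    like_pre = p0 if x == 1 else 1.0 - p0
    numerator = prior_changed * like_post
    denominator = numerator + (1.0 - prior_changed) * like_pre
    return numerator / denominator


def select_to_keep(posteriors, alpha):
    """Keep the largest set of items whose mean posterior is at most alpha.

    Items are added in increasing order of posterior; every item not
    returned is flagged as changed. This is an illustrative compound
    criterion, assumed here for the sketch.
    """
    order = sorted(range(len(posteriors)), key=lambda i: posteriors[i])
    keep, total = [], 0.0
    for i in order:
        if (total + posteriors[i]) / (len(keep) + 1) <= alpha:
            total += posteriors[i]
            keep.append(i)
        else:
            break
    return sorted(keep)
```

In use, `update_posterior` would be applied to every item still in the pool at each administration, and `select_to_keep` would then decide which items remain; items flagged once are not monitored further in this simplified version.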

Suggested Citation

  • Chen, Yunxiao & Lee, Yi-Hsuan & Li, Xiaoou, 2022. "Item pool quality control in educational testing: change point model, compound risk, and sequential detection," LSE Research Online Documents on Economics 112498, London School of Economics and Political Science, LSE Library.
  • Handle: RePEc:ehl:lserod:112498
    Download full text from publisher

    File URL: http://eprints.lse.ac.uk/112498/
    File Function: Open access version.
    Download Restriction: no

    References listed on IDEAS

    1. Yi-Hsuan Lee & Charles Lewis, 2021. "Monitoring Item Performance With CUSUM Statistics in Continuous Testing," Journal of Educational and Behavioral Statistics, vol. 46(5), pages 611-648, October.
    2. Edison M. Choe & Jinming Zhang & Hua-Hua Chang, 2018. "Sequential Detection of Compromised Items Using Response Times in Computerized Adaptive Testing," Psychometrika, Springer;The Psychometric Society, vol. 83(3), pages 650-673, September.
    3. Efron B. & Tibshirani R. & Storey J.D. & Tusher V., 2001. "Empirical Bayes Analysis of a Microarray Experiment," Journal of the American Statistical Association, American Statistical Association, vol. 96, pages 1151-1160, December.
    4. Yi-Hsuan Lee & Alina Davier, 2013. "Monitoring Scale Scores over Time via Quality Control Charts, Model-Based Approaches, and Time Series Techniques," Psychometrika, Springer;The Psychometric Society, vol. 78(3), pages 557-575, July.
    5. Efron, Bradley, 2004. "Large-Scale Simultaneous Hypothesis Testing: The Choice of a Null Hypothesis," Journal of the American Statistical Association, American Statistical Association, vol. 99, pages 96-104, January.
    6. Y. Mei, 2010. "Efficient scalable schemes for monitoring a large number of data streams," Biometrika, Biometrika Trust, vol. 97(2), pages 419-433.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Yunxiao Chen & Yi-Hsuan Lee & Xiaoou Li, 2022. "Item Pool Quality Control in Educational Testing: Change Point Model, Compound Risk, and Sequential Detection," Journal of Educational and Behavioral Statistics, vol. 47(3), pages 322-352, June.
    2. Pounds Stanley B. & Gao Cuilan L. & Zhang Hui, 2012. "Empirical Bayesian Selection of Hypothesis Testing Procedures for Analysis of Sequence Count Expression Data," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 11(5), pages 1-32, October.
    3. Kline, Patrick & Walters, Christopher, 2019. "Audits as Evidence: Experiments, Ensembles, and Enforcement," Institute for Research on Labor and Employment, Working Paper Series qt3z72m9kn, Institute of Industrial Relations, UC Berkeley.
    4. He, Yi & Pan, Wei & Lin, Jizhen, 2006. "Cluster analysis using multivariate normal mixture models to detect differential gene expression with microarray data," Computational Statistics & Data Analysis, Elsevier, vol. 51(2), pages 641-658, November.
    5. Gordon, Alexander & Chen, Linlin & Glazko, Galina & Yakovlev, Andrei, 2009. "Balancing type one and two errors in multiple testing for differential expression of genes," Computational Statistics & Data Analysis, Elsevier, vol. 53(5), pages 1622-1629, March.
    6. Montazeri Zahra & Yanofsky Corey M. & Bickel David R., 2010. "Shrinkage Estimation of Effect Sizes as an Alternative to Hypothesis Testing Followed by Estimation in High-Dimensional Biology: Applications to Differential Gene Expression," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 9(1), pages 1-33, June.
    7. Bickel David R., 2012. "Empirical Bayes Interval Estimates that are Conditionally Equal to Unadjusted Confidence Intervals or to Default Prior Credibility Intervals," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 11(3), pages 1-34, February.
    8. Hyeon-Ah Kang, 2023. "Sequential Generalized Likelihood Ratio Tests for Online Item Monitoring," Psychometrika, Springer;The Psychometric Society, vol. 88(2), pages 672-696, June.
    9. Leek Jeffrey T & Storey John D., 2011. "The Joint Null Criterion for Multiple Hypothesis Tests," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 10(1), pages 1-22, June.
    10. Khalili, Abbas & Huang, Tim & Lin, Shili, 2009. "A robust unified approach to analyzing methylation and gene expression data," Computational Statistics & Data Analysis, Elsevier, vol. 53(5), pages 1701-1710, March.
    11. Wesley Tansey & Yixin Wang & Raul Rabadan & David Blei, 2020. "Double Empirical Bayes Testing," International Statistical Review, International Statistical Institute, vol. 88(S1), pages 91-113, December.
    12. Campbell R. Harvey & Yan Liu & Heqing Zhu, 2014. ". . . and the Cross-Section of Expected Returns," NBER Working Papers 20592, National Bureau of Economic Research, Inc.
    13. Sairam Rayaprolu & Zhiyi Chi, 2021. "False Discovery Variance Reduction in Large Scale Simultaneous Hypothesis Tests," Methodology and Computing in Applied Probability, Springer, vol. 23(3), pages 711-733, September.
    14. T. Tony Cai & Wenguang Sun & Weinan Wang, 2019. "Covariate‐assisted ranking and screening for large‐scale two‐sample inference," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 81(2), pages 187-234, April.
    15. Yi-Hsuan Lee & Charles Lewis, 2021. "Monitoring Item Performance With CUSUM Statistics in Continuous Testing," Journal of Educational and Behavioral Statistics, vol. 46(5), pages 611-648, October.
    16. Chang Yu & Daniel Zelterman, 2020. "Distributions associated with simultaneous multiple hypothesis testing," Journal of Statistical Distributions and Applications, Springer, vol. 7(1), pages 1-17, December.
    17. Chen, Yunxiao & Lu, Yan & Moustaki, Irini, 2022. "Detection of two-way outliers in multivariate data and application to cheating detection in educational tests," LSE Research Online Documents on Economics 112499, London School of Economics and Political Science, LSE Library.
    18. Ali Karimnezhad, 2022. "A simple yet efficient method of local false discovery rate estimation designed for genome-wide association data analysis," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 31(1), pages 159-180, March.
    19. Joshua Habiger & Edsel Peña, 2011. "Randomised p-values and nonparametric procedures in multiple testing," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 23(3), pages 583-604.
    20. Bradley Efron, 2007. "Doing thousands of hypothesis tests at the same time," Metron - International Journal of Statistics, Dipartimento di Statistica, Probabilità e Statistiche Applicate - University of Rome, vol. 0(1), pages 3-21.

    More about this item

    Keywords

    standardized testing; test security; item preknowledge; sequential change point detection; multistream data; compound decision; Sage deal

    JEL classification:

    • C1 - Mathematical and Quantitative Methods - Econometric and Statistical Methods and Methodology: General

    NEP fields

    This paper has been announced in the following NEP Reports:

    Statistics
