Estimating Classification Errors Under Edit Restrictions in Composite Survey-Register Data Using Multiple Imputation Latent Class Modelling (MILC)

Author

Listed:
  • Boeschoten Laura

    (Tilburg University, Tilburg School of Social and Behavioral Sciences – Methodology and Statistics, PO Box 90153, Tilburg 5000 LE, The Netherlands, and Centraal Bureau voor de Statistiek – Process development and methodology, Henri Faasdreef 312, Den Haag 2492 JP, The Netherlands.)

  • Oberski Daniel

    (Universiteit Utrecht – Social and Behavioural Sciences, Utrecht, Utrecht, The Netherlands, and Tilburg University, Tilburg School of Social and Behavioral Sciences – Methodology and Statistics, Tilburg, The Netherlands.)

  • de Waal Ton

    (Centraal Bureau voor de Statistiek – Process development and methodology, Den Haag, The Netherlands, and Tilburg University, Tilburg School of Social and Behavioral Sciences – Methodology and Statistics, Tilburg, The Netherlands.)

Abstract

Both registers and surveys can contain classification errors. These errors can be estimated by making use of a composite data set. We propose a new method based on latent class modelling to estimate the number of classification errors across several sources while taking into account impossible combinations with scores on other variables. Furthermore, the latent class model, by multiply imputing a new variable, enhances the quality of statistics based on the composite data set. The performance of this method is investigated by a simulation study, which shows that whether or not the method can be applied depends on the entropy R² of the latent class model and the type of analysis a researcher is planning to do. Finally, the method is applied to public data from Statistics Netherlands.
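
The entropy R² mentioned in the abstract summarises how well a latent class model separates its classes. As a rough illustration only, the sketch below computes one commonly used form, 1 - EN / (N log K), where EN is the total entropy of the posterior class-membership probabilities, N the number of records and K the number of classes. This normalisation is an assumption and may differ from the definition used in the article; the function name `entropy_r2` and the example values are hypothetical.

```python
import numpy as np


def entropy_r2(posteriors):
    """Entropy R^2 under the assumed form 1 - EN / (N * log K).

    `posteriors` is an N x K array of posterior class-membership
    probabilities (each row sums to 1, K >= 2). This is an illustrative
    definition, not necessarily the one used in the article.
    """
    p = np.asarray(posteriors, dtype=float)
    n, k = p.shape
    # Replace zeros by 1 inside the log so that 0 * log(0) contributes 0.
    safe = np.where(p > 0, p, 1.0)
    total_entropy = -np.sum(p * np.log(safe))
    return 1.0 - total_entropy / (n * np.log(k))


# Hypothetical posteriors for three records and two classes.
example = [[0.9, 0.1], [0.2, 0.8], [1.0, 0.0]]
print(entropy_r2(example))
```

Under this definition, values near 1 mean that records are assigned to classes with little uncertainty, and values near 0 mean the posteriors are close to uniform.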

Suggested Citation

  • Boeschoten Laura & Oberski Daniel & de Waal Ton, 2017. "Estimating Classification Errors Under Edit Restrictions in Composite Survey-Register Data Using Multiple Imputation Latent Class Modelling (MILC)," Journal of Official Statistics, Sciendo, vol. 33(4), pages 921-962, December.
  • Handle: RePEc:vrs:offsta:v:33:y:2017:i:4:p:921-962:n:5
    DOI: 10.1515/jos-2017-0044

    Download full text from publisher

    File URL: https://doi.org/10.1515/jos-2017-0044
    Download Restriction: no


    References listed on IDEAS

    1. Forcina, Antonio, 2008. "Identifiability of extended latent class models with individual covariates," Computational Statistics & Data Analysis, Elsevier, vol. 52(12), pages 5263-5268, August.
    2. Li‐Chun Zhang, 2012. "Topics of statistical theory for register‐based statistics and data integration," Statistica Neerlandica, Netherlands Society for Statistics and Operations Research, vol. 66(1), pages 41-63, February.
    3. Daniël W. Palm & L. Andries Ark & Jeroen K. Vermunt, 2016. "Divisive Latent Class Modeling as a Density Estimation Method for Categorical Data," Journal of Classification, Springer;The Classification Society, vol. 33(1), pages 52-72, April.

    Cited by:

    1. L. Boeschoten & M. A. Croon & D. L. Oberski, 2019. "A Note on Applying the BCH Method Under Linear Equality and Inequality Constraints," Journal of Classification, Springer;The Classification Society, vol. 36(3), pages 566-575, October.
    2. Peter G. M. van der Heijden & Maarten Cruyff & Paul A. Smith & Christine Bycroft & Patrick Graham & Nathaniel Matheson‐Dunning, 2022. "Multiple system estimation using covariates having missing values and measurement error: Estimating the size of the Māori population in New Zealand," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 185(1), pages 156-177, January.
    3. Alexandru Cernat & Daniel L. Oberski, 2022. "Estimating stochastic survey response errors using the multitrait‐multierror model," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 185(1), pages 134-155, January.
    4. Ton de Waal & Arnout van Delden & Sander Scholtus, 2020. "Multi‐source Statistics: Basic Situations and Methods," International Statistical Review, International Statistical Institute, vol. 88(1), pages 203-228, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Bakker Bart F.M. & Heijden Peter G.M. van der & Scholtus Sander, 2015. "Preface," Journal of Official Statistics, Sciendo, vol. 31(3), pages 349-355, September.
    2. Fulvia Cerroni & Grazia Di Bella & Lorena Galiè, 2014. "Evaluating administrative data quality as input of the statistical production process," Rivista di statistica ufficiale, ISTAT - Italian National Institute of Statistics - (Rome, ITALY), vol. 16(1-2), pages 117-146.
    3. Fabrizio Antolini & Laura Grassini, 2020. "Methodological problems in the economic measurement of tourism: the need for new sources of information," Quality & Quantity: International Journal of Methodology, Springer, vol. 54(5), pages 1769-1780, December.
    4. Elżbieta Gołata, 2016. "Shift In Methodology And Population Census Quality," Statistics in Transition New Series, Polish Statistical Association, vol. 17(4), pages 631-658, December.
    5. Dardanoni, V & Li Donni, P, 2008. "Testing For Asymmetric Information In Insurance Markets With Unobservable Types," Health, Econometrics and Data Group (HEDG) Working Papers 08/26, HEDG, c/o Department of Economics, University of York.
    6. Li-Chun Zhang & Ib Thomsen & Øyvin Kleven, 2013. "On the Use of Auxiliary and Paradata for Dealing With Non-sampling Errors in Household Surveys," International Statistical Review, International Statistical Institute, vol. 81(2), pages 270-288, August.
    7. Roberto Colombi & Sabrina Giordano, 2019. "Likelihood-based tests for a class of misspecified finite mixture models for ordinal categorical data," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 28(4), pages 1175-1202, December.
    8. Dardanoni, Valentino & Li Donni, Paolo, 2012. "Incentive and selection effects of Medigap insurance on inpatient care," Journal of Health Economics, Elsevier, vol. 31(3), pages 457-470.
    9. Douglas L. Steinley, 2018. "Editorial," Journal of Classification, Springer;The Classification Society, vol. 35(1), pages 1-4, April.
    10. Ton de Waal & Arnout van Delden & Sander Scholtus, 2020. "Multi‐source Statistics: Basic Situations and Methods," International Statistical Review, International Statistical Institute, vol. 88(1), pages 203-228, April.
    11. Elżbieta Gołata, 2015. "Sae Education Challenges To Academics And Nsi," Statistics in Transition New Series, Polish Statistical Association, vol. 16(4), pages 611-630, December.
    12. van Delden Arnout & Lorenc Boris & Struijs Peter & Zhang Li-Chun, 2018. "Letter to the Editor," Journal of Official Statistics, Sciendo, vol. 34(2), pages 573-580, June.
    13. Alfonso Carfora & Giuseppe Scandurra & Antonio Thomas, 2021. "Determinants of environmental innovations supporting small‐ and medium‐sized enterprises sustainable development," Business Strategy and the Environment, Wiley Blackwell, vol. 30(5), pages 2621-2636, July.
    14. Silvia Biffignandi & Alessandro Zeli, 2021. "Longitudinal business data construction and quality: Two different approaches," Statistica Neerlandica, Netherlands Society for Statistics and Operations Research, vol. 75(2), pages 92-114, May.
    15. Beręsewicz Maciej, 2019. "Correlates of Representation Errors in Internet Data Sources for Real Estate Market," Journal of Official Statistics, Sciendo, vol. 35(3), pages 509-529, September.
    16. Gołata Elżbieta, 2015. "Sae Education Challenges to Academics and NSI," Statistics in Transition New Series, Polish Statistical Association, vol. 16(4), pages 611-630, December.
    17. Paolo Li Donni & Ranjeeta Thomas, 2020. "Latent class models for multiple ordered categorical health data: testing violation of the local independence assumption," Empirical Economics, Springer, vol. 59(4), pages 1903-1931, October.
    18. Daniel Oberski & Geert Kollenburg & Jeroen Vermunt, 2013. "A Monte Carlo evaluation of three methods to detect local dependence in binary data latent class models," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 7(3), pages 267-279, September.
    19. Maciej Beręsewicz & Greta Białkowska & Krzysztof Marcinkowski & Magdalena Maślak & Piotr Opiela & Robert Pater & Katarzyna Zadroga, 2019. "Enhancing the Demand for Labour survey by including skills from online job advertisements using model-assisted calibration," Papers 1908.06731, arXiv.org.
    20. Fabrizio Antolini & Laura Grassini, 2020. "Issues in Tourism Statistics: A Critical Review," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 150(3), pages 1021-1042, August.
