
A Bayesian Approach Towards Missing Covariate Data in Multilevel Latent Regression Models

Author

Listed:
  • Christian Aßmann

    (Leibniz Institute for Educational Trajectories, Bamberg; Otto-Friedrich-Universität Bamberg)

  • Jean-Christoph Gaasch

    (Otto-Friedrich-Universität Bamberg)

  • Doris Stingl

    (Otto-Friedrich-Universität Bamberg)

Abstract

Measuring latent traits and investigating their relations to a potentially large set of explanatory variables is typical in psychology, economics, and the social sciences. Such analyses often rely on survey data from large-scale studies that involve hierarchical structures and missing values in the covariates considered. This paper proposes a Bayesian estimation approach, based on data augmentation, for handling missing values in multilevel latent regression models. Population heterogeneity is modeled via multiple groups enriched with random intercepts. Bayesian estimation is implemented as a Markov chain Monte Carlo sampling scheme. To handle missing values, the sampling scheme is augmented with draws from the full conditional distributions of the missing values. We suggest modeling these full conditional distributions with non-parametric classification and regression trees. This makes it possible to use information from latent quantities that function as sufficient statistics. A simulation study reveals that this Bayesian approach provides valid inference and outperforms complete case analysis and multiple imputation in terms of statistical efficiency and computation time. An empirical illustration using data on mathematical competencies demonstrates the usefulness of the suggested approach.
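The augmentation idea described in the abstract can be illustrated with a deliberately small sketch. It is not the authors' implementation: it assumes a single-level toy model with one continuous indicator y_i ~ N(theta_i, 1), a latent regression theta_i ~ N(gamma * x_i, 1) and a N(0, 100) prior on gamma, and it replaces a full classification and regression tree with a depth-one split on the latent trait. All function names (`stump_threshold`, `draw_missing`, `gibbs`, `simulate`) are hypothetical and chosen for illustration only.

```python
import math
import random


def stump_threshold(pairs, min_leaf=5):
    """Split point on the predictor that minimises within-leaf squared error.

    `pairs` holds (predictor, value) tuples. Returns None if no valid split exists.
    """
    pairs = sorted(pairs)
    n = len(pairs)
    total = sum(v for _, v in pairs)
    left_sum, best_thr, best_gain = 0.0, None, -1.0
    for k in range(1, n):
        left_sum += pairs[k - 1][1]
        if k < min_leaf or n - k < min_leaf or pairs[k - 1][0] == pairs[k][0]:
            continue
        # Minimising squared error is equivalent to maximising this quantity.
        gain = left_sum ** 2 / k + (total - left_sum) ** 2 / (n - k)
        if gain > best_gain:
            best_thr, best_gain = (pairs[k - 1][0] + pairs[k][0]) / 2, gain
    return best_thr


def draw_missing(theta, x_obs, rng, min_leaf=5):
    """Fill each missing covariate (None) with a random donor from its leaf."""
    donors = [(t, v) for t, v in zip(theta, x_obs) if v is not None]
    thr = stump_threshold(donors, min_leaf)
    if thr is None:  # too few donors to split: use all observed values
        left = right = [v for _, v in donors]
    else:
        left = [v for t, v in donors if t <= thr]
        right = [v for t, v in donors if t > thr]
    return [v if v is not None
            else rng.choice(left if thr is None or t <= thr else right)
            for t, v in zip(theta, x_obs)]


def gibbs(y, x_obs, n_iter=300, burn_in=100, seed=1):
    """Gibbs sampler for the toy model, augmented with draws of missing x."""
    rng = random.Random(seed)
    theta, gamma, draws = list(y), 0.0, []
    for it in range(n_iter):
        # Augmentation step: tree-based draw of missing covariates given theta.
        x = draw_missing(theta, x_obs, rng)
        # theta_i | y_i, x_i, gamma is N((y_i + gamma * x_i) / 2, 1 / 2).
        theta = [rng.gauss((yi + gamma * xi) / 2, math.sqrt(0.5))
                 for yi, xi in zip(y, x)]
        # gamma | theta, x under the N(0, 100) prior.
        prec = sum(xi * xi for xi in x) + 0.01
        mean = sum(xi * ti for xi, ti in zip(x, theta)) / prec
        gamma = rng.gauss(mean, 1 / math.sqrt(prec))
        if it >= burn_in:
            draws.append(gamma)
    return draws


def simulate(n=150, gamma=2.0, p_missing=0.3, seed=0):
    """Toy data with a binary covariate that is missing completely at random."""
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n)]
    theta = [gamma * xi + rng.gauss(0, 1) for xi in x]
    y = [t + rng.gauss(0, 1) for t in theta]
    x_obs = [None if rng.random() < p_missing else xi for xi in x]
    return y, x_obs


y, x_obs = simulate()
draws = gibbs(y, x_obs)
print("posterior mean of gamma:", sum(draws) / len(draws))
```

The point of the sketch is the ordering inside the loop: missing covariates are redrawn in every iteration conditional on the current latent trait values, so the imputation uses information from the latent quantities rather than only from observed variables. The model described in the abstract additionally involves multiple groups and random intercepts, which this sketch omits.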

Suggested Citation

  • Christian Aßmann & Jean-Christoph Gaasch & Doris Stingl, 2023. "A Bayesian Approach Towards Missing Covariate Data in Multilevel Latent Regression Models," Psychometrika, Springer;The Psychometric Society, vol. 88(4), pages 1495-1528, December.
  • Handle: RePEc:spr:psycho:v:88:y:2023:i:4:d:10.1007_s11336-022-09888-0
    DOI: 10.1007/s11336-022-09888-0

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11336-022-09888-0
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Aßmann, Christian & Boysen-Hogrefe, Jens, 2011. "A Bayesian approach to model-based clustering for binary panel probit models," Computational Statistics & Data Analysis, Elsevier, vol. 55(1), pages 261-279, January.
    2. Michael Bergrab & Christian Aßmann, 2024. "Automated Bayesian variable selection methods for binary regression models with missing covariate data," AStA Wirtschafts- und Sozialstatistisches Archiv, Springer;Deutsche Statistische Gesellschaft - German Statistical Society, vol. 18(2), pages 203-244, June.
    3. Moura, Guilherme V. & Richard, Jean-François & Liesenfeld, Roman, 2007. "Dynamic Panel Probit Models for Current Account Reversals and their Efficient Estimation," Economics Working Papers 2007-11, Christian-Albrechts-University of Kiel, Department of Economics.
    4. Patil, Priyadarshan N. & Dubey, Subodh K. & Pinjari, Abdul R. & Cherchi, Elisabetta & Daziano, Ricardo & Bhat, Chandra R., 2017. "Simulation evaluation of emerging estimation techniques for multinomial probit models," Journal of choice modelling, Elsevier, vol. 23(C), pages 9-20.
    5. Zhehan Jiang & Jonathan Templin, 2019. "Gibbs Samplers for Logistic Item Response Models via the Pólya–Gamma Distribution: A Computationally Efficient Data-Augmentation Strategy," Psychometrika, Springer;The Psychometric Society, vol. 84(2), pages 358-374, June.
    6. Martin Burda & Roman Liesenfeld & Jean-Francois Richard, 2008. "Bayesian Analysis of a Probit Panel Data Model with Unobserved Individual Heterogeneity and Autocorrelated Errors," Working Papers tecipa-321, University of Toronto, Department of Economics.
    7. Roman Liesenfeld & Guilherme Valle Moura & Jean‐François Richard, 2010. "Determinants and Dynamics of Current Account Reversals: An Empirical Analysis," Oxford Bulletin of Economics and Statistics, Department of Economics, University of Oxford, vol. 72(4), pages 486-517, August.
    8. Wladimir Raymond & Pierre Mohnen & Franz Palm & Sybrand Schim van der Loeff, 2007. "The Behavior of the Maximum Likelihood Estimator of Dynamic Panel Data Sample Selection Models," CESifo Working Paper Series 1992, CESifo.
    9. Bauwens, L. & Galli, F., 2009. "Efficient importance sampling for ML estimation of SCD models," Computational Statistics & Data Analysis, Elsevier, vol. 53(6), pages 1974-1992, April.
    10. Mengheng Li & Siem Jan (S.J.) Koopman, 2018. "Unobserved Components with Stochastic Volatility in U.S. Inflation: Estimation and Signal Extraction," Tinbergen Institute Discussion Papers 18-027/III, Tinbergen Institute.
    11. Sanchez-Bueno, Maria J. & Usero, Belen, 2014. "How may the nature of family firms explain the decisions concerning international diversification?," Journal of Business Research, Elsevier, vol. 67(7), pages 1311-1320.
    12. Falk Bräuning & Siem Jan Koopman, 2016. "The dynamic factor network model with an application to global credit risk," Working Papers 16-13, Federal Reserve Bank of Boston.
    13. Chen, Qi & Vashishtha, Rahul, 2017. "The effects of bank mergers on corporate information disclosure," Journal of Accounting and Economics, Elsevier, vol. 64(1), pages 56-77.
    14. Mesters, G. & Koopman, S.J., 2014. "Generalized dynamic panel data models with random effects for cross-section and time," Journal of Econometrics, Elsevier, vol. 180(2), pages 127-140.
    15. Qian, Xuefeng & Tian, Bifei & Reed, W. Robert & Chen, Ziruo, 2018. "Searching for profit-shifting in China," Economics - The Open-Access, Open-Assessment E-Journal (2007-2020), Kiel Institute for the World Economy (IfW Kiel), vol. 12, pages 1-25.
    16. Blazsek, Szabolcs & Escribano, Alvaro, 2010. "Knowledge spillovers in US patents: A dynamic patent intensity model with secret common innovation factors," Journal of Econometrics, Elsevier, vol. 159(1), pages 14-32, November.
    17. Ozturk, Serda Selin & Demirer, Riza & Gupta, Rangan, 2022. "Climate uncertainty and carbon emissions prices: The relative roles of transition and physical climate risks," Economics Letters, Elsevier, vol. 217(C).
    18. Roman Liesenfeld & Guilherme V. Moura & Jean-François Richard & Hariharan Dharmarajan, 2013. "Efficient Likelihood Evaluation of State-Space Representations," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 80(2), pages 538-567.
    19. Steffen R. Henzel & Malte Rengel, 2017. "Dimensions Of Macroeconomic Uncertainty: A Common Factor Analysis," Economic Inquiry, Western Economic Association International, vol. 55(2), pages 843-877, April.
    20. Kunz, J.S. & Staub, K.E. & Winkelmann, R., 2018. "Predicting fixed effects in panel probit models," Health, Econometrics and Data Group (HEDG) Working Papers 18/23, HEDG, c/o Department of Economics, University of York.
