A likelihood-based boosting algorithm for factor analysis models with binary data

Authors
  • Battauz, Michela
  • Vidoni, Paolo

Abstract

Statistical boosting is a very effective method for fitting complex models while simultaneously performing variable selection and preventing overfitting. However, the available methods are not directly applicable to factor analysis models for binary data, since no gradient descent method can move away from the starting point with zero loadings. The proposed algorithm exploits the directions of negative curvature of the log-likelihood function and is therefore able to escape from regions of local non-convexity. The component-wise approach leads to a sparse solution, which facilitates interpretation without requiring a posterior rotation of the loadings. The method also regularizes the estimates, thereby reducing their mean square error. To lighten the computational burden of the inferential procedure, a suitable pseudolikelihood, called pairwise likelihood, is used. In addition, a group lasso penalty is considered in order to select the number of latent variables in the model automatically. The good performance of the proposal is illustrated through a simulation study and a real-data example.
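
The stall at zero loadings and the escape along a direction of negative curvature can be illustrated with a toy objective. The sketch below is not the authors' algorithm and does not use their pairwise likelihood. It assumes a hypothetical two-item, one-factor setting in which the implied association between the items is the product of the two loadings, and it uses a squared-error loss with an invented `TARGET` value in place of the negative log-likelihood. All function names are illustrative.

```python
import math

TARGET = 0.5  # hypothetical observed association between the two items


def loss(l1, l2):
    """Squared gap between the implied association l1*l2 and TARGET."""
    return (l1 * l2 - TARGET) ** 2


def gradient(l1, l2):
    """Analytic gradient of the toy loss."""
    r = l1 * l2 - TARGET
    return (2 * r * l2, 2 * r * l1)


def hessian(l1, l2):
    """Analytic Hessian of the toy loss as a 2x2 tuple of tuples."""
    r = l1 * l2 - TARGET
    off = 2 * r + 2 * l1 * l2
    return ((2 * l2 ** 2, off), (off, 2 * l1 ** 2))


def smallest_eigenpair(h):
    """Smallest eigenvalue and a unit eigenvector of a symmetric 2x2 matrix."""
    (a, b), (_, d) = h
    lam = (a + d) / 2 - math.hypot((a - d) / 2, b)
    if b == 0:
        vec = (1.0, 0.0) if a <= d else (0.0, 1.0)
    else:
        norm = math.hypot(b, lam - a)
        vec = (b / norm, (lam - a) / norm)
    return lam, vec


def escape_step(l1, l2, step=0.5):
    """Move along the eigenvector of the most negative curvature, if any.

    With a zero gradient either sign of the eigenvector lowers the loss
    to second order, so no sign correction is applied in this sketch.
    """
    lam, (v1, v2) = smallest_eigenpair(hessian(l1, l2))
    if lam >= 0:
        return l1, l2  # no negative curvature: stay put
    return l1 + step * v1, l2 + step * v2


start = (0.0, 0.0)
moved = escape_step(*start)
print("gradient at zero loadings:", gradient(*start))
print("loss before and after the step:", loss(*start), loss(*moved))
```

At zero loadings the gradient vanishes, so a gradient step leaves the point unchanged, while the Hessian has a negative eigenvalue whose eigenvector gives a direction along which the toy loss decreases. The fixed step length of 0.5 is an arbitrary choice for the illustration.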

Suggested Citation

  • Battauz, Michela & Vidoni, Paolo, 2022. "A likelihood-based boosting algorithm for factor analysis models with binary data," Computational Statistics & Data Analysis, Elsevier, vol. 168(C).
  • Handle: RePEc:eee:csdana:v:168:y:2022:i:c:s0167947321002462
    DOI: 10.1016/j.csda.2021.107412

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947321002462
    Download Restriction: Full text for ScienceDirect subscribers only.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Myrsini Katsikatsou & Irini Moustaki, 2016. "Pairwise Likelihood Ratio Tests and Model Selection Criteria for Structural Equation Models with Ordinal Variables," Psychometrika, Springer;The Psychometric Society, vol. 81(4), pages 1046-1068, December.
    2. Nuo Xi & Michael W. Browne, 2014. "Contributions to the Underlying Bivariate Normal Method for Factor Analyzing Ordinal Data," Journal of Educational and Behavioral Statistics, vol. 39(6), pages 583-611, December.
    3. Papageorgiou, Ioulia & Moustaki, Irini, 2019. "Sampling of pairs in pairwise likelihood estimation for latent variable models with categorical observed variables," LSE Research Online Documents on Economics 87592, London School of Economics and Political Science, LSE Library.
    4. Christopher J. Urban & Daniel J. Bauer, 2021. "A Deep Learning Algorithm for High-Dimensional Exploratory Item Factor Analysis," Psychometrika, Springer;The Psychometric Society, vol. 86(1), pages 1-29, March.
    5. Yunxiao Chen & Xiaoou Li & Siliang Zhang, 2019. "Joint Maximum Likelihood Estimation for High-Dimensional Exploratory Item Factor Analysis," Psychometrika, Springer;The Psychometric Society, vol. 84(1), pages 124-146, March.
    6. Zhang, Haoran & Chen, Yunxiao & Li, Xiaoou, 2020. "A note on exploratory item factor analysis by singular value decomposition," LSE Research Online Documents on Economics 104166, London School of Economics and Political Science, LSE Library.
    7. Li Cai, 2010. "Metropolis-Hastings Robbins-Monro Algorithm for Confirmatory Item Factor Analysis," Journal of Educational and Behavioral Statistics, vol. 35(3), pages 307-335, June.
    8. Haoran Zhang & Yunxiao Chen & Xiaoou Li, 2020. "A Note on Exploratory Item Factor Analysis by Singular Value Decomposition," Psychometrika, Springer;The Psychometric Society, vol. 85(2), pages 358-372, June.
    9. Siliang Zhang & Yunxiao Chen, 2022. "Computation for Latent Variable Model Estimation: A Unified Stochastic Proximal Framework," Psychometrika, Springer;The Psychometric Society, vol. 87(4), pages 1473-1502, December.
    10. Zhang, Siliang & Chen, Yunxiao, 2022. "Computation for latent variable model estimation: a unified stochastic proximal framework," LSE Research Online Documents on Economics 114489, London School of Economics and Political Science, LSE Library.
    11. Katsikatsou, Myrsini & Moustaki, Irini & Yang-Wallentin, Fan & Jöreskog, Karl G., 2012. "Pairwise likelihood estimation for factor analysis models with ordinal data," Computational Statistics & Data Analysis, Elsevier, vol. 56(12), pages 4243-4258.
    12. Monia Ranalli & Roberto Rocci, 2017. "A Model-Based Approach to Simultaneous Clustering and Dimensional Reduction of Ordinal Data," Psychometrika, Springer;The Psychometric Society, vol. 82(4), pages 1007-1034, December.
    13. Vassilis Vasdekis & Silvia Cagnone & Irini Moustaki, 2012. "A Composite Likelihood Inference in Latent Variable Models for Ordinal Longitudinal Responses," Psychometrika, Springer;The Psychometric Society, vol. 77(3), pages 425-441, July.
    14. Björn Andersson & Tao Xin, 2021. "Estimation of Latent Regression Item Response Theory Models Using a Second-Order Laplace Approximation," Journal of Educational and Behavioral Statistics, vol. 46(2), pages 244-265, April.
    15. Michael Edwards, 2010. "A Markov Chain Monte Carlo Approach to Confirmatory Item Factor Analysis," Psychometrika, Springer;The Psychometric Society, vol. 75(3), pages 474-497, September.
    16. Bhat, Chandra R. & Sener, Ipek N. & Eluru, Naveen, 2010. "A flexible spatially dependent discrete choice model: Formulation and application to teenagers' weekday recreational activity participation," Transportation Research Part B: Methodological, Elsevier, vol. 44(8-9), pages 903-921, September.
    17. Yoav Bergner & Peter Halpin & Jill-Jênn Vie, 2022. "Multidimensional Item Response Theory in the Style of Collaborative Filtering," Psychometrika, Springer;The Psychometric Society, vol. 87(1), pages 266-288, March.
    18. Yang Liu, 2020. "A Riemannian Optimization Algorithm for Joint Maximum Likelihood Estimation of High-Dimensional Exploratory Item Factor Analysis," Psychometrika, Springer;The Psychometric Society, vol. 85(2), pages 439-468, June.
    19. Gregory Camilli & Jean-Paul Fox, 2015. "An Aggregate IRT Procedure for Exploratory Factor Analysis," Journal of Educational and Behavioral Statistics, vol. 40(4), pages 377-401, August.
    20. Li Cai, 2010. "A Two-Tier Full-Information Item Factor Analysis Model with Applications," Psychometrika, Springer;The Psychometric Society, vol. 75(4), pages 581-612, December.