Printed from https://ideas.repec.org/a/eee/csdana/v158y2021ics0167947321000141.html

Variable selection in finite mixture of regression models with an unknown number of components

Author

Listed:
  • Lee, Kuo-Jung
  • Feldkircher, Martin
  • Chen, Yi-Chi

Abstract

A Bayesian framework for finite mixture models to deal with model selection and the selection of the number of mixture components simultaneously is presented. For that purpose, a feasible reversible jump Markov Chain Monte Carlo algorithm is proposed to model each component as a sparse regression model. This approach is made robust to outliers by using a prior that induces heavy tails and works well under multicollinearity and with high-dimensional data. Finally, the framework is applied to cross-sectional data investigating early warning indicators. The results reveal two distinct country groups for which estimated effects of vulnerability indicators vary considerably.
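
For orientation only, a finite mixture of regressions with $K$ components is commonly written as below. The notation is an assumption of this note and is not taken from the article, whose exact likelihood and priors (including the heavy-tailed specification mentioned in the abstract) may differ.

```latex
f(y_i \mid x_i) \;=\; \sum_{k=1}^{K} \pi_k \,
  \mathcal{N}\!\left(y_i \,\middle|\, x_i^{\top}\beta_k,\ \sigma_k^{2}\right),
\qquad \pi_k \ge 0, \qquad \sum_{k=1}^{K} \pi_k = 1 .
```

In this generic form, $\pi_k$ is the weight of component $k$, $\beta_k$ its regression coefficients, and $\sigma_k^2$ its error variance. Variable selection corresponds to many entries of each $\beta_k$ being exactly zero, and the setting described in the abstract treats $K$ itself as unknown.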

Suggested Citation

  • Lee, Kuo-Jung & Feldkircher, Martin & Chen, Yi-Chi, 2021. "Variable selection in finite mixture of regression models with an unknown number of components," Computational Statistics & Data Analysis, Elsevier, vol. 158(C).
  • Handle: RePEc:eee:csdana:v:158:y:2021:i:c:s0167947321000141
    DOI: 10.1016/j.csda.2021.107180

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947321000141
    Download Restriction: Full text for ScienceDirect subscribers only.




    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Qingguo Tang & R. J. Karunamuni, 2018. "Robust variable selection for finite mixture regression models," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 70(3), pages 489-521, June.
    2. Yan Li & Chun Yu & Yize Zhao & Weixin Yao & Robert H. Aseltine & Kun Chen, 2022. "Pursuing sources of heterogeneity in modeling clustered population," Biometrics, The International Biometric Society, vol. 78(2), pages 716-729, June.
    3. Howard D. Bondell & Brian J. Reich, 2012. "Consistent High-Dimensional Bayesian Variable Selection via Penalized Credible Regions," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 107(500), pages 1610-1624, December.
    4. Mark F. J. Steel, 2020. "Model Averaging and Its Use in Economics," Journal of Economic Literature, American Economic Association, vol. 58(3), pages 644-719, September.
    5. Schneider Ulrike & Wagner Martin, 2012. "Catching Growth Determinants with the Adaptive Lasso," German Economic Review, De Gruyter, vol. 13(1), pages 71-85, February.
    6. Ping Zeng & Yongyue Wei & Yang Zhao & Jin Liu & Liya Liu & Ruyang Zhang & Jianwei Gou & Shuiping Huang & Feng Chen, 2014. "Variable selection approach for zero-inflated count data via adaptive lasso," Journal of Applied Statistics, Taylor & Francis Journals, vol. 41(4), pages 879-894, April.
    7. Posch, Konstantin & Arbeiter, Maximilian & Pilz, Juergen, 2020. "A novel Bayesian approach for variable selection in linear regression models," Computational Statistics & Data Analysis, Elsevier, vol. 144(C).
    8. Tutz, Gerhard & Pößnecker, Wolfgang & Uhlmann, Lorenz, 2015. "Variable selection in general multinomial logit models," Computational Statistics & Data Analysis, Elsevier, vol. 82(C), pages 207-222.
    9. Margherita Giuzio, 2017. "Genetic algorithm versus classical methods in sparse index tracking," Decisions in Economics and Finance, Springer;Associazione per la Matematica, vol. 40(1), pages 243-256, November.
    10. Yize Zhao & Matthias Chung & Brent A. Johnson & Carlos S. Moreno & Qi Long, 2016. "Hierarchical Feature Selection Incorporating Known and Novel Biological Information: Identifying Genomic Features Related to Prostate Cancer Recurrence," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(516), pages 1427-1439, October.
    11. Gareth M. James & Peter Radchenko & Jinchi Lv, 2009. "DASSO: connections between the Dantzig selector and lasso," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 71(1), pages 127-142, January.
    12. Umberto Amato & Anestis Antoniadis & Italia De Feis & Irene Gijbels, 2021. "Penalised robust estimators for sparse and high-dimensional linear models," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 30(1), pages 1-48, March.
    13. Camila Epprecht & Dominique Guegan & Álvaro Veiga & Joel Correa da Rosa, 2017. "Variable selection and forecasting via automated methods for linear models: LASSO/adaLASSO and Autometrics," Post-Print halshs-00917797, HAL.
    14. Wang, Christina Dan & Chen, Zhao & Lian, Yimin & Chen, Min, 2022. "Asset selection based on high frequency Sharpe ratio," Journal of Econometrics, Elsevier, vol. 227(1), pages 168-188.
    15. Peter Bühlmann & Jacopo Mandozzi, 2014. "High-dimensional variable screening and bias in subsequent inference, with an empirical comparison," Computational Statistics, Springer, vol. 29(3), pages 407-430, June.
    16. Peter Martey Addo & Dominique Guegan & Bertrand Hassani, 2018. "Credit Risk Analysis Using Machine and Deep Learning Models," Risks, MDPI, vol. 6(2), pages 1-20, April.
    17. Capanu, Marinela & Giurcanu, Mihai & Begg, Colin B. & Gönen, Mithat, 2023. "Subsampling based variable selection for generalized linear models," Computational Statistics & Data Analysis, Elsevier, vol. 184(C).
    18. Weng, Jiaying, 2022. "Fourier transform sparse inverse regression estimators for sufficient variable selection," Computational Statistics & Data Analysis, Elsevier, vol. 168(C).
    19. Ander Wilson & Brian J. Reich, 2014. "Confounder selection via penalized credible regions," Biometrics, The International Biometric Society, vol. 70(4), pages 852-861, December.
    20. Loann David Denis Desboulets, 2018. "A Review on Variable Selection in Regression Analysis," Econometrics, MDPI, vol. 6(4), pages 1-27, November.
