
Robust fitting of mixture regression models

Authors
  • Bai, Xiuqin
  • Yao, Weixin
  • Boyer, John E.

Abstract

The existing methods for fitting mixture regression models assume a normal distribution for error and then estimate the regression parameters by the maximum likelihood estimate (MLE). In this article, we demonstrate that the MLE, like the least squares estimate, is sensitive to outliers and heavy-tailed error distributions. We propose a robust estimation procedure and an EM-type algorithm to estimate the mixture regression models. Using a Monte Carlo simulation study, we demonstrate that the proposed new estimation method is robust and works much better than the MLE when there are outliers or the error distribution has heavy tails. In addition, the proposed robust method works comparably to the MLE when there are no outliers and the error is normal. A real data application is used to illustrate the success of the proposed robust estimation procedure.
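The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the abstract does not state which robust criterion they use, so this sketch assumes Tukey bisquare weights in the M-step and a weighted median-absolute-residual scale. The function names (`bisquare_weight`, `weighted_median`, `em_mixreg`), the tuning constant 4.685, the starting values, and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def bisquare_weight(u, c=4.685):
    """Tukey bisquare weights: near 1 for small |u|, exactly 0 for |u| >= c."""
    return np.where(np.abs(u) < c, (1.0 - (u / c) ** 2) ** 2, 0.0)

def weighted_median(v, w):
    """Value of v at which the cumulative weight first reaches half the total."""
    order = np.argsort(v)
    cw = np.cumsum(w[order])
    return v[order][np.searchsorted(cw, 0.5 * cw[-1])]

def em_mixreg(X, y, beta0, n_iter=50, robust=True):
    """EM-type fit of a K-component mixture of linear regressions.

    X: (n, p) design matrix, y: (n,) response, beta0: (K, p) starting values.
    robust=False gives the usual normal-error EM (weighted least squares M-step).
    """
    beta = np.array(beta0, dtype=float)
    K = beta.shape[0]
    pi = np.full(K, 1.0 / K)
    # Assumed initial scale: robust spread of residuals to the nearest starting line.
    nearest = np.abs(y[:, None] - X @ beta.T).min(axis=1)
    sigma = np.full(K, max(1.4826 * np.median(nearest), 1e-6))
    for _ in range(n_iter):
        # E-step: posterior membership probabilities under normal errors
        # (the constant sqrt(2*pi) cancels in the ratio).
        resid = y[:, None] - X @ beta.T                      # shape (n, K)
        dens = pi * np.exp(-0.5 * (resid / sigma) ** 2) / sigma
        post = dens / np.maximum(dens.sum(axis=1, keepdims=True), 1e-300)
        # M-step: weighted least squares, optionally downweighting large residuals.
        for k in range(K):
            w = post[:, k]
            if robust:
                w = w * bisquare_weight(resid[:, k] / sigma[k])
            XtW = X.T * w
            beta[k] = np.linalg.solve(XtW @ X, XtW @ y)
            r = y - X @ beta[k]
            if robust:
                s = 1.4826 * weighted_median(np.abs(r), post[:, k])
            else:
                s = np.sqrt(np.sum(w * r ** 2) / np.sum(w))
            sigma[k] = max(s, 1e-6)
        pi = post.sum(axis=0) / post.sum()
    return pi, beta, sigma

# Simulated example (assumed setup): two lines plus ten gross outliers.
rng = np.random.default_rng(0)
n = 100
x = rng.uniform(0, 3, 2 * n)
y = np.where(np.arange(2 * n) < n, 5 + x, -5 - x) + rng.normal(0, 0.5, 2 * n)
x = np.append(x, rng.uniform(0, 3, 10))
y = np.append(y, rng.uniform(30, 50, 10))
X = np.column_stack([np.ones_like(x), x])
start = [[4.0, 0.5], [-4.0, -0.5]]

_, beta_robust, _ = em_mixreg(X, y, start, robust=True)
_, beta_mle, _ = em_mixreg(X, y, start, robust=False)
print("robust fit:\n", beta_robust)
print("normal-error fit:\n", beta_mle)
```

With `robust=False` the outliers enter the M-step with full weight and pull the upper line away from its true coefficients (5, 1). With `robust=True` the bisquare weights set their contribution to zero once the scale estimate has settled. A redescending weight is used here because a monotone weight such as Huber's would still leave the outliers some influence.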

Suggested Citation

  • Bai, Xiuqin & Yao, Weixin & Boyer, John E., 2012. "Robust fitting of mixture regression models," Computational Statistics & Data Analysis, Elsevier, vol. 56(7), pages 2347-2359.
  • Handle: RePEc:eee:csdana:v:56:y:2012:i:7:p:2347-2359
    DOI: 10.1016/j.csda.2012.01.016

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947312000369
    Download Restriction: Full text for ScienceDirect subscribers only.



    Citations


    Cited by:

    1. Yao, Weixin & Wei, Yan & Yu, Chun, 2014. "Robust mixture regression using the t-distribution," Computational Statistics & Data Analysis, Elsevier, vol. 71(C), pages 116-127.
    2. Gabriele Perrone & Gabriele Soffritti, 2023. "Seemingly unrelated clusterwise linear regression for contaminated data," Statistical Papers, Springer, vol. 64(3), pages 883-921, June.
    3. Gustavo Alexis Sabillón & Luiz Gabriel Fernandes Cotrim & Daiane Aparecida Zuanetti, 2023. "A data-driven reversible jump for estimating a finite mixture of regression models," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 32(1), pages 350-369, March.
    4. Song, Weixing & Yao, Weixin & Xing, Yanru, 2014. "Robust mixture regression model fitting by Laplace distribution," Computational Statistics & Data Analysis, Elsevier, vol. 71(C), pages 128-137.
    5. Luca Greco & Antonio Lucadamo & Claudio Agostinelli, 2021. "Weighted likelihood latent class linear regression," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 30(2), pages 711-746, June.
    6. Andrea Cerioli & Domenico Perrotta, 2014. "Robust clustering around regression lines with high density regions," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 8(1), pages 5-26, March.
    7. Maruotti, Antonello & Punzo, Antonio, 2017. "Model-based time-varying clustering of multivariate longitudinal data with covariates and outliers," Computational Statistics & Data Analysis, Elsevier, vol. 113(C), pages 475-496.
    8. Li, Xiongya & Bai, Xiuqin & Song, Weixing, 2017. "Robust mixture multivariate linear regression by multivariate Laplace distribution," Statistics & Probability Letters, Elsevier, vol. 130(C), pages 32-39.
    9. Sugasawa, Shonosuke & Kobayashi, Genya, 2022. "Robust fitting of mixture models using weighted complete estimating equations," Computational Statistics & Data Analysis, Elsevier, vol. 174(C).
    10. Inmaculada Martinez‐Zarzoso & Antonello Maruotti, 2013. "The environmental Kuznets curve: functional form, time‐varying heterogeneity and outliers in a panel setting," Environmetrics, John Wiley & Sons, Ltd., vol. 24(7), pages 461-475, November.
    11. P Alquier & M Gerber, 2024. "Universal robust regression via maximum mean discrepancy," Biometrika, Biometrika Trust, vol. 111(1), pages 71-92.
    12. Chun Yu & Weixin Yao & Guangren Yang, 2020. "A Selective Overview and Comparison of Robust Mixture Regression Estimators," International Statistical Review, International Statistical Institute, vol. 88(1), pages 176-202, April.
    13. Angelo Mazza & Antonio Punzo, 2020. "Mixtures of multivariate contaminated normal regression models," Statistical Papers, Springer, vol. 61(2), pages 787-822, April.
    14. Wu, Qiang & Yao, Weixin, 2016. "Mixtures of quantile regressions," Computational Statistics & Data Analysis, Elsevier, vol. 93(C), pages 162-176.
    15. Yuzhu Tian & Manlai Tang & Maozai Tian, 2016. "A class of finite mixture of quantile regressions with its applications," Journal of Applied Statistics, Taylor & Francis Journals, vol. 43(7), pages 1240-1252, July.
    16. Marilena Furno, 2023. "Computing Finite Mixture Estimators in the Tails," Journal of Classification, Springer;The Classification Society, vol. 40(2), pages 267-297, July.
    17. Luca Greco, 2022. "Robust fitting of mixtures of GLMs by weighted likelihood," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 106(1), pages 25-48, March.
    18. Meng Li & Sijia Xiang & Weixin Yao, 2016. "Robust estimation of the number of components for mixtures of linear regression models," Computational Statistics, Springer, vol. 31(4), pages 1539-1555, December.
    19. Sangkon Oh & Byungtae Seo, 2023. "Merging Components in Linear Gaussian Cluster-Weighted Models," Journal of Classification, Springer;The Classification Society, vol. 40(1), pages 25-51, April.
    20. Atefeh Zarei & Zahra Khodadadi & Mohsen Maleki & Karim Zare, 2023. "Robust mixture regression modeling based on two-piece scale mixtures of normal distributions," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 17(1), pages 181-210, March.
    21. Shi, Jianhong & Chen, Kun & Song, Weixing, 2014. "Robust errors-in-variables linear regression via Laplace distribution," Statistics & Probability Letters, Elsevier, vol. 84(C), pages 113-120.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Yao, Weixin & Wei, Yan & Yu, Chun, 2014. "Robust mixture regression using the t-distribution," Computational Statistics & Data Analysis, Elsevier, vol. 71(C), pages 116-127.
    2. Antonio Punzo & Paul. D. McNicholas, 2017. "Robust Clustering in Regression Analysis via the Contaminated Gaussian Cluster-Weighted Model," Journal of Classification, Springer;The Classification Society, vol. 34(2), pages 249-293, July.
    3. Chun Yu & Weixin Yao & Guangren Yang, 2020. "A Selective Overview and Comparison of Robust Mixture Regression Estimators," International Statistical Review, International Statistical Institute, vol. 88(1), pages 176-202, April.
    4. Hu, Hao & Yao, Weixin & Wu, Yichao, 2017. "The robust EM-type algorithms for log-concave mixtures of regression models," Computational Statistics & Data Analysis, Elsevier, vol. 111(C), pages 14-26.
    5. Angelo Mazza & Antonio Punzo, 2020. "Mixtures of multivariate contaminated normal regression models," Statistical Papers, Springer, vol. 61(2), pages 787-822, April.
    6. Luca Greco, 2022. "Robust fitting of mixtures of GLMs by weighted likelihood," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 106(1), pages 25-48, March.
    7. García-Escudero, L.A. & Gordaliza, A. & Mayo-Iscar, A. & San Martín, R., 2010. "Robust clusterwise linear regression through trimming," Computational Statistics & Data Analysis, Elsevier, vol. 54(12), pages 3057-3069, December.
    8. Francesca Torti & Domenico Perrotta & Marco Riani & Andrea Cerioli, 2019. "Assessing trimming methodologies for clustering linear regression data," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 13(1), pages 227-257, March.
    9. Luis García-Escudero & Alfonso Gordaliza & Carlos Matrán & Agustín Mayo-Iscar, 2010. "A review of robust clustering methods," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 4(2), pages 89-109, September.
    10. Luca Greco & Antonio Lucadamo & Claudio Agostinelli, 2021. "Weighted likelihood latent class linear regression," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 30(2), pages 711-746, June.
    11. Xue, Jiacheng & Yao, Weixin, 2022. "Machine Learning Embedded Semiparametric Mixtures of Regressions with Covariate-Varying Mixing Proportions," Econometrics and Statistics, Elsevier, vol. 22(C), pages 159-171.
    12. Meng Li & Sijia Xiang & Weixin Yao, 2016. "Robust estimation of the number of components for mixtures of linear regression models," Computational Statistics, Springer, vol. 31(4), pages 1539-1555, December.
    13. Chalabi, Yohan & Wuertz, Diethelm, 2010. "Weighted trimmed likelihood estimator for GARCH models," MPRA Paper 26536, University Library of Munich, Germany.
    14. Lu, Xiaosun & Huang, Yangxin & Zhu, Yiliang, 2016. "Finite mixture of nonlinear mixed-effects joint models in the presence of missing and mismeasured covariate, with application to AIDS studies," Computational Statistics & Data Analysis, Elsevier, vol. 93(C), pages 119-130.
    15. Murray, Paula M. & Browne, Ryan P. & McNicholas, Paul D., 2017. "Hidden truncation hyperbolic distributions, finite mixtures thereof, and their application for clustering," Journal of Multivariate Analysis, Elsevier, vol. 161(C), pages 141-156.
    16. Neykov, N. & Filzmoser, P. & Dimova, R. & Neytchev, P., 2007. "Robust fitting of mixtures using the trimmed likelihood estimator," Computational Statistics & Data Analysis, Elsevier, vol. 52(1), pages 299-308, September.
    17. Jia-Chiun Pan & Guan-Hua Huang, 2014. "Bayesian Inferences of Latent Class Models with an Unknown Number of Classes," Psychometrika, Springer;The Psychometric Society, vol. 79(4), pages 621-646, October.
    18. Andrea Cerioli & Domenico Perrotta, 2014. "Robust clustering around regression lines with high density regions," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 8(1), pages 5-26, March.
    19. Sijia Xiang & Weixin Yao, 2020. "Semiparametric mixtures of regressions with single-index for model based clustering," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 14(2), pages 261-292, June.
    20. Sijia Xiang & Weixin Yao, 2018. "Semiparametric mixtures of nonparametric regressions," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 70(1), pages 131-154, February.
