
A Novel EM-Type Algorithm to Estimate Semi-Parametric Mixtures of Partially Linear Models

Author

Listed:
  • Sphiwe B. Skhosana

    (Department of Statistics, University of Pretoria, Pretoria 0002, South Africa)

  • Salomon M. Millard

    (Department of Statistics, University of Pretoria, Pretoria 0002, South Africa)

  • Frans H. J. Kanfer

    (Department of Statistics, University of Pretoria, Pretoria 0002, South Africa)

Abstract

Semi- and non-parametric mixtures of normal regression models are a flexible class of mixtures of regression models. These models assume that the component mixing proportions, regression functions and/or variances are non-parametric functions of the covariates. Within this class, semi-parametric mixtures of partially linear models (SPMPLMs) combine the interpretability of a parametric model with the flexibility of a non-parametric model. However, local-likelihood estimation of the non-parametric term poses a computational challenge. Traditional EM optimisation of the local-likelihood functions is not appropriate because of the label-switching problem: applying the EM algorithm separately to each local-likelihood function will likely result in non-smooth function estimates, because the local responsibilities calculated at the E-step of each local EM are not guaranteed to be aligned. To prevent this, the EM algorithm must be modified so that the same (global) responsibilities are used at each local M-step. In this paper, we propose a one-step backfitting EM-type algorithm to estimate the SPMPLMs and effectively address the label-switching problem. The proposed algorithm estimates the non-parametric term using each set of local responsibilities in turn and then incorporates a smoothing step to obtain the smoothest estimate. In addition, to reduce the computational burden imposed by the use of the partial-residuals estimator of the parametric term, we propose a plug-in estimator. The performance and practical usefulness of the proposed methods were tested using a simulated dataset and two real datasets, respectively. Our finite-sample analysis revealed that the proposed methods are effective at solving the label-switching problem and produce sensible, interpretable results in a reasonable amount of time.
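
The idea of reusing one set of global responsibilities at every local M-step can be illustrated with a short Python sketch. This is not the authors' one-step backfitting algorithm: it assumes a simplified model with no parametric (linear) term, constant mixing proportions and variances, a Gaussian kernel, and a local-constant fit, and it omits the smoothing step. The names `global_responsibility_em`, `m_init`, and `bandwidth` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)


def normal_pdf(y, mean, sd):
    """Density of N(mean, sd^2) evaluated at y (broadcasts over arrays)."""
    return np.exp(-0.5 * ((y - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def global_responsibility_em(t, y, m_init, bandwidth=0.1, n_iter=50):
    """EM-type fit of y = m_k(t) + noise for K components (illustrative only).

    t, y   : arrays of shape (n,)
    m_init : array of shape (n, K), initial component means at each t_i
    Returns (m, pi, sigma, r), where m[i, k] estimates m_k(t_i).
    """
    n, K = m_init.shape
    m = m_init.astype(float).copy()
    pi = np.full(K, 1.0 / K)
    sigma = np.full(K, float(np.std(y)))
    # W[i, j] is the kernel weight of observation j in the local fit at t_i.
    W = gaussian_kernel((t[None, :] - t[:, None]) / bandwidth) / bandwidth
    for _ in range(n_iter):
        # E-step: a single (global) set of responsibilities at the data points.
        dens = pi[None, :] * normal_pdf(y[:, None], m, sigma[None, :])
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: every local fit reuses the same r, so the component labels
        # cannot be permuted from one local point to the next.
        m = (W @ (r * y[:, None])) / (W @ r)
        pi = r.mean(axis=0)
        resid2 = (y[:, None] - m) ** 2
        sigma = np.sqrt((r * resid2).sum(axis=0) / r.sum(axis=0))
        sigma = np.maximum(sigma, 1e-8)  # guard against a collapsing variance
    return m, pi, sigma, r
```

Calling `global_responsibility_em(t, y, m_init)` with `m_init` holding a rough starting guess for each component returns the fitted curves at the observed `t` values. Because the responsibilities `r` are computed once per iteration and shared by all local fits, column `k` of `m` refers to the same component at every `t`, which is the alignment property the abstract describes.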

Suggested Citation

  • Sphiwe B. Skhosana & Salomon M. Millard & Frans H. J. Kanfer, 2023. "A Novel EM-Type Algorithm to Estimate Semi-Parametric Mixtures of Partially Linear Models," Mathematics, MDPI, vol. 11(5), pages 1-20, February.
  • Handle: RePEc:gam:jmathe:v:11:y:2023:i:5:p:1087-:d:1076407

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/11/5/1087/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/11/5/1087/
    Download Restriction: no


