
Machine Learning Embedded Semiparametric Mixtures of Regressions with Covariate-Varying Mixing Proportions

Author

Listed:
  • Xue, Jiacheng
  • Yao, Weixin

Abstract

A new class of semiparametric mixture regression models with covariate-varying mixing proportions is introduced by embedding machine learning methods into mixtures of regressions. The new method uses a neural network to estimate the mixing proportions nonparametrically while using maximum likelihood to estimate all other component parameters. The new machine learning embedded semiparametric mixture regression models offer more flexible estimation than traditional parametric mixture regression models. More importantly, the new hybrid method can estimate the effects of multivariate covariates nonparametrically better than traditional kernel regression methods, which suffer from the well-known “curse of dimensionality”. The hybrid idea can be easily extended to other semiparametric statistical models and other machine learning methods. Simulation studies and a real data application are used to demonstrate the effectiveness of the proposed method and to compare it with some existing methods.
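
The hybrid idea described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm or code. It assumes an EM-type loop in which the Gaussian linear component parameters are updated by weighted least squares (the maximum likelihood update given the responsibilities) and the mixing proportions are produced by a one-hidden-layer softmax network trained by plain gradient descent. The function name `fit_mixture`, the network size, the learning rate, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np

def softmax(z):
    # Row-wise softmax with the usual max-shift for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fit_mixture(X, y, n_comp=2, n_hidden=8, em_iters=30, nn_steps=20,
                lr=0.1, seed=0):
    """Illustrative EM-type fit (assumed setup, not the published method):
    y | x, component c ~ N(beta_c' [1, x], sigma_c^2), P(component c | x) = pi_c(x),
    with pi(x) given by a small neural network."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    Xd = np.column_stack([np.ones(n), X])          # design matrix with intercept
    # Hypothetical one-hidden-layer network for the mixing proportions.
    W1 = rng.normal(0, 0.5, (d, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(0, 0.5, (n_hidden, n_comp)); b2 = np.zeros(n_comp)
    resp = rng.dirichlet(np.ones(n_comp), size=n)  # random initial responsibilities
    beta = np.zeros((n_comp, d + 1))
    sigma = np.ones(n_comp)

    def pi_hat(Xnew):
        # Estimated mixing proportions at new covariate values.
        return softmax(np.tanh(Xnew @ W1 + b1) @ W2 + b2)

    for _ in range(em_iters):
        # M-step, component parameters: weighted least squares and variance.
        for c in range(n_comp):
            w = resp[:, c]
            A = Xd.T @ (w[:, None] * Xd) + 1e-8 * np.eye(d + 1)
            beta[c] = np.linalg.solve(A, Xd.T @ (w * y))
            r = y - Xd @ beta[c]
            sigma[c] = np.sqrt((w * r**2).sum() / (w.sum() + 1e-12) + 1e-8)
        # M-step, mixing proportions: gradient steps on the cross-entropy
        # between the network output and the current responsibilities.
        for _ in range(nn_steps):
            H = np.tanh(X @ W1 + b1)
            P = softmax(H @ W2 + b2)
            G = (P - resp) / n                     # gradient w.r.t. the logits
            GH = (G @ W2.T) * (1 - H**2)           # backpropagate through tanh
            W2 -= lr * (H.T @ G); b2 -= lr * G.sum(axis=0)
            W1 -= lr * (X.T @ GH); b1 -= lr * GH.sum(axis=0)
        # E-step: posterior component probabilities for each observation.
        mu = Xd @ beta.T
        logd = (-0.5 * ((y[:, None] - mu) / sigma) ** 2
                - np.log(sigma) - 0.5 * np.log(2 * np.pi))
        logp = np.log(pi_hat(X) + 1e-12) + logd
        logp -= logp.max(axis=1, keepdims=True)
        resp = np.exp(logp)
        resp /= resp.sum(axis=1, keepdims=True)
    return beta, sigma, pi_hat

# Simulated example (made-up data): two regression lines whose mixing
# proportion depends nonlinearly on a two-dimensional covariate.
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(300, 2))
p1 = 1 / (1 + np.exp(-3 * X[:, 0] * X[:, 1]))
z = rng.random(300) < p1
y = np.where(z, 1 + 2 * X[:, 0], -1 - 2 * X[:, 0]) + 0.3 * rng.normal(size=300)
beta, sigma, pi_hat = fit_mixture(X, y)
```

Because the network takes the whole covariate vector as input, the same code applies unchanged as the number of covariates grows, which is the contrast with kernel smoothing that the abstract draws. As with any mixture fit, the component labels are arbitrary and the result can depend on the starting values.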

Suggested Citation

  • Xue, Jiacheng & Yao, Weixin, 2022. "Machine Learning Embedded Semiparametric Mixtures of Regressions with Covariate-Varying Mixing Proportions," Econometrics and Statistics, Elsevier, vol. 22(C), pages 159-171.
  • Handle: RePEc:eee:ecosta:v:22:y:2022:i:c:p:159-171
    DOI: 10.1016/j.ecosta.2021.10.018

    References listed on IDEAS

    1. Benaglia, Tatiana & Chauveau, Didier & Hunter, David R. & Young, Derek S., 2009. "mixtools: An R Package for Analyzing Mixture Models," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 32(i06).
    2. Young, D.S. & Hunter, D.R., 2010. "Mixtures of regressions with predictor-dependent mixing proportions," Computational Statistics & Data Analysis, Elsevier, vol. 54(10), pages 2253-2266, October.
    3. Sijia Xiang & Weixin Yao, 2020. "Semiparametric mixtures of regressions with single-index for model based clustering," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 14(2), pages 261-292, June.
    4. Wang, Shaoli & Yao, Weixin & Huang, Mian, 2014. "A note on the identifiability of nonparametric and semiparametric mixtures of GLMs," Statistics & Probability Letters, Elsevier, vol. 93(C), pages 41-45.
    5. Sijia Xiang & Weixin Yao, 2018. "Semiparametric mixtures of nonparametric regressions," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 70(1), pages 131-154, February.
    6. Matthew Stephens, 2000. "Dealing with label switching in mixture models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 62(4), pages 795-809.
    7. Mian Huang & Weixin Yao, 2012. "Mixture of Regression Models With Varying Mixing Proportions: A Semiparametric Approach," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 107(498), pages 711-724, June.
    8. Mian Huang & Runze Li & Hansheng Wang & Weixin Yao, 2014. "Estimating Mixture of Gaussian Processes by Kernel Smoothing," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 32(2), pages 259-270, April.
    9. Harrison, David Jr. & Rubinfeld, Daniel L., 1978. "Hedonic housing prices and the demand for clean air," Journal of Environmental Economics and Management, Elsevier, vol. 5(1), pages 81-102, March.
    10. Goldfeld, Stephen M. & Quandt, Richard E., 1973. "A Markov model for switching regressions," Journal of Econometrics, Elsevier, vol. 1(1), pages 3-15, March.
    11. Mian Huang & Weixin Yao & Shaoli Wang & Yixin Chen, 2018. "Statistical Inference and Applications of Mixture of Varying Coefficient Models," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 45(3), pages 618-643, September.
    12. Yao, Weixin & Lindsay, Bruce G., 2009. "Bayesian Mixture Labeling by Highest Posterior Density," Journal of the American Statistical Association, American Statistical Association, vol. 104(486), pages 758-767.
    13. Mirfarah, Elham & Naderi, Mehrdad & Chen, Ding-Geng, 2021. "Mixture of linear experts model for censored data: A novel approach with scale-mixture of normal distributions," Computational Statistics & Data Analysis, Elsevier, vol. 158(C).

    Citations


    Cited by:

    1. Sphiwe B. Skhosana & Salomon M. Millard & Frans H. J. Kanfer, 2023. "A Novel EM-Type Algorithm to Estimate Semi-Parametric Mixtures of Partially Linear Models," Mathematics, MDPI, vol. 11(5), pages 1-20, February.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Sijia Xiang & Weixin Yao, 2020. "Semiparametric mixtures of regressions with single-index for model based clustering," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 14(2), pages 261-292, June.
    2. Sijia Xiang & Weixin Yao, 2018. "Semiparametric mixtures of nonparametric regressions," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 70(1), pages 131-154, February.
    3. Yao, Weixin & Wei, Yan & Yu, Chun, 2014. "Robust mixture regression using the t-distribution," Computational Statistics & Data Analysis, Elsevier, vol. 71(C), pages 116-127.
    4. Sphiwe B. Skhosana & Salomon M. Millard & Frans H. J. Kanfer, 2023. "A Novel EM-Type Algorithm to Estimate Semi-Parametric Mixtures of Partially Linear Models," Mathematics, MDPI, vol. 11(5), pages 1-20, February.
    5. Lu, Xiaosun & Huang, Yangxin & Zhu, Yiliang, 2016. "Finite mixture of nonlinear mixed-effects joint models in the presence of missing and mismeasured covariate, with application to AIDS studies," Computational Statistics & Data Analysis, Elsevier, vol. 93(C), pages 119-130.
    6. Hu, Hao & Yao, Weixin & Wu, Yichao, 2017. "The robust EM-type algorithms for log-concave mixtures of regression models," Computational Statistics & Data Analysis, Elsevier, vol. 111(C), pages 14-26.
    7. Yanyuan Ma & Shaoli Wang & Lin Xu & Weixin Yao, 2021. "Semiparametric mixture regression with unspecified error distributions," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 30(2), pages 429-444, June.
    8. Hu, Hao & Wu, Yichao & Yao, Weixin, 2016. "Maximum likelihood estimation of the mixture of log-concave densities," Computational Statistics & Data Analysis, Elsevier, vol. 101(C), pages 137-147.
    9. Marco Berrettini & Giuliano Galimberti & Saverio Ranciati, 2023. "Semiparametric finite mixture of regression models with Bayesian P-splines," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 17(3), pages 745-775, September.
    10. Chun Yu & Weixin Yao & Guangren Yang, 2020. "A Selective Overview and Comparison of Robust Mixture Regression Estimators," International Statistical Review, International Statistical Institute, vol. 88(1), pages 176-202, April.
    11. Naderi, Mehrdad & Mirfarah, Elham & Wang, Wan-Lun & Lin, Tsung-I, 2023. "Robust mixture regression modeling based on the normal mean-variance mixture distributions," Computational Statistics & Data Analysis, Elsevier, vol. 180(C).
    12. Wan-Lun Wang, 2019. "Mixture of multivariate t nonlinear mixed models for multiple longitudinal data with heterogeneity and missing values," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 28(1), pages 196-222, March.
    13. Ye, Mao & Lu, Zhao-Hua & Li, Yimei & Song, Xinyuan, 2019. "Finite mixture of varying coefficient model: Estimation and component selection," Journal of Multivariate Analysis, Elsevier, vol. 171(C), pages 452-474.
    14. Meng Li & Sijia Xiang & Weixin Yao, 2016. "Robust estimation of the number of components for mixtures of linear regression models," Computational Statistics, Springer, vol. 31(4), pages 1539-1555, December.
    15. Murray, Paula M. & Browne, Ryan P. & McNicholas, Paul D., 2017. "Hidden truncation hyperbolic distributions, finite mixtures thereof, and their application for clustering," Journal of Multivariate Analysis, Elsevier, vol. 161(C), pages 141-156.
    16. Bai, Xiuqin & Yao, Weixin & Boyer, John E., 2012. "Robust fitting of mixture regression models," Computational Statistics & Data Analysis, Elsevier, vol. 56(7), pages 2347-2359.
    17. Jia-Chiun Pan & Guan-Hua Huang, 2014. "Bayesian Inferences of Latent Class Models with an Unknown Number of Classes," Psychometrika, Springer;The Psychometric Society, vol. 79(4), pages 621-646, October.
    18. Gustavo Alexis Sabillón & Luiz Gabriel Fernandes Cotrim & Daiane Aparecida Zuanetti, 2023. "A data-driven reversible jump for estimating a finite mixture of regression models," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 32(1), pages 350-369, March.
    19. Aßmann, Christian & Boysen-Hogrefe, Jens, 2011. "A Bayesian approach to model-based clustering for binary panel probit models," Computational Statistics & Data Analysis, Elsevier, vol. 55(1), pages 261-279, January.
    20. Antonio Punzo & Paul. D. McNicholas, 2017. "Robust Clustering in Regression Analysis via the Contaminated Gaussian Cluster-Weighted Model," Journal of Classification, Springer;The Classification Society, vol. 34(2), pages 249-293, July.
