Semiparametric finite mixture of regression models with Bayesian P-splines

Authors

  • Marco Berrettini (University of Bologna)
  • Giuliano Galimberti (University of Bologna)
  • Saverio Ranciati (University of Bologna)

Abstract

Mixture models provide a useful tool to account for unobserved heterogeneity and are at the basis of many model-based clustering methods. To gain additional flexibility, some model parameters can be expressed as functions of concomitant covariates. In this paper, a semiparametric finite mixture of regression models is defined, with concomitant information assumed to influence both the component weights and the conditional means. In particular, linear predictors are replaced with smooth functions of the covariate considered by resorting to cubic splines. An estimation procedure within the Bayesian paradigm is suggested, where smoothness of the covariate effects is controlled by suitable choices for the prior distributions of the spline coefficients. A data augmentation scheme based on difference random utility models is exploited to describe the mixture weights as functions of the covariate. The performance of the proposed methodology is investigated via simulation experiments and two real-world datasets, one about baseball salaries and the other concerning nitrogen oxide in engine exhaust.
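
The model structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes Gaussian components, multinomial-logit weights with the last component as baseline, equally spaced knots, and a second-order difference penalty on the spline coefficients, none of which the abstract states. All function names are hypothetical. The sketch only evaluates the conditional density and the roughness penalty; the Bayesian sampler and the data augmentation step are not shown.

```python
import math

def equally_spaced_knots(lo, hi, n_intervals, degree=3):
    # Knots extend `degree` steps beyond [lo, hi] (assumption: P-spline style grid).
    h = (hi - lo) / n_intervals
    return [lo + (i - degree) * h for i in range(n_intervals + 2 * degree + 1)]

def bspline_basis(x, knots, degree=3):
    # Cox-de Boor recursion: returns the len(knots) - degree - 1 basis values at x.
    basis = [1.0 if knots[j] <= x < knots[j + 1] else 0.0
             for j in range(len(knots) - 1)]
    for d in range(1, degree + 1):
        new = []
        for j in range(len(knots) - d - 1):
            left_den = knots[j + d] - knots[j]
            right_den = knots[j + d + 1] - knots[j + 1]
            left = (x - knots[j]) / left_den * basis[j] if left_den > 0 else 0.0
            right = ((knots[j + d + 1] - x) / right_den * basis[j + 1]
                     if right_den > 0 else 0.0)
            new.append(left + right)
        basis = new
    return basis

def smooth(x, coefs, knots, degree=3):
    # Smooth covariate effect: linear combination of B-spline basis functions.
    return sum(b * c for b, c in zip(bspline_basis(x, knots, degree), coefs))

def difference_penalty(coefs, order=2):
    # Sum of squared order-th differences of adjacent coefficients.
    # A Gaussian random-walk prior on the coefficients has log-density
    # proportional to minus this quantity (assumed prior form).
    diffs = list(coefs)
    for _ in range(order):
        diffs = [b - a for a, b in zip(diffs, diffs[1:])]
    return sum(d * d for d in diffs)

def mixture_density(y, x, mean_coefs, weight_coefs, sigmas, knots, degree=3):
    # Weights: softmax of smooth predictors; the last component is the
    # baseline with predictor fixed at 0 (assumed parameterisation).
    etas = [smooth(x, c, knots, degree) for c in weight_coefs] + [0.0]
    top = max(etas)
    expd = [math.exp(e - top) for e in etas]
    total = sum(expd)
    weights = [e / total for e in expd]
    density = 0.0
    for w, c, s in zip(weights, mean_coefs, sigmas):
        mu = smooth(x, c, knots, degree)  # component-specific conditional mean
        density += w * math.exp(-0.5 * ((y - mu) / s) ** 2) / (s * math.sqrt(2 * math.pi))
    return density
```

For a two-component model on a covariate in [0, 1], one would build `knots = equally_spaced_knots(0.0, 1.0, 5)`, supply two coefficient vectors for the means and one for the weights (each of length 8 here), and call `mixture_density(y, x, ...)`. Larger values of `difference_penalty` correspond to rougher curves, which is the quantity a smoothness prior would shrink.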

Suggested Citation

  • Marco Berrettini & Giuliano Galimberti & Saverio Ranciati, 2023. "Semiparametric finite mixture of regression models with Bayesian P-splines," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 17(3), pages 745-775, September.
  • Handle: RePEc:spr:advdac:v:17:y:2023:i:3:d:10.1007_s11634-022-00523-5
    DOI: 10.1007/s11634-022-00523-5

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11634-022-00523-5
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Sijia Xiang & Weixin Yao, 2020. "Semiparametric mixtures of regressions with single-index for model based clustering," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 14(2), pages 261-292, June.
    2. Keefe Murphy & Thomas Brendan Murphy, 2020. "Gaussian parsimonious clustering models with covariates and a noise component," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 14(2), pages 293-325, June.
    3. Li, Feng & Kang, Yanfei, 2018. "Improving forecasting performance using covariate-dependent copula models," International Journal of Forecasting, Elsevier, vol. 34(3), pages 456-476.
    4. Ye, Mao & Lu, Zhao-Hua & Li, Yimei & Song, Xinyuan, 2019. "Finite mixture of varying coefficient model: Estimation and component selection," Journal of Multivariate Analysis, Elsevier, vol. 171(C), pages 452-474.
    5. Sphiwe B. Skhosana & Salomon M. Millard & Frans H. J. Kanfer, 2023. "A Novel EM-Type Algorithm to Estimate Semi-Parametric Mixtures of Partially Linear Models," Mathematics, MDPI, vol. 11(5), pages 1-20, February.
    6. Xue, Jiacheng & Yao, Weixin, 2022. "Machine Learning Embedded Semiparametric Mixtures of Regressions with Covariate-Varying Mixing Proportions," Econometrics and Statistics, Elsevier, vol. 22(C), pages 159-171.
    7. Yuan Fang & Dimitris Karlis & Sanjeena Subedi, 2022. "Infinite Mixtures of Multivariate Normal-Inverse Gaussian Distributions for Clustering of Skewed Data," Journal of Classification, Springer;The Classification Society, vol. 39(3), pages 510-552, November.
    8. Villani, Mattias & Kohn, Robert & Nott, David J., 2012. "Generalized smooth finite mixtures," Journal of Econometrics, Elsevier, vol. 171(2), pages 121-133.
    9. Gabriele Perrone & Gabriele Soffritti, 2023. "Seemingly unrelated clusterwise linear regression for contaminated data," Statistical Papers, Springer, vol. 64(3), pages 883-921, June.
    10. Sijia Xiang & Weixin Yao, 2018. "Semiparametric mixtures of nonparametric regressions," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 70(1), pages 131-154, February.
    11. Gustavo Alexis Sabillón & Luiz Gabriel Fernandes Cotrim & Daiane Aparecida Zuanetti, 2023. "A data-driven reversible jump for estimating a finite mixture of regression models," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 32(1), pages 350-369, March.
    12. Ang Shan & Fengkai Yang, 2021. "Bayesian Inference for Finite Mixture Regression Model Based on Non-Iterative Algorithm," Mathematics, MDPI, vol. 9(6), pages 1-13, March.
    13. Yanyuan Ma & Shaoli Wang & Lin Xu & Weixin Yao, 2021. "Semiparametric mixture regression with unspecified error distributions," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 30(2), pages 429-444, June.
    14. Yuzhu Tian & Manlai Tang & Maozai Tian, 2016. "A class of finite mixture of quantile regressions with its applications," Journal of Applied Statistics, Taylor & Francis Journals, vol. 43(7), pages 1240-1252, July.
    15. Roy Costilla & Ivy Liu & Richard Arnold & Daniel Fernández, 2019. "Bayesian model-based clustering for longitudinal ordinal data," Computational Statistics, Springer, vol. 34(3), pages 1015-1038, September.
    16. Saverio Ranciati & Giuliano Galimberti & Gabriele Soffritti, 2019. "Bayesian variable selection in linear regression models with non-normal errors," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 28(2), pages 323-358, June.
    17. Tsionas, Mike G. & Izzeldin, Marwan & Trapani, Lorenzo, 2022. "Estimation of large dimensional time varying VARs using copulas," European Economic Review, Elsevier, vol. 141(C).
    18. Sanjeena Subedi & Paul D. McNicholas, 2021. "A Variational Approximations-DIC Rubric for Parameter Estimation and Mixture Model Selection Within a Family Setting," Journal of Classification, Springer;The Classification Society, vol. 38(1), pages 89-108, April.
    19. Wang, Shaoli & Yao, Weixin & Huang, Mian, 2014. "A note on the identifiability of nonparametric and semiparametric mixtures of GLMs," Statistics & Probability Letters, Elsevier, vol. 93(C), pages 41-45.
    20. Naderi, Mehrdad & Mirfarah, Elham & Wang, Wan-Lun & Lin, Tsung-I, 2023. "Robust mixture regression modeling based on the normal mean-variance mixture distributions," Computational Statistics & Data Analysis, Elsevier, vol. 180(C).
