Printed from https://ideas.repec.org/a/spr/psycho/v82y2017i1d10.1007_s11336-016-9522-0.html

Principal Covariates Clusterwise Regression (PCCR): Accounting for Multicollinearity and Population Heterogeneity in Hierarchically Organized Data

Author

Listed:
  • Tom Frans Wilderjans

    (Leiden University; KU Leuven)

  • Eva Vande Gaer

    (KU Leuven)

  • Henk A. L. Kiers

    (University of Groningen)

  • Iven Van Mechelen

    (KU Leuven)

  • Eva Ceulemans

    (KU Leuven)

Abstract

In the behavioral sciences, many research questions pertain to a regression problem in that one wants to predict a criterion on the basis of a number of predictors. Although ordinary least squares regression will suffice in many cases, the prediction problem is sometimes more challenging, for three reasons. First, multiple highly collinear predictors can be available, making it difficult to grasp their mutual relations as well as their relations to the criterion. In that case, it may be very useful to reduce the predictors to a few summary variables, on which one regresses the criterion and which at the same time yield insight into the predictor structure. Second, the population under study may consist of a few unknown subgroups that are characterized by different regression models. Third, the obtained data are often hierarchically structured, with, for instance, observations nested within persons, or participants nested within groups or countries. Although some methods have been developed that partially meet these challenges (i.e., principal covariates regression (PCovR), clusterwise regression (CR), and structural equation models), none of these methods adequately deals with all of them simultaneously. To fill this gap, we propose the principal covariates clusterwise regression (PCCR) method, which combines the key ideas behind PCovR (de Jong & Kiers in Chemom Intell Lab Syst 14(1–3):155–164, 1992) and CR (Späth in Computing 22(4):367–373, 1979). The PCCR method is validated by means of a simulation study and by applying it to cross-cultural data regarding satisfaction with life.
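To make the three ingredients named in the abstract concrete, the sketch below strings them together in a simplified two-step form: collinear predictors are first reduced to a few principal component scores, and the higher-level units (e.g., countries) are then partitioned into clusters, each with its own regression of the criterion on those scores. This is not the PCCR algorithm of the paper, which combines reduction and clusterwise regression in a single method; plain PCA stands in for PCovR here, and the function names, the random start, and the alternating update scheme are all illustrative assumptions.

```python
import numpy as np

def component_scores(X, n_components):
    """Principal component scores of column-centered X, computed via SVD.
    Assumption: plain PCA is used as a stand-in for PCovR."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * s[:n_components]

def clusterwise_regression(T, y, unit, n_clusters, n_iter=50, seed=0):
    """Alternate between (1) fitting one OLS model per cluster and
    (2) moving each higher-level unit to the cluster whose model gives
    the smallest squared error over that unit's observations.
    Hypothetical helper; a single random start, so local minima are possible."""
    rng = np.random.default_rng(seed)
    units = np.unique(unit)
    D = np.column_stack([np.ones(len(y)), T])   # intercept + component scores
    labels = rng.integers(n_clusters, size=len(units))
    for _ in range(n_iter):
        B = np.zeros((n_clusters, D.shape[1]))  # tiny clusters keep zero weights
        for k in range(n_clusters):
            rows = np.isin(unit, units[labels == k])
            if rows.sum() >= D.shape[1]:
                B[k] = np.linalg.lstsq(D[rows], y[rows], rcond=None)[0]
        sq_res = (y[:, None] - D @ B.T) ** 2    # squared residual per cluster model
        new_labels = np.array(
            [sq_res[unit == u].sum(axis=0).argmin() for u in units]
        )
        if np.array_equal(new_labels, labels):  # partition is stable: stop
            break
        labels = new_labels
    return labels, B
```

Here `unit` is an integer array giving the higher-level unit of each observation, so that whole units, rather than single observations, are assigned to clusters. In practice one would rerun the procedure from many random starts and keep the partition with the lowest total squared error.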

Suggested Citation

  • Tom Frans Wilderjans & Eva Vande Gaer & Henk A. L. Kiers & Iven Van Mechelen & Eva Ceulemans, 2017. "Principal Covariates Clusterwise Regression (PCCR): Accounting for Multicollinearity and Population Heterogeneity in Hierarchically Organized Data," Psychometrika, Springer;The Psychometric Society, vol. 82(1), pages 86-111, March.
  • Handle: RePEc:spr:psycho:v:82:y:2017:i:1:d:10.1007_s11336-016-9522-0
    DOI: 10.1007/s11336-016-9522-0


    References listed on IDEAS

    1. Henry Kaiser, 1958. "The varimax criterion for analytic rotation in factor analysis," Psychometrika, Springer;The Psychometric Society, vol. 23(3), pages 187-200, September.
    2. Leisch, Friedrich, 2004. "FlexMix: A General Framework for Finite Mixture Models and Latent Class Regression in R," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 11(i08).
    3. Marko Sarstedt & Christian Ringle, 2010. "Treating unobserved heterogeneity in PLS path modeling: a comparison of FIMIX-PLS with different data analysis strategies," Journal of Applied Statistics, Taylor & Francis Journals, vol. 37(8), pages 1299-1318.
    4. Carsten Hahn & Michael D. Johnson & Andreas Herrmann & Frank Huber, 2002. "Capturing Customer Heterogeneity Using A Finite Mixture Pls Approach," Schmalenbach Business Review (sbr), LMU Munich School of Management, vol. 54(3), pages 243-269, July.
    5. Eva Ceulemans & Iven Mechelen, 2008. "CLASSI: A classification model for the study of sequential processes and individual differences therein," Psychometrika, Springer;The Psychometric Society, vol. 73(1), pages 107-124, March.
    6. Wayne DeSarbo & Richard Oliver & Arvind Rangaswamy, 1989. "A simulated annealing methodology for clusterwise linear regression," Psychometrika, Springer;The Psychometric Society, vol. 54(4), pages 707-736, September.
    7. Lawrence Hubert & Phipps Arabie, 1985. "Comparing partitions," Journal of Classification, Springer;The Classification Society, vol. 2(1), pages 193-218, December.
    8. Wayne DeSarbo & William Cron, 1988. "A maximum likelihood methodology for clusterwise linear regression," Journal of Classification, Springer;The Classification Society, vol. 5(2), pages 249-282, September.
    9. Michel Wedel & Wayne DeSarbo, 1995. "A mixture likelihood approach for generalized linear models," Journal of Classification, Springer;The Classification Society, vol. 12(1), pages 21-55, March.
    10. Wilderjans, Tom & Ceulemans, Eva & Van Mechelen, Iven, 2009. "Simultaneous analysis of coupled data blocks differing in size: A comparison of two weighting schemes," Computational Statistics & Data Analysis, Elsevier, vol. 53(4), pages 1086-1098, February.
    11. Jos Berge, 1977. "Orthogonal procrustes rotation for two or more matrices," Psychometrika, Springer;The Psychometric Society, vol. 42(2), pages 267-276, June.
    12. Bruce Korth & Ledyard Tucker, 1975. "The distribution of chance congruence coefficients from simulated data," Psychometrika, Springer;The Psychometric Society, vol. 40(3), pages 361-372, September.
    13. Henk Kiers & Jos Berge, 1992. "Minimization of a class of matrix trace functions by means of refined majorization," Psychometrika, Springer;The Psychometric Society, vol. 57(3), pages 371-382, September.
    14. Michael Brusco & J. Cradit, 2001. "A variable-selection heuristic for K-means clustering," Psychometrika, Springer;The Psychometric Society, vol. 66(2), pages 249-270, June.
    15. Eva Ceulemans & Iven Mechelen & Iwin Leenen, 2007. "The Local Minima Problem in Hierarchical Classes Analysis: An Evaluation of a Simulated Annealing Algorithm and Various Multistart Procedures," Psychometrika, Springer;The Psychometric Society, vol. 72(3), pages 377-391, September.
    16. Henk Kiers & Age Smilde, 2007. "A comparison of various methods for multivariate regression with highly collinear variables," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 16(2), pages 193-228, August.

    Citations



    Cited by:

    1. Xavier Bry & Ndèye Niang & Thomas Verron & Stéphanie Bougeard, 2023. "Clusterwise elastic-net regression based on a combined information criterion," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 17(1), pages 75-107, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jan-Michael Becker & Christian Ringle & Marko Sarstedt & Franziska Völckner, 2015. "How collinearity affects mixture regression results," Marketing Letters, Springer, vol. 26(4), pages 643-659, December.
    2. Wayne S. DeSarbo & Qian Chen & Ashley Stadler Blank, 2017. "A Parametric Constrained Segmentation Methodology for Application in Sport Marketing," Customer Needs and Solutions, Springer;Institute for Sustainable Innovation and Growth (iSIG), vol. 4(4), pages 37-55, December.
    3. Salvatore Ingrassia & Simona Minotti & Giorgio Vittadini, 2012. "Local Statistical Modeling via a Cluster-Weighted Approach with Elliptical Distributions," Journal of Classification, Springer;The Classification Society, vol. 29(3), pages 363-401, October.
    4. Salvatore Ingrassia & Antonio Punzo, 2020. "Cluster Validation for Mixtures of Regressions via the Total Sum of Squares Decomposition," Journal of Classification, Springer;The Classification Society, vol. 37(2), pages 526-547, July.
    5. Chen, Cathy W.S. & Chan, Jennifer S.K. & So, Mike K.P. & Lee, Kevin K.M., 2011. "Classification in segmented regression problems," Computational Statistics & Data Analysis, Elsevier, vol. 55(7), pages 2276-2287, July.
    6. Salvatore D. Tomarchio & Paul D. McNicholas & Antonio Punzo, 2021. "Matrix Normal Cluster-Weighted Models," Journal of Classification, Springer;The Classification Society, vol. 38(3), pages 556-575, October.
    7. Sanjeena Subedi & Antonio Punzo & Salvatore Ingrassia & Paul McNicholas, 2013. "Clustering and classification via cluster-weighted factor analyzers," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 7(1), pages 5-40, March.
    8. Réal Carbonneau & Gilles Caporossi & Pierre Hansen, 2014. "Globally Optimal Clusterwise Regression By Column Generation Enhanced with Heuristics, Sequencing and Ending Subset Optimization," Journal of Classification, Springer;The Classification Society, vol. 31(2), pages 219-241, July.
    9. Teague R. Henry & Kathleen M. Gates & Mitchell J. Prinstein & Douglas Steinley, 2020. "Modeling Heterogeneous Peer Assortment Effects Using Finite Mixture Exponential Random Graph Models," Psychometrika, Springer;The Psychometric Society, vol. 85(1), pages 8-34, March.
    10. Stéphanie Bougeard & Hervé Abdi & Gilbert Saporta & Ndèye Niang, 2018. "Clusterwise analysis for multiblock component methods," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 12(2), pages 285-313, June.
    11. Roberto Mari & Salvatore Ingrassia & Antonio Punzo, 2023. "Local and Overall Deviance R-Squared Measures for Mixtures of Generalized Linear Models," Journal of Classification, Springer;The Classification Society, vol. 40(2), pages 233-266, July.
    12. Michael Brusco & Hans-Friedrich Köhn, 2009. "Exemplar-Based Clustering via Simulated Annealing," Psychometrika, Springer;The Psychometric Society, vol. 74(3), pages 457-475, September.
    13. Joki, Kaisa & Bagirov, Adil M. & Karmitsa, Napsu & Mäkelä, Marko M. & Taheri, Sona, 2020. "Clusterwise support vector linear regression," European Journal of Operational Research, Elsevier, vol. 287(1), pages 19-35.
    14. Utkarsh J. Dang & Antonio Punzo & Paul D. McNicholas & Salvatore Ingrassia & Ryan P. Browne, 2017. "Multivariate Response and Parsimony for Gaussian Cluster-Weighted Models," Journal of Classification, Springer;The Classification Society, vol. 34(1), pages 4-34, April.
    15. Xavier Bry & Ndèye Niang & Thomas Verron & Stéphanie Bougeard, 2023. "Clusterwise elastic-net regression based on a combined information criterion," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 17(1), pages 75-107, March.
    16. Eva Vande Gaer & Eva Ceulemans & Iven Mechelen & Peter Kuppens, 2012. "The CLASSI-N Method for the Study of Sequential Processes," Psychometrika, Springer;The Psychometric Society, vol. 77(1), pages 85-105, January.
    17. Ana Oliveira-Brochado & Francisco Vitorino Martins, 2008. "Segmentação de Mercado e modelos mistura de regressão para variáveis normais," FEP Working Papers 262, Universidade do Porto, Faculdade de Economia do Porto.
    18. Carbonneau, Réal A. & Caporossi, Gilles & Hansen, Pierre, 2011. "Globally optimal clusterwise regression by mixed logical-quadratic programming," European Journal of Operational Research, Elsevier, vol. 212(1), pages 213-222, July.
    19. Hye Suk & Heungsun Hwang, 2010. "Regularized fuzzy clusterwise ridge regression," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 4(1), pages 35-51, April.
    20. Fordellone, Mario & Vichi, Maurizio, 2020. "Finding groups in structural equation modeling through the partial least squares algorithm," Computational Statistics & Data Analysis, Elsevier, vol. 147(C).
