Principal Covariates Clusterwise Regression (PCCR): Accounting for Multicollinearity and Population Heterogeneity in Hierarchically Organized Data


  • Tom Frans Wilderjans

    (Leiden University; KU Leuven)

  • Eva Gaer

    (KU Leuven)

  • Henk A. L. Kiers

    (University of Groningen)

  • Iven Mechelen

    (KU Leuven)

  • Eva Ceulemans

    (KU Leuven)


Abstract

In the behavioral sciences, many research questions pertain to a regression problem: one wants to predict a criterion on the basis of a number of predictors. Although ordinary least squares regression suffices in many cases, the prediction problem is sometimes more challenging, for three reasons. First, multiple highly collinear predictors may be available, making it difficult to grasp their mutual relations as well as their relations to the criterion. In that case, it may be very useful to reduce the predictors to a few summary variables, on which one regresses the criterion and which at the same time yield insight into the predictor structure. Second, the population under study may consist of a few unknown subgroups that are characterized by different regression models. Third, the obtained data are often hierarchically structured, with, for instance, observations nested within persons, or participants nested within groups or countries. Although some methods have been developed that partially meet these challenges (i.e., principal covariates regression (PCovR), clusterwise regression (CR), and structural equation models), none of these methods adequately deals with all of them simultaneously. To fill this gap, we propose the principal covariates clusterwise regression (PCCR) method, which combines the key ideas behind PCovR (de Jong & Kiers in Chemom Intell Lab Syst 14(1–3):155–164, 1992) and CR (Späth in Computing 22(4):367–373, 1979). The PCCR method is validated by means of a simulation study and by applying it to cross-cultural data regarding satisfaction with life.

Suggested Citation

  • Tom Frans Wilderjans & Eva Gaer & Henk A. L. Kiers & Iven Mechelen & Eva Ceulemans, 2017. "Principal Covariates Clusterwise Regression (PCCR): Accounting for Multicollinearity and Population Heterogeneity in Hierarchically Organized Data," Psychometrika, Springer;The Psychometric Society, vol. 82(1), pages 86-111, March.
  • Handle: RePEc:spr:psycho:v:82:y:2017:i:1:d:10.1007_s11336-016-9522-0
    DOI: 10.1007/s11336-016-9522-0



