
Finite-Mixture Structural Equation Models for Response-Based Segmentation and Unobserved Heterogeneity


  • Kamel Jedidi

    (Columbia University)

  • Harsharanjeet S. Jagpal

    (Rutgers University)

  • Wayne S. DeSarbo

    (Pennsylvania State University)


Two endemic problems face researchers in the social sciences (e.g., Marketing, Economics, Psychology, and Finance): unobserved heterogeneity and measurement error in data. Structural equation modeling is a powerful tool for dealing with these difficulties using a simultaneous equation framework with unobserved constructs and manifest indicators which are error-prone. When estimating structural equation models, however, researchers frequently treat the data as if they were collected from a single population (Muthén [Muthén, Bengt O. 1989. Latent variable modeling in heterogeneous populations. 557–585.]). This assumption of homogeneity is often unrealistic. For example, in multidimensional expectancy value models, consumers from different market segments can have different belief structures (Bagozzi [Bagozzi, Richard P. 1982. A field investigation of causal relations among cognitions, affect, intentions, and behavior. 562–584.]). Research in satisfaction suggests that consumer decision processes vary across segments (Day [Day, Ralph L. 1977. Extending the concept of consumer satisfaction. W. D. Perreault, ed. , Vol. 4. Association for Consumer Research, Atlanta, 149–154.]). This paper shows that aggregate analysis which ignores heterogeneity in structural equation models produces misleading results and that traditional fit statistics are not useful for detecting unobserved heterogeneity in the data. Furthermore, sequential analyses that first form groups using cluster analysis and then apply multigroup structural equation modeling are not satisfactory. We develop a general finite mixture structural equation model that simultaneously treats heterogeneity and forms market segments in the context of a specified model structure where all the observed variables are measured with error. The model is considerably more general than cluster analysis, multigroup confirmatory factor analysis, and multigroup structural equation modeling. 
In particular, the model subsumes several specialized models including finite mixture simultaneous equation models, finite mixture confirmatory factor analysis, and finite mixture second-order factor analysis. The finite mixture structural equation model should be of interest to academics in a wide range of disciplines (e.g., Consumer Behavior, Marketing, Economics, Finance, Psychology, and Sociology) where unobserved heterogeneity and measurement error are problematic. In addition, the model should be of interest to market researchers and product managers for two reasons. First, the model allows the manager to perform response-based segmentation using a consumer decision process model, while explicitly allowing for both measurement and structural error. Second, the model allows managers to detect unobserved moderating factors which account for heterogeneity. Once managers have identified the moderating factors, they can link segment membership to observable individual-level characteristics (e.g., socioeconomic and demographic variables) and improve marketing policy. We applied the finite mixture structural equation model to a direct marketing study of customer satisfaction and estimated a large model with 8 unobserved constructs and 23 manifest indicators. The results show that there are three consumer segments that vary considerably in terms of the importance they attach to the various dimensions of satisfaction. In contrast, aggregate analysis is misleading because it incorrectly suggests that except for price all dimensions of satisfaction are significant for all consumers. Methodologically, the finite mixture model is robust; that is, the parameter estimates are stable under double cross-validation and the method can be used to test large models. Furthermore, the double cross-validation results show that the finite mixture model is superior to sequential data analysis strategies in terms of goodness-of-fit and interpretability. 
We performed four simulation experiments to test the robustness of the algorithm using both recursive and nonrecursive model specifications. Specifically, we examined the robustness of different model selection criteria (e.g., CAIC, BIC, and GFI) in choosing the correct number of clusters for exactly identified and overidentified models assuming that the distributional form is correctly specified. We also examined the effect of distributional misspecification (i.e., departures from multivariate normality) on model performance. The results show that when the data are heterogeneous, the standard goodness-of-fit statistics for the aggregate model are not useful for detecting heterogeneity. Furthermore, parameter recovery is poor. For the finite mixture model, however, the BIC and CAIC criteria perform well in detecting heterogeneity and in identifying the true number of segments. In particular, parameter recovery for both the measurement and structural models is highly satisfactory. The finite mixture method is robust to distributional misspecification; in addition, the method significantly outperforms aggregate and sequential data analysis methods when the form of heterogeneity is misspecified (i.e., the true model has random coefficients). Researchers and practitioners should only use the mixture methodology when substantive theory supports the structural equation model, a priori segmentation is infeasible, and theory suggests that the data are heterogeneous and belong to a finite number of unobserved groups. We expect these conditions to hold in many social science applications and, in particular, market segmentation studies. Future research should focus on large-scale simulation studies to test the structural equation mixture model using a wide range of models and statistical distributions. Theoretical research should extend the model by allowing the mixing proportions to depend on prior information and/or subject-specific variables.
Finally, in order to provide a fuller treatment of heterogeneity, we need to develop a general random coefficient structural equation model. Such a model is presently unavailable in the statistical and psychometric literatures.
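
The abstract names BIC and CAIC as the criteria that performed well in identifying the number of segments. As a rough illustration of how such criteria are commonly applied, the sketch below scores a set of candidate solutions using the standard definitions BIC = -2 ln L + p ln N and CAIC = -2 ln L + p (ln N + 1), then picks the solution with the lowest score. This is not the authors' code: the function names are hypothetical, and the log-likelihoods, parameter counts, and sample size are made-up placeholder values, not results from the paper.

```python
import math


def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion: -2 ln L + p ln N (lower is better)."""
    return -2.0 * log_lik + n_params * math.log(n_obs)


def caic(log_lik, n_params, n_obs):
    """Consistent AIC: -2 ln L + p (ln N + 1) (lower is better)."""
    return -2.0 * log_lik + n_params * (math.log(n_obs) + 1.0)


def pick_segments(fits, n_obs, criterion=bic):
    """Return (best_k, scores) for the candidate segment counts in `fits`.

    `fits` maps a number of segments K to a tuple
    (maximized log-likelihood, number of free parameters).
    """
    scores = {k: criterion(ll, p, n_obs) for k, (ll, p) in fits.items()}
    best_k = min(scores, key=scores.get)
    return best_k, scores


# Placeholder inputs for illustration only (NOT taken from the paper):
# K -> (log-likelihood, free parameters) for hypothetical 1- to 4-segment fits.
example_fits = {
    1: (-5200.0, 20),
    2: (-5050.0, 41),
    3: (-4950.0, 62),
    4: (-4940.0, 83),
}
N_OBS = 500  # hypothetical sample size

best_bic, bic_scores = pick_segments(example_fits, N_OBS, criterion=bic)
best_caic, caic_scores = pick_segments(example_fits, N_OBS, criterion=caic)
```

Both criteria penalize the log-likelihood by the number of free parameters, so adding a segment is only rewarded when the gain in fit outweighs the extra parameters it introduces. CAIC applies a slightly heavier penalty per parameter than BIC, so it can favor fewer segments.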

Suggested Citation

  • Kamel Jedidi & Harsharanjeet S. Jagpal & Wayne S. DeSarbo, 1997. "Finite-Mixture Structural Equation Models for Response-Based Segmentation and Unobserved Heterogeneity," Marketing Science, INFORMS, vol. 16(1), pages 39-59.
  • Handle: RePEc:inm:ormksc:v:16:y:1997:i:1:p:39-59
