
A model for clustering data from heterogeneous dissimilarities

Author

Listed:
  • Santi, Éverton
  • Aloise, Daniel
  • Blanchard, Simon J.

Abstract

Clustering algorithms partition a set of n objects into p groups (called clusters), such that objects assigned to the same group are homogeneous according to some criterion. To derive these clusters, the required input is often a single n × n dissimilarity matrix. Yet in many applications more than one dissimilarity matrix is available, and so, to conform to model requirements, it is common practice to aggregate the matrices (e.g., by summing or averaging them). This aggregation practice results in clustering solutions that mask the true nature of the original data. In this paper we introduce a clustering model which, to handle the heterogeneity, uses all available dissimilarity matrices and identifies groups of individuals who cluster the objects in a similar way. The model is a nonconvex problem that is difficult to solve exactly, and we thus introduce a Variable Neighborhood Search heuristic to provide solutions efficiently. Computational experiments and an empirical application to perceptions of chocolate candy show that the heuristic algorithm is efficient and that the proposed model is suited to recovering heterogeneous data. Implications for clustering researchers are discussed.
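
The masking effect of aggregation mentioned in the abstract can be illustrated with a toy example. The sketch below is not the authors' model or heuristic; it assumes two hypothetical individuals who each sort four objects into two groups in different ways, 0/1 dissimilarities, and a simple sum of within-cluster dissimilarities as the clustering criterion. All function and variable names are made up for illustration.

```python
from itertools import combinations

def dissimilarity_from_partition(labels):
    """Build a 0/1 dissimilarity matrix: 0 if two objects share a group, else 1."""
    n = len(labels)
    return [[0.0 if labels[i] == labels[j] else 1.0 for j in range(n)]
            for i in range(n)]

def within_cluster_cost(d, labels):
    """Sum of dissimilarities over all pairs assigned to the same cluster."""
    pairs = combinations(range(len(labels)), 2)
    return sum(d[i][j] for i, j in pairs if labels[i] == labels[j])

# Two hypothetical individuals who group the same four objects differently.
partition_a = [0, 0, 1, 1]   # {0,1} and {2,3}
partition_b = [0, 1, 0, 1]   # {0,2} and {1,3}
individual_a = dissimilarity_from_partition(partition_a)
individual_b = dissimilarity_from_partition(partition_b)

# Common practice: average the matrices into a single n x n input.
n = 4
averaged = [[(individual_a[i][j] + individual_b[i][j]) / 2 for j in range(n)]
            for i in range(n)]

# Score every balanced two-cluster partition on the averaged matrix.
candidates = {
    "a": partition_a,
    "b": partition_b,
    "c": [0, 1, 1, 0],       # {0,3} and {1,2}
}
costs_on_average = {name: within_cluster_cost(averaged, labels)
                    for name, labels in candidates.items()}

print(costs_on_average)
print(within_cluster_cost(individual_a, partition_a),
      within_cluster_cost(individual_b, partition_b))
```

Under these assumptions each individual's own matrix admits a zero-cost partition, while on the averaged matrix the two individual partitions tie at a strictly positive cost, so the aggregate no longer reveals that two distinct, clean groupings exist.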

Suggested Citation

  • Santi, Éverton & Aloise, Daniel & Blanchard, Simon J., 2016. "A model for clustering data from heterogeneous dissimilarities," European Journal of Operational Research, Elsevier, vol. 253(3), pages 659-672.
  • Handle: RePEc:eee:ejores:v:253:y:2016:i:3:p:659-672
    DOI: 10.1016/j.ejor.2016.03.033

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0377221716301618
    Download Restriction: Full text for ScienceDirect subscribers only


    Citations

    Cited by:

    1. Huerta-Muñoz, Diana L. & Ríos-Mercado, Roger Z. & Ruiz, Rubén, 2017. "An iterated greedy heuristic for a market segmentation problem with multiple attributes," European Journal of Operational Research, Elsevier, vol. 261(1), pages 75-87.
