
Mixture-based clustering for the ordered stereotype model

Authors
  • Fernández, D.
  • Arnold, R.
  • Pledger, S.

Abstract

Many methods for reducing the dimensionality of data matrices are based on mathematical techniques such as distance-based algorithms or matrix decomposition and eigenvalues. Recently, a group of likelihood-based finite mixture models has been developed for data matrices with binary or count data, using basic Bernoulli or Poisson building blocks. This work extends that approach to establish likelihood-based multivariate methods for a data matrix with ordinal data, applying fuzzy clustering via finite mixtures to the ordered stereotype model. Model fitting is performed using the expectation–maximization (EM) algorithm, which yields a fuzzy allocation of rows, of columns, or of rows and columns simultaneously to their corresponding clusters. A simulation study covering a variety of scenarios is presented to test the reliability of the proposed model. Finally, results from applying the model to two real data sets are shown.
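As a rough illustration of the fuzzy allocation described in the abstract, the sketch below computes the E-step of an EM algorithm for row clustering only. It is not the authors' implementation. It assumes the usual stereotype parameterisation, log(P(y = k) / P(y = 1)) = mu_k + phi_k (alpha_r + beta_j), with mu_1 = 0 and 0 = phi_1 <= ... <= phi_q = 1, where alpha_r is a row-cluster effect and beta_j a column effect. The function names and all parameter values are hypothetical, and the M-step is omitted.

```python
import numpy as np

def stereotype_probs(mu, phi, eta):
    """Category probabilities P(Y = k), k = 1..q, for linear predictor eta.

    Assumed form: log(P(Y=k)/P(Y=1)) = mu[k] + phi[k] * eta.
    """
    logits = mu + phi * eta
    logits = logits - logits.max()      # subtract max for numerical stability
    p = np.exp(logits)
    return p / p.sum()

def e_step(Y, mu, phi, alpha, beta, pi):
    """Posterior probability that each row belongs to each row cluster.

    Y     : (n, m) integer array, categories coded 0..q-1
    alpha : (R,) row-cluster effects (hypothetical values)
    beta  : (m,) column effects
    pi    : (R,) mixing proportions
    """
    n, m = Y.shape
    R = len(pi)
    log_post = np.tile(np.log(pi), (n, 1))          # start from log prior, shape (n, R)
    for r in range(R):
        for j in range(m):
            p = stereotype_probs(mu, phi, alpha[r] + beta[j])
            log_post[:, r] += np.log(p[Y[:, j]])    # add log-likelihood of column j
    log_post -= log_post.max(axis=1, keepdims=True)
    z = np.exp(log_post)
    return z / z.sum(axis=1, keepdims=True)         # each row sums to 1

# Toy inputs, invented for illustration: q = 3 categories, R = 2 row clusters.
mu = np.array([0.0, 0.5, 0.2])
phi = np.array([0.0, 0.4, 1.0])      # non-decreasing, phi[0] = 0, phi[-1] = 1
alpha = np.array([-2.0, 2.0])
beta = np.zeros(4)
pi = np.array([0.5, 0.5])
Y = np.array([[0, 0, 0, 0],          # row answering the lowest category throughout
              [2, 2, 2, 2]])         # row answering the highest category throughout

z = e_step(Y, mu, phi, alpha, beta, pi)
print(z)
```

Each row of `z` is a probability vector over clusters rather than a hard label, which is what makes the allocation fuzzy. In this toy case the first row is assigned mostly to the cluster with the negative effect and the second row to the cluster with the positive effect.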

Suggested Citation

  • Fernández, D. & Arnold, R. & Pledger, S., 2016. "Mixture-based clustering for the ordered stereotype model," Computational Statistics & Data Analysis, Elsevier, vol. 93(C), pages 46-75.
  • Handle: RePEc:eee:csdana:v:93:y:2016:i:c:p:46-75
    DOI: 10.1016/j.csda.2014.11.004

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S016794731400317X
    Download Restriction: Full text for ScienceDirect subscribers only.


    References listed on IDEAS

    1. Kuss, Oliver, 2006. "On the estimation of the stereotype regression model," Computational Statistics & Data Analysis, Elsevier, vol. 50(8), pages 1877-1890, April.
    2. Rocci, Roberto & Vichi, Maurizio, 2008. "Two-mode multi-partitioning," Computational Statistics & Data Analysis, Elsevier, vol. 52(4), pages 1984-2003, January.
    3. Bohning, Dankmar & Seidel, Wilfried & Alfo, Marco & Garel, Bernard & Patilea, Valentin & Walther, Gunther, 2007. "Advances in Mixture Models," Computational Statistics & Data Analysis, Elsevier, vol. 51(11), pages 5205-5210, July.
    4. Ivy Liu & Alan Agresti, 2005. "The analysis of ordered categorical data: An overview and a survey of recent developments," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 14(1), pages 1-73, June.
    5. Shirley Pledger, 2000. "Unified Maximum Likelihood Estimates for Closed Capture–Recapture Models Using Mixtures," Biometrics, The International Biometric Society, vol. 56(2), pages 434-442, June.
    6. Hamparsum Bozdogan, 1987. "Model selection and Akaike's Information Criterion (AIC): The general theory and its analytical extensions," Psychometrika, Springer;The Psychometric Society, vol. 52(3), pages 345-370, September.
    7. Pledger, Shirley & Arnold, Richard, 2014. "Multivariate methods using mixtures: Correspondence analysis, scaling and pattern-detection," Computational Statistics & Data Analysis, Elsevier, vol. 71(C), pages 241-261.
    8. Richard Arnold & Yu Hayakawa & Paul Yip, 2010. "Capture–Recapture Estimation Using Finite Mixtures of Arbitrary Dimension," Biometrics, The International Biometric Society, vol. 66(2), pages 644-655, June.
    9. Stephen Johnson, 1967. "Hierarchical clustering schemes," Psychometrika, Springer;The Psychometric Society, vol. 32(3), pages 241-254, September.
    10. McQuarrie, Allan & Shumway, Robert & Tsai, Chih-Ling, 1997. "The model selection criterion AICu," Statistics & Probability Letters, Elsevier, vol. 34(3), pages 285-292, June.
    11. Christian Hennig & Tim F. Liao, 2013. "How to find an appropriate clustering for mixed-type variables with application to socio-economic stratification," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 62(3), pages 309-369, May.
    12. Wayne DeSarbo & Duncan Fong & John Liechty & M. Kim Saxton, 2004. "A hierarchical bayesian procedure for two-mode cluster analysis," Psychometrika, Springer;The Psychometric Society, vol. 69(4), pages 547-572, December.
    13. G. J. McLachlan, 1987. "On Bootstrapping the Likelihood Ratio Test Statistic for the Number of Components in a Normal Mixture," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 36(3), pages 318-324, November.

    Citations


    Cited by:

    1. Daniel Fernández & Radim J. Sram & Miroslav Dostal & Anna Pastorkova & Hans Gmuender & Hyunok Choi, 2018. "Modeling Unobserved Heterogeneity in Susceptibility to Ambient Benzo[a]pyrene Concentration among Children with Allergic Asthma Using an Unsupervised Learning Algorithm," IJERPH, MDPI, vol. 15(1), pages 1-18, January.
    2. Tatjana Miljkovic & Daniel Fernández, 2018. "On Two Mixture-Based Clustering Approaches Used in Modeling an Insurance Portfolio," Risks, MDPI, vol. 6(2), pages 1-18, May.
    3. Daniel Fernández & Richard Arnold & Shirley Pledger & Ivy Liu & Roy Costilla, 2019. "Finite mixture biclustering of discrete type multivariate data," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 13(1), pages 117-143, March.
    4. Roy Costilla & Ivy Liu & Richard Arnold & Daniel Fernández, 2019. "Bayesian model-based clustering for longitudinal ordinal data," Computational Statistics, Springer, vol. 34(3), pages 1015-1038, September.
    5. Daniel Fernández & Louise McMillan & Richard Arnold & Martin Spiess & Ivy Liu, 2022. "Goodness-of-Fit and Generalized Estimating Equation Methods for Ordinal Responses Based on the Stereotype Model," Stats, MDPI, vol. 5(2), pages 1-14, June.
    6. Christian Carmona & Luis Nieto-Barajas & Antonio Canale, 2019. "Model-based approach for household clustering with mixed scale variables," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 13(2), pages 559-583, June.
    7. Jacques, Julien & Biernacki, Christophe, 2018. "Model-based co-clustering for ordinal data," Computational Statistics & Data Analysis, Elsevier, vol. 123(C), pages 101-115.
    8. Álvarez de Toledo, Pablo & Núñez, Fernando & Usabiaga, Carlos, 2018. "Matching and clustering in square contingency tables. Who matches with whom in the Spanish labour market," Computational Statistics & Data Analysis, Elsevier, vol. 127(C), pages 135-159.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Daniel Fernández & Richard Arnold & Shirley Pledger & Ivy Liu & Roy Costilla, 2019. "Finite mixture biclustering of discrete type multivariate data," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 13(1), pages 117-143, March.
    2. Daniel Fernández & Radim J. Sram & Miroslav Dostal & Anna Pastorkova & Hans Gmuender & Hyunok Choi, 2018. "Modeling Unobserved Heterogeneity in Susceptibility to Ambient Benzo[a]pyrene Concentration among Children with Allergic Asthma Using an Unsupervised Learning Algorithm," IJERPH, MDPI, vol. 15(1), pages 1-18, January.
    3. Eleni Matechou & Ivy Liu & Daniel Fernández & Miguel Farias & Bergljot Gjelsvik, 2016. "Biclustering Models for Two-Mode Ordinal Data," Psychometrika, Springer;The Psychometric Society, vol. 81(3), pages 611-624, September.
    4. Álvarez de Toledo, Pablo & Núñez, Fernando & Usabiaga, Carlos, 2018. "Matching and clustering in square contingency tables. Who matches with whom in the Spanish labour market," Computational Statistics & Data Analysis, Elsevier, vol. 127(C), pages 135-159.
    5. Tatjana Miljkovic & Daniel Fernández, 2018. "On Two Mixture-Based Clustering Approaches Used in Modeling an Insurance Portfolio," Risks, MDPI, vol. 6(2), pages 1-18, May.
    6. Pledger, Shirley & Arnold, Richard, 2014. "Multivariate methods using mixtures: Correspondence analysis, scaling and pattern-detection," Computational Statistics & Data Analysis, Elsevier, vol. 71(C), pages 241-261.
    7. Salvatore Ingrassia & Antonio Punzo & Giorgio Vittadini & Simona Minotti, 2015. "Erratum to: The Generalized Linear Mixed Cluster-Weighted Model," Journal of Classification, Springer;The Classification Society, vol. 32(2), pages 327-355, July.
    8. Roy Costilla & Ivy Liu & Richard Arnold & Daniel Fernández, 2019. "Bayesian model-based clustering for longitudinal ordinal data," Computational Statistics, Springer, vol. 34(3), pages 1015-1038, September.
    9. Salvatore Ingrassia & Antonio Punzo & Giorgio Vittadini & Simona Minotti, 2015. "The Generalized Linear Mixed Cluster-Weighted Model," Journal of Classification, Springer;The Classification Society, vol. 32(1), pages 85-113, April.
    10. Michel Wedel & Wayne DeSarbo, 1995. "A mixture likelihood approach for generalized linear models," Journal of Classification, Springer;The Classification Society, vol. 12(1), pages 21-55, March.
    11. Simon Blanchard & Wayne DeSarbo, 2013. "A New Zero-Inflated Negative Binomial Methodology for Latent Category Identification," Psychometrika, Springer;The Psychometric Society, vol. 78(2), pages 322-340, April.
    12. Wayne DeSarbo & Venkatram Ramaswamy & Peter Lenk, 1993. "A latent class procedure for the structural analysis of two-way compositional data," Journal of Classification, Springer;The Classification Society, vol. 10(2), pages 159-193, December.
    13. Bocci, Laura & Vicari, Donatella & Vichi, Maurizio, 2006. "A mixture model for the classification of three-way proximity data," Computational Statistics & Data Analysis, Elsevier, vol. 50(7), pages 1625-1654, April.
    14. repec:dgr:rugsom:96b34 is not listed on IDEAS
    15. R. Scott Hacker & Abdulnasser Hatemi-J, 2021. "Model selection in time series analysis: using information criteria as an alternative to hypothesis testing," Journal of Economic Studies, Emerald Group Publishing Limited, vol. 49(6), pages 1055-1075, September.
    16. Morgan, Grant B. & Hodge, Kari J. & Baggett, Aaron R., 2016. "Latent profile analysis with nonnormal mixtures: A Monte Carlo examination of model selection using fit indices," Computational Statistics & Data Analysis, Elsevier, vol. 93(C), pages 146-161.
    17. Lu, Zhenqiu (Laura) & Zhang, Zhiyong, 2014. "Robust growth mixture models with non-ignorable missingness: Models, estimation, selection, and application," Computational Statistics & Data Analysis, Elsevier, vol. 71(C), pages 220-240.
    18. Wayne DeSarbo & Rabikar Chatterjee & Juyoung Kim, 1994. "Deriving ultrametric tree structures from proximity data confounded by differential stimulus familiarity," Psychometrika, Springer;The Psychometric Society, vol. 59(4), pages 527-566, December.
    19. Roy Levy & Gregory R. Hancock, 2011. "An Extended Model Comparison Framework for Covariance and Mean Structure Models, Accommodating Multiple Groups and Latent Mixtures," Sociological Methods & Research, vol. 40(2), pages 256-278, May.
    20. Xiaoqiong Fang & Andy W. Chen & Derek S. Young, 2023. "Predictors with measurement error in mixtures of polynomial regressions," Computational Statistics, Springer, vol. 38(1), pages 373-401, March.
    21. Martin Young & Wayne DeSarbo, 1995. "A parametric procedure for ultrametric tree estimation from conditional rank order proximity data," Psychometrika, Springer;The Psychometric Society, vol. 60(1), pages 47-75, March.
