IDEAS home Printed from https://ideas.repec.org/a/eee/csdana/v71y2014icp241-261.html

Multivariate methods using mixtures: Correspondence analysis, scaling and pattern-detection

Author

Listed:
  • Pledger, Shirley
  • Arnold, Richard

Abstract

Matrices of binary or count data are modelled under a unified statistical framework using finite mixtures to group the rows and/or columns. These likelihood-based one-mode and two-mode fuzzy clusterings provide maximum likelihood estimation of parameters and the options of using likelihood ratio tests or information criteria for model comparison. Geometric developments focused on pattern detection give likelihood-based analogues of various techniques in multivariate analysis, including multidimensional scaling, association analysis, ordination, correspondence analysis, and the construction of biplots. Illustrative examples demonstrate the effectiveness of these visualisations for identifying patterns of ecological significance (e.g. abrupt versus slow species turnover).
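To make the one-mode (row) clustering idea concrete, the following is a minimal sketch of a finite mixture fitted to the rows of a binary matrix by the EM algorithm, with AIC as one possible information criterion. It is an illustration under stated assumptions, not the authors' implementation: it assumes that, within a row group, the entries are independent Bernoulli variables with a separate success probability per column. The names `fit_row_mixture`, `aic`, `init_theta` and the toy matrix `Y` are hypothetical.

```python
import math
import random


def fit_row_mixture(Y, R, n_iter=100, init_theta=None, seed=0, eps=1e-6):
    """EM for a mixture of independent Bernoullis over the rows of Y.

    Y is a list of n rows, each a list of p zeros and ones. Returns the
    mixing proportions pi, the R x p success probabilities theta, the
    n x R posterior membership probabilities z from the last E-step,
    and the log-likelihood recorded at each E-step.
    """
    n, p = len(Y), len(Y[0])
    rng = random.Random(seed)
    # Assumption: random starting values unless the caller supplies them.
    theta = init_theta or [[rng.uniform(0.25, 0.75) for _ in range(p)]
                           for _ in range(R)]
    theta = [row[:] for row in theta]  # copy so the caller's list is untouched
    pi = [1.0 / R] * R
    trace = []
    for _ in range(n_iter):
        # E-step: posterior (fuzzy) membership of each row, in log space.
        z, loglik = [], 0.0
        for y in Y:
            logw = [math.log(pi[r]) + sum(
                        math.log(theta[r][j] if y[j] else 1.0 - theta[r][j])
                        for j in range(p))
                    for r in range(R)]
            m = max(logw)
            w = [math.exp(v - m) for v in logw]
            s = sum(w)
            z.append([v / s for v in w])
            loglik += m + math.log(s)
        trace.append(loglik)
        # M-step: weighted proportions, kept strictly inside (0, 1).
        for r in range(R):
            weight = sum(z[i][r] for i in range(n))
            pi[r] = max(weight / n, eps)
            for j in range(p):
                hits = sum(z[i][r] * Y[i][j] for i in range(n))
                theta[r][j] = min(max(hits / max(weight, eps), eps), 1.0 - eps)
        total = sum(pi)
        pi = [v / total for v in pi]
    return pi, theta, z, trace


def aic(loglik, R, p):
    """AIC with (R - 1) free mixing proportions and R * p probabilities."""
    return -2.0 * loglik + 2.0 * ((R - 1) + R * p)


# Toy presence/absence matrix: two groups of sites with disjoint species.
Y = [[1, 1, 1, 0, 0, 0]] * 4 + [[0, 0, 0, 1, 1, 1]] * 4
pi, theta, z, trace = fit_row_mixture(Y, R=2)
print([max(range(2), key=lambda r: zi[r]) for zi in z])  # hard assignments
print(aic(trace[-1], R=2, p=6))
```

The posterior probabilities in `z` are the fuzzy memberships; taking the largest entry per row gives a hard allocation. Comparing `aic` across values of `R` is one simple way to choose the number of row groups, and EM is sensitive to starting values, so several seeds are usually worth trying.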

Suggested Citation

  • Pledger, Shirley & Arnold, Richard, 2014. "Multivariate methods using mixtures: Correspondence analysis, scaling and pattern-detection," Computational Statistics & Data Analysis, Elsevier, vol. 71(C), pages 241-261.
  • Handle: RePEc:eee:csdana:v:71:y:2014:i:c:p:241-261
    DOI: 10.1016/j.csda.2013.05.013

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947313001862
    Download Restriction: Full text for ScienceDirect subscribers only.


    References listed on IDEAS

    1. van de Geer, Sara, 2003. "Asymptotic theory for maximum likelihood in nonparametric mixture models," Computational Statistics & Data Analysis, Elsevier, vol. 41(3-4), pages 453-464, January.
    2. Bohning, Dankmar & Seidel, Wilfried & Alfo, Marco & Garel, Bernard & Patilea, Valentin & Walther, Gunther, 2007. "Advances in Mixture Models," Computational Statistics & Data Analysis, Elsevier, vol. 51(11), pages 5205-5210, July.
    3. Schlattmann, Peter, 2003. "Estimating the number of components in a finite mixture model: the special case of homogeneity," Computational Statistics & Data Analysis, Elsevier, vol. 41(3-4), pages 441-451, January.
    4. Shirley Pledger, 2000. "Unified Maximum Likelihood Estimates for Closed Capture–Recapture Models Using Mixtures," Biometrics, The International Biometric Society, vol. 56(2), pages 434-442, June.
    5. Richard Arnold & Yu Hayakawa & Paul Yip, 2010. "Capture–Recapture Estimation Using Finite Mixtures of Arbitrary Dimension," Biometrics, The International Biometric Society, vol. 66(2), pages 644-655, June.
    6. O’Hagan, Adrian & Murphy, Thomas Brendan & Gormley, Isobel Claire, 2012. "Computational aspects of fitting mixture models via the expectation–maximization algorithm," Computational Statistics & Data Analysis, Elsevier, vol. 56(12), pages 3843-3864.
    7. Dunstan, Piers K. & Foster, Scott D. & Darnell, Ross, 2011. "Model based grouping of species across environmental gradients," Ecological Modelling, Elsevier, vol. 222(4), pages 955-963.
    8. Hua Zhou & Kenneth L. Lange, 2010. "On the Bumpy Road to the Dominant Mode," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 37(4), pages 612-631, December.
    9. G. J. McLachlan, 1987. "On Bootstrapping the Likelihood Ratio Test Statistic for the Number of Components in a Normal Mixture," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 36(3), pages 318-324, November.

    Citations



    Cited by:

    1. Fernández, D. & Arnold, R. & Pledger, S., 2016. "Mixture-based clustering for the ordered stereotype model," Computational Statistics & Data Analysis, Elsevier, vol. 93(C), pages 46-75.
    2. Eleni Matechou & Ivy Liu & Daniel Fernández & Miguel Farias & Bergljot Gjelsvik, 2016. "Biclustering Models for Two-Mode Ordinal Data," Psychometrika, Springer;The Psychometric Society, vol. 81(3), pages 611-624, September.
    3. Scott D. Foster & Nicole A. Hill & Mitchell Lyons, 2017. "Ecological grouping of survey sites when sampling artefacts are present," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 66(5), pages 1031-1047, November.
    4. Daniel Fernández & Radim J. Sram & Miroslav Dostal & Anna Pastorkova & Hans Gmuender & Hyunok Choi, 2018. "Modeling Unobserved Heterogeneity in Susceptibility to Ambient Benzo[a]pyrene Concentration among Children with Allergic Asthma Using an Unsupervised Learning Algorithm," IJERPH, MDPI, vol. 15(1), pages 1-18, January.
    5. D. Fernández & S. Pledger, 2016. "Categorising Count Data into Ordinal Responses with Application to Ecological Communities," Journal of Agricultural, Biological and Environmental Statistics, Springer;The International Biometric Society;American Statistical Association, vol. 21(2), pages 348-362, June.
    6. Tatjana Miljkovic & Daniel Fernández, 2018. "On Two Mixture-Based Clustering Approaches Used in Modeling an Insurance Portfolio," Risks, MDPI, vol. 6(2), pages 1-18, May.
    7. Daniel Fernández & Richard Arnold & Shirley Pledger & Ivy Liu & Roy Costilla, 2019. "Finite mixture biclustering of discrete type multivariate data," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 13(1), pages 117-143, March.
    8. M. P. B. Gallaugher & C. Biernacki & P. D. McNicholas, 2023. "Parameter-wise co-clustering for high-dimensional data," Computational Statistics, Springer, vol. 38(3), pages 1597-1619, September.
    9. Christian Carmona & Luis Nieto-Barajas & Antonio Canale, 2019. "Model-based approach for household clustering with mixed scale variables," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 13(2), pages 559-583, June.
    10. Jacques, Julien & Biernacki, Christophe, 2018. "Model-based co-clustering for ordinal data," Computational Statistics & Data Analysis, Elsevier, vol. 123(C), pages 101-115.
    11. Emilio Carrizosa & Vanesa Guerrero & Dolores Romero Morales, 2023. "On mathematical optimization for clustering categories in contingency tables," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 17(2), pages 407-429, June.
    12. Álvarez de Toledo, Pablo & Núñez, Fernando & Usabiaga, Carlos, 2018. "Matching and clustering in square contingency tables. Who matches with whom in the Spanish labour market," Computational Statistics & Data Analysis, Elsevier, vol. 127(C), pages 135-159.
    13. Hui, Francis K.C., 2017. "Model-based simultaneous clustering and ordination of multivariate abundance data in ecology," Computational Statistics & Data Analysis, Elsevier, vol. 105(C), pages 1-10.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Fernández, D. & Arnold, R. & Pledger, S., 2016. "Mixture-based clustering for the ordered stereotype model," Computational Statistics & Data Analysis, Elsevier, vol. 93(C), pages 46-75.
    2. Daniel Fernández & Richard Arnold & Shirley Pledger & Ivy Liu & Roy Costilla, 2019. "Finite mixture biclustering of discrete type multivariate data," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 13(1), pages 117-143, March.
    3. Adrian O’Hagan & Arthur White, 2019. "Improved model-based clustering performance using Bayesian initialization averaging," Computational Statistics, Springer, vol. 34(1), pages 201-231, March.
    4. O’Hagan, Adrian & Murphy, Thomas Brendan & Gormley, Isobel Claire, 2012. "Computational aspects of fitting mixture models via the expectation–maximization algorithm," Computational Statistics & Data Analysis, Elsevier, vol. 56(12), pages 3843-3864.
    5. Álvarez de Toledo, Pablo & Núñez, Fernando & Usabiaga, Carlos, 2018. "Matching and clustering in square contingency tables. Who matches with whom in the Spanish labour market," Computational Statistics & Data Analysis, Elsevier, vol. 127(C), pages 135-159.
    6. Roy Costilla & Ivy Liu & Richard Arnold & Daniel Fernández, 2019. "Bayesian model-based clustering for longitudinal ordinal data," Computational Statistics, Springer, vol. 34(3), pages 1015-1038, September.
    7. Bohning, Dankmar & Seidel, Wilfried, 2003. "Editorial: recent developments in mixture models," Computational Statistics & Data Analysis, Elsevier, vol. 41(3-4), pages 349-357, January.
    8. Daniel Fernández & Radim J. Sram & Miroslav Dostal & Anna Pastorkova & Hans Gmuender & Hyunok Choi, 2018. "Modeling Unobserved Heterogeneity in Susceptibility to Ambient Benzo[a]pyrene Concentration among Children with Allergic Asthma Using an Unsupervised Learning Algorithm," IJERPH, MDPI, vol. 15(1), pages 1-18, January.
    9. Durio, A. & Isaia, E.D., 2007. "A quick procedure for model selection in the case of mixture of normal densities," Computational Statistics & Data Analysis, Elsevier, vol. 51(12), pages 5635-5643, August.
    10. Dankmar Böhning & Ronny Kuhnert, 2006. "Equivalence of Truncated Count Mixture Distributions and Mixtures of Truncated Count Distributions," Biometrics, The International Biometric Society, vol. 62(4), pages 1207-1215, December.
    11. Fetene B. Tekle & Dereje W. Gudicha & Jeroen K. Vermunt, 2016. "Power analysis for the bootstrap likelihood ratio test for the number of classes in latent class models," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 10(2), pages 209-224, June.
    12. Paul S. F. Yip & Hua-Zhen Lin & Liqun Xi, 2005. "A Semiparametric Method for Estimating Population Size for Capture–Recapture Experiments with Random Covariates in Continuous Time," Biometrics, The International Biometric Society, vol. 61(4), pages 1085-1092, December.
    13. Chang Xuan Mao & Na You, 2009. "On Comparison of Mixture Models for Closed Population Capture–Recapture Studies," Biometrics, The International Biometric Society, vol. 65(2), pages 547-553, June.
    14. Rufo, M.J. & Pérez, C.J. & Martín, J., 2009. "Local parametric sensitivity for mixture models of lifetime distributions," Reliability Engineering and System Safety, Elsevier, vol. 94(7), pages 1238-1244.
    15. Xu, Wenjing & Pan, Qing & Gastwirth, Joseph L., 2014. "Cox proportional hazards models with frailty for negatively correlated employment processes," Computational Statistics & Data Analysis, Elsevier, vol. 70(C), pages 295-307.
    16. Ben C. Stevenson & Rachel M. Fewster & Koustubh Sharma, 2022. "Spatial correlation structures for detections of individuals in spatial capture–recapture models," Biometrics, The International Biometric Society, vol. 78(3), pages 963-973, September.
    17. Hajo Holzmann & Axel Munk & Walter Zucchini, 2006. "On Identifiability in Capture–Recapture Models," Biometrics, The International Biometric Society, vol. 62(3), pages 934-936, September.
    18. Yuan Liu & Hongyun Liu, 2019. "Effects of Distance and Shape on the Estimation of the Piecewise Growth Mixture Model," Journal of Classification, Springer;The Classification Society, vol. 36(3), pages 659-677, October.
    19. Park, Seong C. & Brorsen, B. Wade & Stoecker, Arthur L. & Hattey, Jeffory A., 2012. "Forage Response to Swine Effluent: A Cox Nonnested Test of Alternative Functional Forms Using a Fast Double Bootstrap," Journal of Agricultural and Applied Economics, Cambridge University Press, vol. 44(4), pages 593-606, November.
    20. Manisera, Marica & Zuccolotto, Paola, 2014. "Modeling rating data with Nonlinear CUB models," Computational Statistics & Data Analysis, Elsevier, vol. 78(C), pages 100-118.
