A Simple Permutation Test for Clusteredness

Author

Listed:

Michael Greenacre

Registered:

Michael John Greenacre

Abstract

Hierarchical clustering is a popular method for finding structure in multivariate data, resulting in a binary tree constructed on the particular objects of the study, usually sampling units. The user faces the decision where to cut the binary tree in order to determine the number of clusters to interpret and there are various ad hoc rules for arriving at a decision. A simple permutation test is presented that diagnoses whether non-random levels of clustering are present in the set of objects and, if so, indicates the specific level at which the tree can be cut. The test is validated against random matrices to verify the type I error probability and a power study is performed on data sets with known clusteredness to study the type II error.
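The abstract does not spell out the test statistic or the permutation scheme, so the following is only a minimal sketch of how a permutation test on a clustering tree could look, not the paper's procedure. It makes three assumptions: complete linkage on Euclidean distances, a null distribution obtained by shuffling each variable (column) independently across objects, and an unusually low node height taken as evidence of clustering at that node. The function names `node_heights`, `permute_columns`, and `node_pvalues` are hypothetical.

```python
import random
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric rows."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def node_heights(rows):
    """Naive complete-linkage clustering; returns the n-1 merge heights in merge order."""
    n = len(rows)
    dist = {(i, j): euclidean(rows[i], rows[j]) for i, j in combinations(range(n), 2)}
    clusters = [[i] for i in range(n)]
    heights = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # Complete linkage: cluster distance is the largest pairwise distance.
            h = max(dist[min(i, j), max(i, j)] for i in clusters[a] for j in clusters[b])
            if best is None or h < best[0]:
                best = (h, a, b)
        h, a, b = best
        heights.append(h)
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe because a < b
    return heights


def permute_columns(rows, rng):
    """Shuffle each column independently (assumed null model: no joint structure)."""
    cols = [list(c) for c in zip(*rows)]
    for c in cols:
        rng.shuffle(c)
    return [list(r) for r in zip(*cols)]


def node_pvalues(rows, n_perm=199, seed=0):
    """Per-node p-values: share of trees (observed included) with a height <= observed."""
    rng = random.Random(seed)
    observed = node_heights(rows)
    counts = [1] * len(observed)  # the observed tree counts as one permutation
    for _ in range(n_perm):
        permuted = node_heights(permute_columns(rows, rng))
        for k, (obs_h, perm_h) in enumerate(zip(observed, permuted)):
            if perm_h <= obs_h:
                counts[k] += 1
    return [c / (n_perm + 1) for c in counts]


# Toy data (illustrative only): two tight groups of six objects, far apart.
group = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2), (0.4, 0.4), (0.2, 0.5)]
toy = [list(p) for p in group] + [[x + 10, y + 10] for x, y in group]
pvalues = node_pvalues(toy, n_perm=99, seed=1)
```

Under these assumptions, `pvalues[-2]` refers to the node at which three clusters become two; a small value there would suggest cutting the tree into two clusters. Running the same function on matrices of pure noise and counting how often a node falls below the chosen significance level gives a rough check of the type I error, in the spirit of the validation the abstract describes.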

Suggested Citation

Michael Greenacre, 2015. "A Simple Permutation Test for Clusteredness," Working Papers 555, Barcelona School of Economics.

Handle: RePEc:bge:wpaper:555

Other versions of this item:

Michael Greenacre, 2011. "A simple permutation test for clusteredness," Economics Working Papers 1271, Department of Economics and Business, Universitat Pompeu Fabra.

References listed on IDEAS

Michael Greenacre, 2008. "Correspondence analysis of raw data," Economics Working Papers 1112, Department of Economics and Business, Universitat Pompeu Fabra, revised Jul 2009.
Gordon, A. D., 1994. "Identifying genuine clusters in a classification," Computational Statistics & Data Analysis, Elsevier, vol. 18(5), pages 561-581, December.

Citations

Citations are extracted by the CitEc Project.

Cited by:

Christian Haedo & Michel Mouchart, 2022. "Two-mode clustering through profiles of regions and sectors," Empirical Economics, Springer, vol. 63(4), pages 1971-1996, October.
Lucie Aulus-Giacosa & Sébastien Ollier & Cleo Bertelsmeier, 2024. "Non-native ants are breaking down biogeographic boundaries and homogenizing community assemblages," Nature Communications, Nature, vol. 15(1), pages 1-11, December.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Jacqueline Meulman, 1996. "Fitting a distance model to homogeneous subsets of variables: Points of view analysis of categorical data," Journal of Classification, Springer;The Classification Society, vol. 13(2), pages 249-266, September.
Eric Beh & Luigi D’Ambra, 2009. "Some Interpretative Tools for Non-Symmetrical Correspondence Analysis," Journal of Classification, Springer;The Classification Society, vol. 26(1), pages 55-76, April.
Mulquin, Marie-Eve & Siaens, Corinne & Wodon, Quentin, 1998. "Les restaurants du coeur : pour qui et pourquoi ? [Food Aid for the Poor or Social Support? Case Study on a Belgian Social Restaurant]," MPRA Paper 10504, University Library of Munich, Germany.
Rosaria Lombardo & Jacqueline Meulman, 2010. "Multiple Correspondence Analysis via Polynomial Transformations of Ordered Categorical Variables," Journal of Classification, Springer;The Classification Society, vol. 27(2), pages 191-210, September.
David Bholat & Stephen Hans & Pedro Santos & Cheryl Schonhardt-Bailey, 2015. "Text mining for central banks," Handbooks, Centre for Central Banking Studies, Bank of England, number 33, April.
Michael Greenacre, 2012. "Fuzzy coding in constrained ordinations," Economics Working Papers 1325, Department of Economics and Business, Universitat Pompeu Fabra.
- Michael Greenacre, 2015. "Fuzzy Coding in Constrained Ordinations," Working Papers 640, Barcelona School of Economics.
Shizuhiko Nishisato, 1996. "Reviews," Psychometrika, Springer;The Psychometric Society, vol. 61(2), pages 391-393, June.
Harvey Goldstein, 1987. "The choice of constraints in correspondence analysis," Psychometrika, Springer;The Psychometric Society, vol. 52(2), pages 207-215, June.
Alfonso Gambardella & Walter Garcia Fontes, 1996. "European research funding and regional technological capabilities: Network composition analysis," Economics Working Papers 174, Department of Economics and Business, Universitat Pompeu Fabra.
Antoine Falguerolles & Said Jmel & Joe Whittaker, 1995. "Correspondence analysis and association models constrained by a conditional independence graph," Psychometrika, Springer;The Psychometric Society, vol. 60(2), pages 161-180, June.
Paul Green & Jonathan Kim & Frank Carmone, 1990. "A preliminary study of optimal variable weighting in k-means clustering," Journal of Classification, Springer;The Classification Society, vol. 7(2), pages 271-285, September.
Ruben Konig, 2010. "Changing social categories in a changing society: studying trends with correspondence analysis," Quality & Quantity: International Journal of Methodology, Springer, vol. 44(3), pages 409-425, April.
Michael Greenacre & Shizuhiko Nishisato, 1996. "Reviews," Psychometrika, Springer;The Psychometric Society, vol. 61(1), pages 177-190, March.
John Lennon & Michael J. Keane, 2006. "Delineating Daily Activity Spaces in Rural Areas," Working Papers 0617, Rural Economy and Development Programme, Teagasc.
Peter Heijden & Jan Leeuw, 1985. "Correspondence analysis used complementary to loglinear analysis," Psychometrika, Springer;The Psychometric Society, vol. 50(4), pages 429-447, December.
Ganiere, Pierre & Chern, Wen S. & Hahn, David E. & Chiang, Fu-Sung, 2004. "Consumer Attitudes towards Genetically Modified Foods in Emerging Markets: The Impact of Labeling in Taiwan," International Food and Agribusiness Management Review, International Food and Agribusiness Management Association, vol. 7(3), pages 1-20.
Michael J. Greenacre & Patrick J. F. Groenen, 2016. "Weighted Euclidean Biplots," Journal of Classification, Springer;The Classification Society, vol. 33(3), pages 442-459, October.
- Michael Greenacre & Patrick J. F. Groenen, 2013. "Weighted Euclidean biplots," Economics Working Papers 1380, Department of Economics and Business, Universitat Pompeu Fabra.
- Patrick J. F. Groenen & Michael Greenacre, 2015. "Weighted Euclidean Biplots," Working Papers 708, Barcelona School of Economics.
Padmini Desikachar & Brinda Viswanathan, 2012. "Patterns of Labour Market Insecurity in Rural India: A Multidimensional and Multivariate Analysis," Working Papers id:4901, eSocialSciences.
Francis Munier, 2006. "Firm size, technological intensity of sector and relational competencies to innovate: Evidence from French industrial innovating firms," Economics of Innovation and New Technology, Taylor & Francis Journals, vol. 15(4-5), pages 493-505.
Julie Josse & Marie Chavent & Benoît Liquet & François Husson, 2012. "Handling Missing Values with Regularized Iterative Multiple Correspondence Analysis," Journal of Classification, Springer;The Classification Society, vol. 29(1), pages 91-116, April.

More about this item

JEL classification:

C19 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Other
C88 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Other Computer Software
