A Simple Permutation Test for Clusteredness

Author

Listed:

Michael Greenacre

Registered:

Michael John Greenacre

Abstract

Hierarchical clustering is a popular method for finding structure in multivariate data. It produces a binary tree constructed on the objects of the study, usually the sampling units. The user must decide where to cut the tree in order to determine the number of clusters to interpret, and there are various ad hoc rules for arriving at this decision. A simple permutation test is presented that diagnoses whether non-random levels of clustering are present in the set of objects and, if so, indicates the specific level at which the tree can be cut. The test is validated against random matrices to verify the type I error probability, and a power study is performed on data sets with known clusteredness to study the type II error.
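
The abstract does not spell out the permutation scheme or the test statistic, so the following Python sketch is only an illustration of the general idea and should not be read as the paper's procedure. It assumes complete-linkage clustering on Euclidean distances, a null distribution obtained by shuffling each column of the data matrix independently, and a statistic equal to the height of the merge that completes the k-cluster partition (unusually low heights suggest tight clusters). The names `merge_heights`, `permute_columns`, `clusteredness_p_value`, and `demo_rows` are hypothetical.

```python
import math
import random


def merge_heights(rows):
    """Complete-linkage clustering; return the n-1 merge heights in merge order."""
    n = len(rows)
    # Distances between currently active clusters, keyed by (smaller id, larger id).
    dist = {(i, j): math.dist(rows[i], rows[j])
            for i in range(n) for j in range(i + 1, n)}
    active = list(range(n))
    heights = []
    next_id = n
    while len(active) > 1:
        # Merge the closest pair of active clusters.
        (a, b), h = min(dist.items(), key=lambda kv: kv[1])
        heights.append(h)
        del dist[(a, b)]
        active.remove(a)
        active.remove(b)
        # Complete linkage: distance to the merged cluster is the larger of the two.
        for c in active:
            d_ac = dist.pop((min(a, c), max(a, c)))
            d_bc = dist.pop((min(b, c), max(b, c)))
            dist[(c, next_id)] = max(d_ac, d_bc)
        active.append(next_id)
        next_id += 1
    return heights


def permute_columns(rows, rng):
    """Shuffle each column independently (assumed null model, not from the paper)."""
    cols = [list(col) for col in zip(*rows)]
    for col in cols:
        rng.shuffle(col)
    return [list(r) for r in zip(*cols)]


def clusteredness_p_value(rows, k, n_perm=199, seed=0):
    """One-sided permutation p-value for the cut into k clusters (1 <= k < n)."""
    rng = random.Random(seed)
    idx = len(rows) - k - 1  # index of the merge that completes the k-cluster partition
    observed = merge_heights(rows)[idx]
    as_small = sum(
        merge_heights(permute_columns(rows, rng))[idx] <= observed
        for _ in range(n_perm)
    )
    # Add-one correction keeps the p-value strictly positive.
    return (1 + as_small) / (n_perm + 1)


# Toy data (invented for illustration): two tight groups of six points each.
group_a = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.1, 0.0], [0.0, 0.3]]
group_b = [[x + 10.0, y + 10.0] for x, y in group_a]
demo_rows = group_a + group_b

p_two = clusteredness_p_value(demo_rows, k=2)
print(f"permutation p-value for a two-cluster cut: {p_two:.3f}")
```

On this toy data the two-cluster cut should yield a small p-value, because independent column shuffling almost always scatters the points across four corners and so inflates the within-cluster merge height. Repeating the call for several values of `k` gives a rough picture of which levels of the tree, if any, are lower than chance would suggest.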

Suggested Citation

Michael Greenacre, 2011. "A Simple Permutation Test for Clusteredness," Working Papers 555, Barcelona School of Economics.

Handle: RePEc:bge:wpaper:555

Other versions of this item:

Michael Greenacre, 2011. "A simple permutation test for clusteredness," Economics Working Papers 1271, Department of Economics and Business, Universitat Pompeu Fabra.

References listed on IDEAS

Michael Greenacre, 2008. "Correspondence analysis of raw data," Economics Working Papers 1112, Department of Economics and Business, Universitat Pompeu Fabra, revised Jul 2009.
Gordon, A. D., 1994. "Identifying genuine clusters in a classification," Computational Statistics & Data Analysis, Elsevier, vol. 18(5), pages 561-581, December.

Citations

Cited by:

Christian Haedo & Michel Mouchart, 2022. "Two-mode clustering through profiles of regions and sectors," Empirical Economics, Springer, vol. 63(4), pages 1971-1996, October.
Lucie Aulus-Giacosa & Sébastien Ollier & Cleo Bertelsmeier, 2024. "Non-native ants are breaking down biogeographic boundaries and homogenizing community assemblages," Nature Communications, Nature, vol. 15(1), pages 1-11, December.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Eric Beh & Luigi D’Ambra, 2009. "Some Interpretative Tools for Non-Symmetrical Correspondence Analysis," Journal of Classification, Springer;The Classification Society, vol. 26(1), pages 55-76, April.
Pilar García Gómez & Ángel López Nicolás, 2005. "Socio-economic inequalities in health in Catalonia," Hacienda Pública Española / Review of Public Economics, IEF, vol. 175(4), pages 103-121, December.
- Pilar García Gómez & Ángel López, 2004. "Socio-economic inequalities in health in Catalonia," Working Papers, Research Center on Health and Economics 758, Department of Economics and Business, Universitat Pompeu Fabra, revised Oct 2005.
- Pilar García Gómez & Ángel López, 2004. "Socio-economic inequalities in health in Catalonia," Economics Working Papers 758, Department of Economics and Business, Universitat Pompeu Fabra, revised Oct 2005.
David Bholat & Stephen Hans & Pedro Santos & Cheryl Schonhardt-Bailey, 2015. "Text mining for central banks," Handbooks, Centre for Central Banking Studies, Bank of England, number 33, April.
Michael Greenacre, 2012. "Fuzzy coding in constrained ordinations," Economics Working Papers 1325, Department of Economics and Business, Universitat Pompeu Fabra.
- Michael Greenacre, 2012. "Fuzzy Coding in Constrained Ordinations," Working Papers 640, Barcelona School of Economics.
Rémi Bazillier & Nicolas Sirven, 2006. "Les normes fondamentales du travail contribuent-elles à réduire les inégalités ?," Revue Française d'Économie, Programme National Persée, vol. 21(2), pages 111-146.
- Rémi Bazillier & Nicolas Sirven, 2006. "Les normes fondamentales du travail contribuent-elles à réduire les inégalités ?," Documents de travail 123, Groupe d'Economie du Développement de l'Université Montesquieu Bordeaux IV.
- Rémi Bazillier & Nicolas Sirven, 2006. "Les normes fondamentales du travail contribuent-elles à réduire les inégalités ?," Post-Print halshs-00112316, HAL.
- Rémi Bazillier & Nicolas Sirven, 2006. "Les normes fondamentales du travail contribuent-elles à réduire les inégalités ?," Cahiers de la Maison des Sciences Economiques bla06016, Université Panthéon-Sorbonne (Paris 1).
- Rémi Bazillier & Nicolas Sirven, 2006. "Les normes fondamentales du travail contribuent-elles à réduire les inégalités ?," Université Paris1 Panthéon-Sorbonne (Post-Print and Working Papers) halshs-00112316, HAL.
Alfonso Gambardella & Walter Garcia Fontes, 1996. "European research funding and regional technological capabilities: Network composition analysis," Economics Working Papers 174, Department of Economics and Business, Universitat Pompeu Fabra.
Paul Green & Jonathan Kim & Frank Carmone, 1990. "A preliminary study of optimal variable weighting in k-means clustering," Journal of Classification, Springer;The Classification Society, vol. 7(2), pages 271-285, September.
Michael J. Greenacre & Patrick J. F. Groenen, 2016. "Weighted Euclidean Biplots," Journal of Classification, Springer;The Classification Society, vol. 33(3), pages 442-459, October.
- Michael Greenacre & Patrick J. F. Groenen, 2013. "Weighted Euclidean biplots," Economics Working Papers 1380, Department of Economics and Business, Universitat Pompeu Fabra.
- Michael Greenacre & Patrick J.F. Groenen, 2013. "Weighted Euclidean Biplots," Working Papers 708, Barcelona School of Economics.
Malcolm Dow & Peter Willett & Roderick McDonald & Belver Griffith & Michael Greenacre & Peter Bryant & Daniel Wartenberg & Ove Frank, 1987. "Book reviews," Journal of Classification, Springer;The Classification Society, vol. 4(2), pages 245-278, September.
Vartan Choulakian, 1988. "Exploratory analysis of contingency tables by loglinear formulation and generalizations of correspondence analysis," Psychometrika, Springer;The Psychometric Society, vol. 53(2), pages 235-250, June.
W. Krzanowski & Gregory Cermak & Jan Leeuw & Fionn Murtagh & Peter Bryant & Bernard Monjardet & Chikio Hayashi, 1985. "Book reviews," Journal of Classification, Springer;The Classification Society, vol. 2(1), pages 277-299, December.
François Bavaud, 2011. "On the Schoenberg Transformations in Data Analysis: Theory and Illustrations," Journal of Classification, Springer;The Classification Society, vol. 28(3), pages 297-314, October.
Maura Vásquez & Guillermo Ramírez & Alberto Camardiel & Tomás Aluja, 2008. "A Biplot graphical tool to model the relationships between two sets of variables," Economía, Instituto de Investigaciones Económicas y Sociales (IIES). Facultad de Ciencias Económicas y Sociales. Universidad de Los Andes. Mérida, Venezuela, vol. 33(25), pages 117-130, january-j.
Jurlin, Kresimir & Malekovic, Sanja & Puljiz, Jaksa & Cziraky, Dario & Polic, Mario, 2002. "Covariance structure analysis of regional development data: an application to municipality development assessment," ERSA conference papers ersa02p469, European Regional Science Association.
Robert Boik, 1996. "An efficient algorithm for joint correspondence analysis," Psychometrika, Springer;The Psychometric Society, vol. 61(2), pages 255-269, June.
Jos Berge, 1995. "Review," Psychometrika, Springer;The Psychometric Society, vol. 60(2), pages 313-315, June.
Evert Meijers, 2005. "High-level consumer services in polycentric urban regions - hospital care and higher education between duplication and complementarity," ERSA conference papers ersa05p208, European Regional Science Association.
Laurent Lesnard & Thibaut Saint Pol, 2009. "Patterns of Workweek Schedules in France," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 93(1), pages 171-176, August.
Warrens, Matthijs J. & Heiser, Willem J., 2009. "Diagnostics for regression dependence in tables re-ordered by the dominant correspondence analysis solution," Computational Statistics & Data Analysis, Elsevier, vol. 53(8), pages 3139-3144, June.
Nappi-Choulet, Ingrid & Décamps, Aurélien, 2011. "Is Sustainability Attractive for Corporate Real Estate Decisions ?," ESSEC Working Papers WP1106, ESSEC Research Center, ESSEC Business School.
- Ingrid Nappi-Choulet & Aurélien Décamps, 2011. "Is Sustainability Attractive for Corporate Real Estate Decisions ?," Post-Print hal-00609149, HAL.
- Ingrid Nappi-Choulet & Aurélien Décamps, 2011. "Is Sustainability Attractive for Corporate Real Estate Decisions?," ERES eres2011_40, European Real Estate Society (ERES).

More about this item

Keywords

Hierarchical clustering; distance; permutation test

JEL classification:

C19 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Other
C88 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Other Computer Software
