A Simple Permutation Test for Clusteredness
Abstract
Hierarchical clustering is a popular method for finding structure in multivariate data, resulting in a binary tree constructed on the particular objects of the study, usually sampling units. The user faces the decision where to cut the binary tree in order to determine the number of clusters to interpret, and there are various ad hoc rules for arriving at a decision. A simple permutation test is presented that diagnoses whether non-random levels of clustering are present in the set of objects and, if so, indicates the specific level at which the tree can be cut. The test is validated against random matrices to verify the type I error probability and a power study is performed on data sets with known clusteredness to study the type II error.
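The abstract does not give the details of the test, so the following is only an illustrative sketch of how a permutation test of this general kind could be set up, not the paper's procedure. Everything specific here is an assumption: the null hypothesis is simulated by shuffling each variable (column) independently, the tree is built with complete linkage on Euclidean distances, and the test statistic is the largest jump between consecutive merge heights divided by the top height. The function names `merge_heights`, `largest_jump` and `permutation_test` are hypothetical.

```python
import math
import random


def merge_heights(points):
    """Naive complete-linkage clustering; returns the n-1 merge heights in order."""
    clusters = [[i] for i in range(len(points))]
    heights = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: cluster distance is the largest pairwise distance.
                d = max(math.dist(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        heights.append(d)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return heights


def largest_jump(heights):
    """Assumed statistic: biggest gap between consecutive merge heights,
    scaled by the top height, plus the cluster count implied by cutting there.
    Needs at least 3 objects that are not all identical."""
    gaps = [b - a for a, b in zip(heights, heights[1:])]
    i = max(range(len(gaps)), key=gaps.__getitem__)
    n_objects = len(heights) + 1
    return gaps[i] / heights[-1], n_objects - (i + 1)


def permutation_test(points, n_perm=199, seed=0):
    """Compare the observed statistic with its distribution under
    independent shuffling of each column (an assumed null model)."""
    rng = random.Random(seed)
    observed, n_clusters = largest_jump(merge_heights(points))
    columns = [list(col) for col in zip(*points)]
    exceed = 0
    for _ in range(n_perm):
        shuffled = [rng.sample(col, len(col)) for col in columns]
        stat, _ = largest_jump(merge_heights(list(zip(*shuffled))))
        exceed += stat >= observed
    return (1 + exceed) / (1 + n_perm), n_clusters


# Toy data (made up for illustration): two well-separated groups of 8 objects.
gen = random.Random(1)
data = [(gen.gauss(0, 0.5), gen.gauss(0, 0.5)) for _ in range(8)]
data += [(gen.gauss(10, 0.5), gen.gauss(10, 0.5)) for _ in range(8)]

p_value, n_clusters = permutation_test(data)
print(p_value, n_clusters)
```

A small p-value suggests that the tree has more structure than column-shuffled data would produce, and `n_clusters` is the cut implied by the largest jump. Under this assumed null model, groups that differ on only one variable would survive the shuffling, so the sketch has little power in that case.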
Bibliographic Info
Paper provided by Barcelona Graduate School of Economics in its series Working Papers with number 555.
Date of creation: Apr 2011
Keywords: Hierarchical clustering; distance; permutation test
JEL classification:
- C19 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Other
- C88 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Other Computer Software