Weighting Distance Matrices Using Rank Correlations
Abstract: In a number of applications of multivariate analysis, the data matrix is not fully observed. Instead, a set of distance matrices on the same entities is available. A reasonable strategy to construct a global distance matrix is to compute a weighted average of the partial distance matrices, provided that an appropriate system of weights can be defined. The Distatis method developed by Abdi et al. (2005) is a three-step procedure for computing the global distance matrix. An important aspect of that procedure is the computation of the vector correlation coefficient (RV) to measure the similarity between partial distance matrices. The RV coefficient is based on the Pearson product-moment correlation coefficient, which is highly sensitive to outliers. We are convinced that, in many measurable phenomena, the relationships between distances are far more likely to be ordinal than interval in nature, and it is therefore preferable to adopt an approach appropriate to ordinal data. The goal of our paper is to revise the system of weights of the Distatis procedure by replacing the conventional Pearson coefficient with rank correlations that are less affected by measurement errors, perturbations, or outliers in the data. In the light of our findings on real and simulated data sets, we recommend the use of a specific coefficient of rank correlation to replace, where necessary, the conventional vector correlation.
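The weighting idea described in the abstract can be illustrated with a small sketch. It is not the authors' procedure: it assumes Spearman's rank correlation computed on the off-diagonal entries of each matrix as a stand-in for the vector rank correlation (the abstract does not name the recommended coefficient), and it takes the weights from the leading eigenvector of the resulting similarity matrix, rescaled to sum to one. All function names and the toy matrices are hypothetical.

```python
from itertools import combinations

def upper_triangle(d):
    """Flatten the strict upper triangle of a square dissimilarity matrix."""
    return [d[i][j] for i, j in combinations(range(len(d)), 2)]

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = mean_rank
        i = j + 1
    return r

def pearson(x, y):
    """Pearson correlation; assumes neither vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))

def leading_eigenvector(m, iters=200):
    """Power iteration, rescaled to sum to 1.

    Assumption: all entries of m are positive, so the leading
    eigenvector has entries of a single sign.
    """
    v = [1.0] * len(m)
    for _ in range(iters):
        w = [sum(row[j] * v[j] for j in range(len(v))) for row in m]
        total = sum(w)
        v = [a / total for a in w]
    return v

def compromise(matrices):
    """Weights from rank-correlation similarities, then a weighted average."""
    vecs = [upper_triangle(d) for d in matrices]
    sim = [[spearman(a, b) for b in vecs] for a in vecs]
    weights = leading_eigenvector(sim)
    n = len(matrices[0])
    avg = [[sum(w * d[i][j] for w, d in zip(weights, matrices))
            for j in range(n)] for i in range(n)]
    return weights, avg

# Toy data: D2 is a monotone transform of D1 (same ranking of pairs),
# while D3 orders the pairs somewhat differently.
D1 = [[0, 1, 3, 6], [1, 0, 2, 5], [3, 2, 0, 3], [6, 5, 3, 0]]
D2 = [[0, 1, 9, 36], [1, 0, 4, 25], [9, 4, 0, 9], [36, 25, 9, 0]]
D3 = [[0, 7, 8, 10], [7, 0, 6, 11], [8, 6, 0, 9], [10, 11, 9, 0]]

weights, global_d = compromise([D1, D2, D3])
print(weights)
```

In this toy case D1 and D2 receive equal weights because a rank correlation ignores the monotone rescaling between them, while D3, which agrees less with the other two, receives a smaller weight. A Pearson-based similarity would instead penalise D2 for its nonlinear scale.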
Bibliographic Info: Paper provided by Università della Calabria, Dipartimento di Economia, Statistica e Finanza (Ex Dipartimento di Economia e Statistica) in its series Working Papers with number 201209.
Length: 19 pages
Date of creation: Dec 2012
Contact details of provider:
Postal: Università della Calabria, Dipartimento di Economia, Statistica e Finanza, Ponte Pietro Bucci, Cubo 0/C, I-87036 Arcavacata di Rende, CS, Italy
Phone: +39 0984 492413
Fax: +39 0984 492421
Web page: http://www.unical.it/portale/strutture/dipartimenti_240/disesf/
Keywords: Distatis; Ordinal data; Vector rank correlation