
Weighting Distance Matrices Using Rank Correlations

Author

Listed:
  • Ilaria Lucrezia Amerise
  • Agostino Tarsitano

    (Dipartimento di Economia e Statistica, Università della Calabria)

Abstract

In a number of applications of multivariate analysis, the data matrix is not fully observed. Instead, a set of distance matrices on the same entities is available. A reasonable strategy for constructing a global distance matrix is to compute a weighted average of the partial distance matrices, provided that an appropriate system of weights can be defined. The Distatis method developed by Abdi et al. (2005) is a three-step procedure for computing the global distance matrix. An important aspect of that procedure is the computation of the vector correlation coefficient (RV), which measures the similarity between partial distance matrices. The RV coefficient is based on the Pearson product-moment correlation coefficient, which is highly sensitive to outliers. We are convinced that, in many measurable phenomena, the relationships between distances are far more likely to be ordinal than interval in nature, and that it is therefore preferable to adopt an approach appropriate to ordinal data. The goal of our paper is to revise the system of weights of the Distatis procedure by replacing the conventional Pearson coefficient with rank correlations, which are less affected by measurement errors, perturbations, or outliers in the data. In the light of our findings on real and simulated data sets, we recommend the use of a specific coefficient of rank correlation to replace, where necessary, the conventional vector correlation.
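The weighting idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not name the rank coefficient they recommend, so Spearman's rho is used here as an assumption, and the similarity between two matrices is computed directly on their off-diagonal distances rather than through the RV coefficient used in Distatis. All function names and the toy matrices are hypothetical.

```python
from itertools import combinations
from math import sqrt

def upper_triangle(d):
    """Flatten the strict upper triangle of a square distance matrix."""
    return [d[i][j] for i, j in combinations(range(len(d)), 2)]

def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks (assumed choice)."""
    return pearson(ranks(x), ranks(y))

def leading_eigenvector(m, iterations=200):
    """Power iteration; assumes m is symmetric with positive entries."""
    v = [1.0] * len(m)
    for _ in range(iterations):
        w = [sum(row[k] * v[k] for k in range(len(v))) for row in m]
        norm = sqrt(sum(a * a for a in w))
        v = [a / norm for a in w]
    return v

def matrix_weights(dist_matrices, similarity):
    """Weights proportional to the leading eigenvector of the
    between-matrix similarity matrix, rescaled to sum to one."""
    vecs = [upper_triangle(d) for d in dist_matrices]
    c = [[similarity(a, b) for b in vecs] for a in vecs]
    v = leading_eigenvector(c)
    total = sum(v)
    return [a / total for a in v]

def weighted_average(dist_matrices, weights):
    """Global distance matrix as a weighted average of the partial ones."""
    n = len(dist_matrices[0])
    return [[sum(w * d[i][j] for w, d in zip(weights, dist_matrices))
             for j in range(n)] for i in range(n)]

# Toy data: d2 has the same ordering of distances as d1 but one outlier.
d1 = [[0, 1, 2, 3], [1, 0, 4, 5], [2, 4, 0, 6], [3, 5, 6, 0]]
d2 = [[0, 1, 2, 3], [1, 0, 4, 5], [2, 4, 0, 60], [3, 5, 60, 0]]
d3 = [[0, 2, 1, 3], [2, 0, 4, 6], [1, 4, 0, 5], [3, 6, 5, 0]]

rank_weights = matrix_weights([d1, d2, d3], spearman)
global_matrix = weighted_average([d1, d2, d3], rank_weights)
```

In this toy example the single inflated distance in `d2` lowers its Pearson correlation with `d1`, while the Spearman correlation is unaffected because the ordering of the distances is unchanged. With the rank-based similarity, `d1` and `d2` therefore receive the same weight.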

Suggested Citation

  • Ilaria Lucrezia Amerise & Agostino Tarsitano, 2012. "Weighting Distance Matrices Using Rank Correlations," Working Papers 201209, Università della Calabria, Dipartimento di Economia, Statistica e Finanza "Giovanni Anania" - DESF.
  • Handle: RePEc:clb:wpaper:201209

    Download full text from publisher

    File URL: http://www.ecostat.unical.it/RePEc/WorkingPapers/WP09_2012.pdf
    File Function: First version, 2012-12
    Download Restriction: no

    References listed on IDEAS

    1. Francis Cailliez, 1983. "The analytical solution of the additive constant problem," Psychometrika, Springer;The Psychometric Society, vol. 48(2), pages 305-308, June.
    2. Jesper W. Schneider & Pia Borlund, 2007. "Matrix comparison, Part 1: Motivation and important issues for measuring the resemblance between proximity measures or ordination results," Journal of the American Society for Information Science and Technology, Association for Information Science & Technology, vol. 58(11), pages 1586-1595, September.
    3. Mayer Claus-Dieter & Lorent Julie & Horgan Graham W, 2011. "Exploratory Analysis of Multiple Omics Datasets Using the Adjusted RV Coefficient," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 10(1), pages 1-27, March.
    4. Vladimir Batagelj & Matevz Bren, 1995. "Comparing resemblance measures," Journal of Classification, Springer;The Classification Society, vol. 12(1), pages 73-90, March.
    5. Véronique Campbell & Pierre Legendre & François-Joseph Lapointe, 2009. "Assessing Congruence Among Ultrametric Distance Matrices," Journal of Classification, Springer;The Classification Society, vol. 26(1), pages 103-117, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Matthijs Warrens, 2008. "Bounds of Resemblance Measures for Binary (Presence/Absence) Variables," Journal of Classification, Springer;The Classification Society, vol. 25(2), pages 195-208, November.
    2. van Eck, N.J.P. & Waltman, L., 2007. "Appropriate Similarity Measures for Author Cocitation Analysis," ERIM Report Series Research in Management ERS-2007-091-LIS, Erasmus Research Institute of Management (ERIM), ERIM is the joint research institute of the Rotterdam School of Management, Erasmus University and the Erasmus School of Economics (ESE) at Erasmus University Rotterdam.
    3. Wildgaard, Lorna, 2016. "A critical cluster analysis of 44 indicators of author-level performance," Journal of Informetrics, Elsevier, vol. 10(4), pages 1055-1078.
    4. J. Fernando Vera & Rodrigo Macías, 2021. "On the Behaviour of K-Means Clustering of a Dissimilarity Matrix by Means of Full Multidimensional Scaling," Psychometrika, Springer;The Psychometric Society, vol. 86(2), pages 489-513, June.
    5. Ricotta, Carlo & Szeidl, Laszlo, 2009. "Diversity partitioning of Rao’s quadratic entropy," Theoretical Population Biology, Elsevier, vol. 76(4), pages 299-302.
    6. Matthijs Warrens, 2008. "On the Indeterminacy of Resemblance Measures for Binary (Presence/Absence) Data," Journal of Classification, Springer;The Classification Society, vol. 25(1), pages 125-136, June.
    7. Panpan Yu & Qingna Li, 2018. "Ordinal Distance Metric Learning with MDS for Image Ranking," Asia-Pacific Journal of Operational Research (APJOR), World Scientific Publishing Co. Pte. Ltd., vol. 35(01), pages 1-19, February.
    8. Fionn Murtagh, 2009. "The Remarkable Simplicity of Very High Dimensional Data: Application of Model-Based Clustering," Journal of Classification, Springer;The Classification Society, vol. 26(3), pages 249-277, December.
    9. Cornelius Fritz & Göran Kauermann, 2022. "On the interplay of regional mobility, social connectedness and the spread of COVID‐19 in Germany," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 185(1), pages 400-424, January.
    10. Papageorgiou, Ioulia & Moustaki, Irini, 2019. "Sampling of pairs in pairwise likelihood estimation for latent variable models with categorical observed variables," LSE Research Online Documents on Economics 87592, London School of Economics and Political Science, LSE Library.
    11. Martin G. Moehrle, 2010. "Measures for textual patent similarities: a guided way to select appropriate approaches," Scientometrics, Springer;Akadémiai Kiadó, vol. 85(1), pages 95-109, October.
    12. Matthijs Warrens, 2008. "On the Equivalence of Cohen’s Kappa and the Hubert-Arabie Adjusted Rand Index," Journal of Classification, Springer;The Classification Society, vol. 25(2), pages 177-183, November.
    13. Irène Gijbels & Marek Omelka, 2013. "Testing for Homogeneity of Multivariate Dispersions Using Dissimilarity Measures," Biometrics, The International Biometric Society, vol. 69(1), pages 137-145, March.
    14. Oscar Lao & Fan Liu & Andreas Wollstein & Manfred Kayser, 2014. "GAGA: A New Algorithm for Genomic Inference of Geographic Ancestry Reveals Fine Level Population Substructure in Europeans," PLOS Computational Biology, Public Library of Science, vol. 10(2), pages 1-11, February.
    15. Carlo Cavicchia & Maurizio Vichi & Giorgia Zaccaria, 2022. "Gaussian mixture model with an extended ultrametric covariance structure," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 16(2), pages 399-427, June.
    16. Yunda Wang & Qiguan Shu & Ming Chen & Xudounan Chen & Shiro Takeda & Junhua Zhang, 2022. "Selection and Application of Quantitative Indicators of Paths Based on Graph Theory: A Case Study of Traditional Private and Antique Gardens in Beijing," Land, MDPI, vol. 11(12), pages 1-21, December.
    17. Matthijs Warrens, 2009. "On Robinsonian dissimilarities, the consecutive ones property and latent variable models," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 3(2), pages 169-184, September.
    18. Law, Kris M.Y. & Breznik, Kristijan, 2018. "What do airline mission statements reveal about value and strategy?," Journal of Air Transport Management, Elsevier, vol. 70(C), pages 36-44.
    19. Copiello, Sergio, 2019. "Peer and neighborhood effects: Citation analysis using a spatial autoregressive model and pseudo-spatial data," Journal of Informetrics, Elsevier, vol. 13(1), pages 238-254.
    20. Nataliya Matveeva & Vladimir Batagelj & Anuška Ferligoj, 2023. "Scientific collaboration of post-Soviet countries: the effects of different network normalizations," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(8), pages 4219-4242, August.

    More about this item

    Keywords

    Distatis; Ordinal data; Vector rank correlation;

    NEP fields

    This paper has been announced in the following NEP Reports:

    Statistics
