Printed from https://ideas.repec.org/p/upf/upfgen/908.html

Distributional equivalence and subcompositional coherence in the analysis of contingency tables, ratio-scale measurements and compositional data

Authors

Michael Greenacre & Paul Lewi

Abstract

We consider two fundamental properties in the analysis of two-way tables of positive data: the principle of distributional equivalence, one of the cornerstones of correspondence analysis of contingency tables, and the principle of subcompositional coherence, which forms the basis of compositional data analysis. For an analysis to be subcompositionally coherent, it suffices to analyse the ratios of the data values. The usual approach to dimension reduction in compositional data analysis is to perform principal component analysis on the logarithms of ratios, but this method does not obey the principle of distributional equivalence. We show that by introducing weights for the rows and columns, the method achieves this desirable property. This weighted log-ratio analysis is theoretically equivalent to “spectral mapping”, a multivariate method developed almost 30 years ago for displaying ratio-scale data from biological activity spectra. The close relationship between spectral mapping and correspondence analysis is also explained, as well as their connection with association modelling. The weighted log-ratio methodology is applied here to frequency data in linguistics and to chemical compositional data in archaeology.
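The two principles named in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' code: it assumes that the weights are the row and column masses (marginal proportions) of the table, that the logarithms are double-centred with those weights, and that the result is decomposed by the singular value decomposition. The function name `log_ratio_analysis`, the helper `row_distances`, and the random toy table are hypothetical, introduced only for this illustration.

```python
import numpy as np


def log_ratio_analysis(N, weighted=True):
    """Log-ratio analysis of a strictly positive table N via the SVD.

    weighted=True uses row and column masses (marginal proportions) as
    weights (an assumption of this sketch); weighted=False gives every
    row and every column the same weight.
    Returns the singular values and the row principal coordinates.
    """
    N = np.asarray(N, dtype=float)
    n_rows, n_cols = N.shape
    if weighted:
        P = N / N.sum()
        r, c = P.sum(axis=1), P.sum(axis=0)
    else:
        r = np.full(n_rows, 1.0 / n_rows)
        c = np.full(n_cols, 1.0 / n_cols)
    L = np.log(N)
    # Double-centre the logs using the weighted row and column means.
    Y = L - (L @ c)[:, None] - (r @ L)[None, :] + r @ L @ c
    # Scale by the square roots of the weights, then decompose.
    S = np.sqrt(r)[:, None] * Y * np.sqrt(c)[None, :]
    U, sv, _ = np.linalg.svd(S, full_matrices=False)
    F = (U * sv) / np.sqrt(r)[:, None]  # row principal coordinates
    return sv, F


def row_distances(F):
    """Euclidean distances between all pairs of rows of F."""
    diff = F[:, None, :] - F[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


# Hypothetical toy table: 6 rows, 4 columns of positive values.
rng = np.random.default_rng(0)
base = rng.uniform(1.0, 100.0, size=(6, 4))

# "split" carries an extra column proportional to column 0;
# "merged" adds that column into column 0 instead.
split = np.column_stack([base, 2.5 * base[:, 0]])
merged = base.copy()
merged[:, 0] = base[:, 0] + 2.5 * base[:, 0]

sv_split_w, F_split_w = log_ratio_analysis(split, weighted=True)
sv_merged_w, F_merged_w = log_ratio_analysis(merged, weighted=True)
_, F_split_u = log_ratio_analysis(split, weighted=False)
_, F_merged_u = log_ratio_analysis(merged, weighted=False)

d_split_w, d_merged_w = row_distances(F_split_w), row_distances(F_merged_w)
d_split_u, d_merged_u = row_distances(F_split_u), row_distances(F_merged_u)

# Distributional equivalence: merging proportional columns should leave
# the distances between rows unchanged.
print("weighted analysis unchanged:  ", np.allclose(d_split_w, d_merged_w))
print("unweighted analysis unchanged:", np.allclose(d_split_u, d_merged_u))

# Subcompositional coherence: a ratio of two parts is the same whether it
# is computed in the full composition or in a re-closed subcomposition.
full = base / base.sum(axis=1, keepdims=True)
sub = base[:, :3] / base[:, :3].sum(axis=1, keepdims=True)
ratios_same = np.allclose(full[:, 0] / full[:, 1], sub[:, 0] / sub[:, 1])
print("ratios preserved in subcomposition:", ratios_same)
```

Merging the two proportional columns leaves the row sums, and hence the row masses, untouched, while the merged column inherits the combined mass of the two originals. In this sketch that is why the weighted row distances survive the merge, whereas equal weights count the duplicated column twice.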

Suggested Citation

  • Michael Greenacre & Paul Lewi, 2005. "Distributional equivalence and subcompositional coherence in the analysis of contingency tables, ratio-scale measurements and compositional data," Economics Working Papers 908, Department of Economics and Business, Universitat Pompeu Fabra, revised Aug 2007.
  • Handle: RePEc:upf:upfgen:908

    Download full text from publisher

    File URL: https://econ-papers.upf.edu/papers/908.pdf
    File Function: Whole Paper
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Michael Greenacre & Paul Lewi, 2009. "Distributional Equivalence and Subcompositional Coherence in the Analysis of Compositional Data, Contingency Tables and Ratio-Scale Measurements," Journal of Classification, Springer;The Classification Society, vol. 26(1), pages 29-54, April.
    2. Michael Greenacre, 2006. "Tying up the loose ends in simple correspondence analysis," Economics Working Papers 940, Department of Economics and Business, Universitat Pompeu Fabra.
    3. B. Baris Alkan & Afsin Sahin, 2011. "Measuring inequalities in the distribution of health workers by bi-plot approach: The case of Turkey," Journal of Economics and Behavioral Studies, AMH International, vol. 2(2), pages 57-66.
    4. Javier Palarea-Albaladejo & Josep Martín-Fernández & Jesús Soto, 2012. "Dealing with Distances and Transformations for Fuzzy C-Means Clustering of Compositional Data," Journal of Classification, Springer;The Classification Society, vol. 29(2), pages 144-169, July.
    5. Anna Maria Fiori & Francesco Porro, 2023. "A compositional analysis of systemic risk in European financial institutions," Annals of Finance, Springer, vol. 19(3), pages 325-354, September.
    6. Germà Coenders & Núria Arimany Serrat, 2023. "Accounting statement analysis at industry level. A gentle introduction to the compositional approach," Papers 2305.16842, arXiv.org, revised Feb 2024.
    7. Greenacre, Michael, 2009. "Power transformations in correspondence analysis," Computational Statistics & Data Analysis, Elsevier, vol. 53(8), pages 3107-3116, June.
    8. Michael Greenacre, 2009. "Contribution biplots," Economics Working Papers 1162, Department of Economics and Business, Universitat Pompeu Fabra, revised Jan 2011.
    9. Michael Greenacre, 2023. "The chi-square standardization, combined with Box-Cox transformation, is a valid alternative to transforming to logratios in compositional data analysis," Economics Working Papers 1857, Department of Economics and Business, Universitat Pompeu Fabra.
    10. Huiwen Wang & Liying Shangguan & Rong Guan & Lynne Billard, 2015. "Principal component analysis for compositional data vectors," Computational Statistics, Springer, vol. 30(4), pages 1079-1096, December.
    11. repec:jss:jstsof:13:i05 is not listed on IDEAS
    12. Jan Skála & Radim Vácha & Pavel Čupr, 2018. "Which Compounds Contribute Most to Elevated Soil Pollution and the Corresponding Health Risks in Floodplains in the Headwater Areas of the Central European Watershed?," IJERPH, MDPI, vol. 15(6), pages 1-16, June.
    13. Maria Anna Di Palma & Michele Gallo, 2019. "External Information Model in a Compositional Perspective: Evaluation of Campania Adolescents’ Preferences in the Allocation of Leisure-Time," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 146(1), pages 117-133, November.
    14. Cuadras, Carles M. & Greenacre, Michael, 2022. "A short history of statistical association: From correlation to correspondence analysis to copulas," Journal of Multivariate Analysis, Elsevier, vol. 188(C).
    15. Karel Hron & Paula Brito & Peter Filzmoser, 2017. "Exploratory data analysis for interval compositional data," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 11(2), pages 223-241, June.
    16. Alvis Cabrera & Lyvia Biagi & Aleix Beneyto & Ernesto Estremera & Iván Contreras & Marga Giménez & Ignacio Conget & Jorge Bondia & Josep Antoni Martín-Fernández & Josep Vehí, 2023. "Validation of a Probabilistic Prediction Model for Patients with Type 1 Diabetes Using Compositional Data Analysis," Mathematics, MDPI, vol. 11(5), pages 1-17, March.
    17. Thinh Nguyen Van & Akinori Ozaki & Hoang Nguyen Tho & Anh Nguyen Duc & Yen Tran Thi & Kiyoshi Kurosawa, 2016. "Arsenic and Heavy Metal Contamination in Soils under Different Land Use in an Estuary in Northern Vietnam," IJERPH, MDPI, vol. 13(11), pages 1-13, November.
    18. Rozkrut Dominik, 2014. "Measuring Eco-Innovation: Towards Better Policies to Support Green Growth," Folia Oeconomica Stetinensia, Sciendo, vol. 14(1), pages 1-12, June.
    19. Jan Graffelman, 2005. "Enriched biplots for canonical correlation analysis," Journal of Applied Statistics, Taylor & Francis Journals, vol. 32(2), pages 173-188.
    20. Sergio Scippacercola & Zaccaria Petrillo & Annarita Mangiacapra & Stefano Caliro, 2019. "Multivariate approach to evaluate the relationship among geophysical and geochemical variables during an unrest period at Campi Flegrei caldera (Italy)," Quality & Quantity: International Journal of Methodology, Springer, vol. 53(5), pages 2473-2489, September.
    21. Pittelkow Yvonne E & Wilson Susan R, 2003. "Visualisation of Gene Expression Data - the GE-biplot, the Chip-plot and the Gene-plot," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 2(1), pages 1-19, September.

    More about this item

    Keywords

    Association models; biplot; compositional data; contingency tables; correspondence analysis; distributional equivalence; log-ratio transformation; ratio-scale data; singular value decomposition

    JEL classification:

    • C19 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Other
    • C88 - Mathematical and Quantitative Methods - - Data Collection and Data Estimation Methodology; Computer Programs - - - Other Computer Software
