Printed from https://ideas.repec.org/p/upf/upfgen/1551.html

Selection and statistical analysis of compositional ratios

Author

  • Michael Greenacre

Abstract

Compositional data are nonnegative data with the property of closure: the values of the components of each observation, the so-called parts, have a fixed sum, usually 1 or 100%. Compositional data cannot be analyzed by conventional statistical methods, since the value of any part depends on which other parts are included in the composition of interest. For example, reporting the mean and standard deviation of a specific part makes no sense, nor does reporting the correlation between two parts. I propose that a small set of ratios of parts can be determined, either by expert choice or by automatic selection, that effectively replaces the compositional data set. This set can be chosen to explain 100% of the variance in the compositional data, or as close to 100% as required. These part ratios can then be validly summarized and analyzed by conventional univariate methods, as well as by multivariate methods, where the ratios are preferably log-transformed.
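
The dependence on the choice of parts can be illustrated with a small sketch. It is not taken from the paper: the five 4-part samples are made-up numbers, and the helper names `close` and `pearson` are hypothetical. It compares the correlation between two parts before and after dropping a third part and re-closing, checks that the log-ratio of the same two parts is unaffected, and checks that a pairwise log-ratio is a difference of two log-ratios sharing a reference part, which is one way to see why a small set of ratios can carry all the ratio information.

```python
import math


def close(parts):
    """Rescale nonnegative parts so that they sum to 1 (closure)."""
    total = sum(parts)
    return [p / total for p in parts]


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical 4-part compositions; the raw values are invented for illustration.
raw = [
    [10, 20, 30, 40],
    [15, 25, 10, 50],
    [30, 10, 40, 20],
    [25, 35, 20, 20],
    [5, 45, 25, 25],
]
samples = [close(row) for row in raw]

# Subcomposition: drop the fourth part, then re-close the remaining three.
subs = [close(row[:3]) for row in samples]

# The correlation between parts 1 and 2 changes with the choice of parts.
r_full = pearson([s[0] for s in samples], [s[1] for s in samples])
r_sub = pearson([s[0] for s in subs], [s[1] for s in subs])

# The log-ratio of parts 1 and 2 is the same in both versions.
logratio_full = [math.log(s[0] / s[1]) for s in samples]
logratio_sub = [math.log(s[0] / s[1]) for s in subs]

# A pairwise log-ratio is the difference of two log-ratios against part 4,
# so it adds no information beyond those two.
alr_gap = [
    math.log(s[0] / s[1]) - (math.log(s[0] / s[3]) - math.log(s[1] / s[3]))
    for s in samples
]

print("correlation, full composition:", round(r_full, 3))
print("correlation, subcomposition:  ", round(r_sub, 3))
print("largest log-ratio difference: ",
      max(abs(a - b) for a, b in zip(logratio_full, logratio_sub)))
print("largest reconstruction gap:   ", max(abs(g) for g in alr_gap))
```

The two correlations differ even though the underlying measurements are identical, whereas the log-ratios agree up to floating-point rounding. How the paper itself selects the set of ratios is not shown here.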

Suggested Citation

  • Michael Greenacre, 2016. "Selection and statistical analysis of compositional ratios," Economics Working Papers 1551, Department of Economics and Business, Universitat Pompeu Fabra.
  • Handle: RePEc:upf:upfgen:1551

    Download full text from publisher

    File URL: https://econ-papers.upf.edu/papers/1551.pdf
    File Function: Whole Paper
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Michael Greenacre & Paul Lewi, 2009. "Distributional Equivalence and Subcompositional Coherence in the Analysis of Compositional Data, Contingency Tables and Ratio-Scale Measurements," Journal of Classification, Springer;The Classification Society, vol. 26(1), pages 29-54, April.
    2. Yvonne Pittelkow & Susan R. Wilson, 2005. "Use of Principal Component Analysis and the GE-Biplot for the Graphical Exploration of Gene Expression Data," Biometrics, The International Biometric Society, vol. 61(2), pages 630-632, June.
    3. B. Baris Alkan & Afsin Sahin, 2011. "Measuring inequalities in the distribution of health workers by bi-plot approach: The case of Turkey," Journal of Economics and Behavioral Studies, AMH International, vol. 2(2), pages 57-66.
    4. Giovanni C. Porzio & Giancarlo Ragozini & Domenico Vistocco, 2008. "On the use of archetypes as benchmarks," Applied Stochastic Models in Business and Industry, John Wiley & Sons, vol. 24(5), pages 419-437, September.
    5. Javier Palarea-Albaladejo & Josep Martín-Fernández & Jesús Soto, 2012. "Dealing with Distances and Transformations for Fuzzy C-Means Clustering of Compositional Data," Journal of Classification, Springer;The Classification Society, vol. 29(2), pages 144-169, July.
    6. Michael Greenacre & Paul Lewi, 2005. "Distributional equivalence and subcompositional coherence in the analysis of contingency tables, ratio-scale measurements and compositional data," Economics Working Papers 908, Department of Economics and Business, Universitat Pompeu Fabra, revised Aug 2007.
    7. Anna Maria Fiori & Francesco Porro, 2023. "A compositional analysis of systemic risk in European financial institutions," Annals of Finance, Springer, vol. 19(3), pages 325-354, September.
    8. Germà Coenders & Núria Arimany Serrat, 2023. "Accounting statement analysis at industry level. A gentle introduction to the compositional approach," Papers 2305.16842, arXiv.org, revised Sep 2024.
    9. Juan José Egozcue & Vera Pawlowsky-Glahn, 2019. "Rejoinder on: Compositional data: the sample space and its structure," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 28(3), pages 658-663, September.
    10. Marco Taussi & Caterina Gozzi & Orlando Vaselli & Jacopo Cabassi & Matia Menichini & Marco Doveri & Marco Romei & Alfredo Ferretti & Alma Gambioli & Barbara Nisi, 2022. "Contamination Assessment and Temporal Evolution of Nitrates in the Shallow Aquifer of the Metauro River Plain (Adriatic Sea, Italy) after Remediation Actions," IJERPH, MDPI, vol. 19(19), pages 1-24, September.
    11. Siham Zaaboubi & Lotfi Khiari & Salah Abdesselam & Jacques Gallichand & Fassil Kebede & Ghouati Kerrache, 2020. "Particle Size Imbalance Index from Compositional Analysis to Evaluate Cereal Sustainability for Arid Soils in Eastern Algeria," Agriculture, MDPI, vol. 10(7), pages 1-10, July.
    12. Gardner-Lubbe, Sugnet, 2016. "A triplot for multiclass classification visualisation," Computational Statistics & Data Analysis, Elsevier, vol. 94(C), pages 20-32.
    13. Michael Greenacre & Rafael Pardo, 2006. "Subset Correspondence Analysis," Sociological Methods & Research, vol. 35(2), pages 193-218, November.
    14. Greenacre, Michael, 2009. "Power transformations in correspondence analysis," Computational Statistics & Data Analysis, Elsevier, vol. 53(8), pages 3107-3116, June.
    15. Michael Greenacre, 2009. "Contribution biplots," Economics Working Papers 1162, Department of Economics and Business, Universitat Pompeu Fabra, revised Jan 2011.
    16. Michael Greenacre, 2023. "The chi-square standardization, combined with Box-Cox transformation, is a valid alternative to transforming to logratios in compositional data analysis," Economics Working Papers 1857, Department of Economics and Business, Universitat Pompeu Fabra.
    17. Huiwen Wang & Liying Shangguan & Rong Guan & Lynne Billard, 2015. "Principal component analysis for compositional data vectors," Computational Statistics, Springer, vol. 30(4), pages 1079-1096, December.
    18. Michael Greenacre, 2006. "Tying up the loose ends in simple correspondence analysis," Economics Working Papers 940, Department of Economics and Business, Universitat Pompeu Fabra.
    19. repec:jss:jstsof:13:i05 is not listed on IDEAS
    20. Jan Skála & Radim Vácha & Pavel Čupr, 2018. "Which Compounds Contribute Most to Elevated Soil Pollution and the Corresponding Health Risks in Floodplains in the Headwater Areas of the Central European Watershed?," IJERPH, MDPI, vol. 15(6), pages 1-16, June.
    21. Maria Anna Di Palma & Michele Gallo, 2019. "External Information Model in a Compositional Perspective: Evaluation of Campania Adolescents’ Preferences in the Allocation of Leisure-Time," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 146(1), pages 117-133, November.

    More about this item

    Keywords

    compositional data; logarithmic transformation; log-ratio analysis; multivariate analysis; ratios; univariate statistics

    NEP fields

    This paper has been announced in the following NEP Reports:

    Statistics
