
Regression analysis with compositional data containing zero values

Author

Listed:
  • Tsagris, Michail

Abstract

Regression analysis, for prediction purposes, with compositional data is the subject of this paper. We examine both cases when compositional data are either response or predictor variables. A parametric model is assumed but the interest lies in the accuracy of the predicted values. For this reason, a data based power transformation is employed in both cases and the results are compared with the standard log-ratio approach. There are some interesting results and one advantage of the methods proposed here is the handling of the zero values.
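
To illustrate why a power transformation can accommodate zero values where a log-ratio transformation cannot, here is a minimal sketch. It assumes a simplified form of the α-transformation described in the Tsagris, Preston and Wood reference listed below: each part is raised to a power α > 0, the result is re-closed to sum to one, and then centred and scaled by 1/α. The Helmert rotation used in that reference is omitted, and the function names `power_transform` and `clr` are hypothetical, not taken from the paper.

```python
import math


def power_transform(x, alpha):
    """Power-transform a composition x (non-negative parts summing to 1).

    Returns (D * u - 1) / alpha, where u_i = x_i**alpha / sum_j x_j**alpha.
    Assumes alpha > 0, so zero parts are well defined (0**alpha == 0).
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive in this sketch")
    d = len(x)
    powered = [xi ** alpha for xi in x]
    total = sum(powered)
    return [(d * p / total - 1.0) / alpha for p in powered]


def clr(x):
    """Centred log-ratio transform; undefined when any part is zero."""
    if any(xi <= 0 for xi in x):
        raise ValueError("log-ratio transform needs strictly positive parts")
    logs = [math.log(xi) for xi in x]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# A composition with a zero part: the power transform is still defined.
with_zero = [0.5, 0.5, 0.0]
z = power_transform(with_zero, alpha=0.5)

# For strictly positive parts and alpha close to 0, the power transform
# approaches the centred log-ratio transform.
positive = [0.2, 0.3, 0.5]
near_clr = power_transform(positive, alpha=1e-6)
```

In this sketch, `clr(with_zero)` raises a `ValueError`, whereas `power_transform(with_zero, alpha=0.5)` returns finite values. In a prediction setting, α would be chosen from the data, for example by cross-validation; that step is not shown here.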

Suggested Citation

  • Tsagris, Michail, 2015. "Regression analysis with compositional data containing zero values," MPRA Paper 67868, University Library of Munich, Germany.
  • Handle: RePEc:pra:mprapa:67868

    Download full text from publisher

    File URL: https://mpra.ub.uni-muenchen.de/67868/1/MPRA_paper_67868.pdf
    File Function: original version
    Download Restriction: no

    References listed on IDEAS

    1. Martín-Fernández, J.A. & Hron, K. & Templ, M. & Filzmoser, P. & Palarea-Albaladejo, J., 2012. "Model-based replacement of rounded zeros in compositional data: Classical and robust approaches," Computational Statistics & Data Analysis, Elsevier, vol. 56(9), pages 2688-2704.
    2. Tsagris, Michail T. & Preston, Simon & Wood, Andrew T.A., 2011. "A data-based power transformation for compositional data," MPRA Paper 53068, University Library of Munich, Germany.
    3. K. Hron & P. Filzmoser & K. Thompson, 2012. "Linear regression with compositional explanatory variables," Journal of Applied Statistics, Taylor & Francis Journals, vol. 39(5), pages 1115-1128, November.
    4. J. L. Scealy & A. H. Welsh, 2011. "Regression for compositional data by using distributions defined on the hypersphere," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 73(3), pages 351-375, June.
    5. Gueorguieva, Ralitza & Rosenheck, Robert & Zelterman, Daniel, 2008. "Dirichlet component regression and its applications to psychiatric data," Computational Statistics & Data Analysis, Elsevier, vol. 52(12), pages 5344-5355, August.
    6. J. Aitchison & I. J. Lauder, 1985. "Kernel Density Estimation for Compositional Data," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 34(2), pages 129-137, June.

    Citations


    Cited by:

    1. Jacob Fiksel & Scott Zeger & Abhirup Datta, 2022. "A transformation‐free linear regression for compositional outcomes and predictors," Biometrics, The International Biometric Society, vol. 78(3), pages 974-987, September.
    2. Matt Kammer-Kerwick & Kara Takasaki & J. Bruce Kellison & Jeff Sternberg, 2022. "Asset-Based, Sustainable Local Economic Development: Using Community Participation to Improve Quality of Life Across Rural, Small-Town, and Urban Communities," Applied Research in Quality of Life, Springer;International Society for Quality-of-Life Studies, vol. 17(5), pages 3023-3047, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Tsagris, Michail & Preston, Simon & Wood, Andrew T.A., 2016. "Improved classification for compositional data using the α-transformation," MPRA Paper 67657, University Library of Munich, Germany.
    2. Michail Tsagris & Simon Preston & Andrew T. A. Wood, 2016. "Improved Classification for Compositional Data Using the α-transformation," Journal of Classification, Springer;The Classification Society, vol. 33(2), pages 243-261, July.
    3. Juan José Egozcue & Vera Pawlowsky-Glahn, 2019. "Compositional data: the sample space and its structure," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 28(3), pages 599-638, September.
    4. Jiajia Chen & Xiaoqin Zhang & Shengjia Li, 2017. "Multiple linear regression with compositional response and covariates," Journal of Applied Statistics, Taylor & Francis Journals, vol. 44(12), pages 2270-2285, September.
    5. Monique Graf, 2020. "Regression for compositions based on a generalization of the Dirichlet distribution," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 29(4), pages 913-936, December.
    6. Tsagris, Michail, 2014. "The k-NN algorithm for compositional data: a revised approach with and without zero values present," MPRA Paper 65866, University Library of Munich, Germany.
    7. Tsagris, Michail, 2015. "A novel, divergence based, regression for compositional data," MPRA Paper 72769, University Library of Munich, Germany.
    8. M. Templ & K. Hron & P. Filzmoser, 2017. "Exploratory tools for outlier detection in compositional data with structural zeros," Journal of Applied Statistics, Taylor & Francis Journals, vol. 44(4), pages 734-752, March.
    9. Ouimet, Frédéric & Tolosana-Delgado, Raimon, 2022. "Asymptotic properties of Dirichlet kernel density estimators," Journal of Multivariate Analysis, Elsevier, vol. 187(C).
    10. Theophilus K. Agbenyezi & Gordon Foli & Simon K. Y. Gawu, 2020. "Geochemical Characteristics Of Gold-Bearing Granitoids At Ayanfuri In The Kumasi Basin, Southwestern Ghana: Implications For The Orogenic Related Gold Systems," Earth Sciences Malaysia (ESMY), Zibeline International Publishing, vol. 4(2), pages 127-134, June.
    11. J. A. Martín-Fernández, 2021. "“Compositional Data Analysis in Practice” by Michael Greenacre Universitat Pompeu Fabra (Barcelona, Spain), Chapman and Hall/CRC, 2018," Journal of Classification, Springer;The Classification Society, vol. 38(1), pages 109-111, April.
    12. Germà Coenders & Núria Arimany Serrat, 2023. "Accounting statement analysis at industry level. A gentle introduction to the compositional approach," Papers 2305.16842, arXiv.org, revised Feb 2024.
    13. Napoleón Vargas Jurado & Kent M. Eskridge & Stephen D. Kachman & Ronald M. Lewis, 2018. "Using a Bayesian Hierarchical Linear Mixing Model to Estimate Botanical Mixtures," Journal of Agricultural, Biological and Environmental Statistics, Springer;The International Biometric Society;American Statistical Association, vol. 23(2), pages 190-207, June.
    14. Melo, Tatiane F.N. & Vasconcellos, Klaus L.P. & Lemonte, Artur J., 2009. "Some restriction tests in a new class of regression models for proportions," Computational Statistics & Data Analysis, Elsevier, vol. 53(12), pages 3972-3979, October.
    15. Biyun Guo & Taiping Xie & M.V. Subrahmanyam, 2019. "The Impact of China’s Grain for Green Program on Rural Economy and Precipitation: A Case Study of Yan River Basin in the Loess Plateau," Sustainability, MDPI, vol. 11(19), pages 1-18, September.
    16. Thomas-Agnan, Christine & Morais, Joanna, 2019. "Covariates impacts in compositional models and simplicial derivatives," TSE Working Papers 19-1057, Toulouse School of Economics (TSE).
    17. Frédéric Ouimet, 2021. "General Formulas for the Central and Non-Central Moments of the Multinomial Distribution," Stats, MDPI, vol. 4(1), pages 1-10, January.
    18. Morais, Joanna & Simioni, Michel & Thomas-Agnan, Christine, 2016. "A tour of regression models for explaining shares," TSE Working Papers 16-742, Toulouse School of Economics (TSE).
    19. Angelo Moretti, 2023. "Estimation of small area proportions under a bivariate logistic mixed model," Quality & Quantity: International Journal of Methodology, Springer, vol. 57(4), pages 3663-3684, August.
    20. Takahiro Yoshida & Morito Tsutsumi, 2018. "On the effects of spatial relationships in spatial compositional multivariate models," Letters in Spatial and Resource Sciences, Springer, vol. 11(1), pages 57-70, March.

    More about this item

    Keywords

    Compositional data; regression; prediction; α-transformation; principal component regression;

    JEL classification:

    • C18 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Methodological Issues: General
