Printed from https://ideas.repec.org/p/pra/mprapa/67868.html

Regression analysis with compositional data containing zero values

Author

Listed:
  • Tsagris, Michail

Abstract

Regression analysis, for prediction purposes, with compositional data is the subject of this paper. We examine both cases when compositional data are either response or predictor variables. A parametric model is assumed but the interest lies in the accuracy of the predicted values. For this reason, a data based power transformation is employed in both cases and the results are compared with the standard log-ratio approach. There are some interesting results and one advantage of the methods proposed here is the handling of the zero values.
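
The abstract does not state the power transformation itself. As a minimal sketch, assuming it takes the form of the α-transformation (power-transform the components, re-close to sum to one, centre and scale by 1/α, then project with a Helmert sub-matrix), the contrast with the log-ratio approach can be illustrated as follows. The function names `helmert_submatrix` and `alpha_transform` are hypothetical and chosen for illustration only; this is not the paper's own code.

```python
import numpy as np

def helmert_submatrix(D):
    """(D-1) x D Helmert sub-matrix: orthonormal rows, each orthogonal to the vector of ones."""
    H = np.zeros((D - 1, D))
    for i in range(1, D):
        H[i - 1, :i] = 1.0 / np.sqrt(i * (i + 1))
        H[i - 1, i] = -i / np.sqrt(i * (i + 1))
    return H

def alpha_transform(x, alpha):
    """Assumed form of the power transformation for compositions (rows of x sum to 1).

    alpha == 0 falls back to the isometric log-ratio transformation, which
    requires strictly positive components. For alpha > 0 zeros are allowed,
    because 0 ** alpha == 0 and no logarithm is taken.
    """
    x = np.asarray(x, dtype=float)
    D = x.shape[-1]
    H = helmert_submatrix(D)
    if alpha == 0:
        logx = np.log(x)
        clr = logx - logx.mean(axis=-1, keepdims=True)  # centred log-ratio
        return clr @ H.T
    u = x ** alpha
    u = u / u.sum(axis=-1, keepdims=True)  # power-transformed composition
    return ((D * u - 1.0) / alpha) @ H.T

# A composition with a zero component: the log-ratio route would need log(0),
# while the power transformation with alpha > 0 stays finite.
z = alpha_transform([0.5, 0.3, 0.2, 0.0], alpha=0.5)
print(z)
```

Under this assumed form, the transformed values live in an unconstrained (D-1)-dimensional space, so a standard multivariate regression could be fitted to them, with α picked from the data by comparing predictive accuracy over a grid of values.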

Suggested Citation

  • Tsagris, Michail, 2015. "Regression analysis with compositional data containing zero values," MPRA Paper 67868, University Library of Munich, Germany.
  • Handle: RePEc:pra:mprapa:67868

    Download full text from publisher

    File URL: https://mpra.ub.uni-muenchen.de/67868/1/MPRA_paper_67868.pdf
    File Function: original version
    Download Restriction: no

    References listed on IDEAS

    1. J. Aitchison & I. J. Lauder, 1985. "Kernel Density Estimation for Compositional Data," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 34(2), pages 129-137, June.
    2. Martín-Fernández, J.A. & Hron, K. & Templ, M. & Filzmoser, P. & Palarea-Albaladejo, J., 2012. "Model-based replacement of rounded zeros in compositional data: Classical and robust approaches," Computational Statistics & Data Analysis, Elsevier, vol. 56(9), pages 2688-2704.
    3. Tsagris, Michail T. & Preston, Simon & Wood, Andrew T.A., 2011. "A data-based power transformation for compositional data," MPRA Paper 53068, University Library of Munich, Germany.
    4. K. Hron & P. Filzmoser & K. Thompson, 2012. "Linear regression with compositional explanatory variables," Journal of Applied Statistics, Taylor & Francis Journals, vol. 39(5), pages 1115-1128, November.
    5. J. L. Scealy & A. H. Welsh, 2011. "Regression for compositional data by using distributions defined on the hypersphere," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 73(3), pages 351-375, June.
    6. Gueorguieva, Ralitza & Rosenheck, Robert & Zelterman, Daniel, 2008. "Dirichlet component regression and its applications to psychiatric data," Computational Statistics & Data Analysis, Elsevier, vol. 52(12), pages 5344-5355, August.

    Citations

    Cited by:

    1. Jacob Fiksel & Scott Zeger & Abhirup Datta, 2022. "A transformation‐free linear regression for compositional outcomes and predictors," Biometrics, The International Biometric Society, vol. 78(3), pages 974-987, September.
    2. Matt Kammer-Kerwick & Kara Takasaki & J. Bruce Kellison & Jeff Sternberg, 2022. "Asset-Based, Sustainable Local Economic Development: Using Community Participation to Improve Quality of Life Across Rural, Small-Town, and Urban Communities," Applied Research in Quality of Life, Springer;International Society for Quality-of-Life Studies, vol. 17(5), pages 3023-3047, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Tsagris, Michail & Preston, Simon & Wood, Andrew T.A., 2016. "Improved classification for compositional data using the $\alpha$-transformation," MPRA Paper 67657, University Library of Munich, Germany.
    2. Michail Tsagris & Simon Preston & Andrew T. A. Wood, 2016. "Improved Classification for Compositional Data Using the α-transformation," Journal of Classification, Springer;The Classification Society, vol. 33(2), pages 243-261, July.
    3. Juan José Egozcue & Vera Pawlowsky-Glahn, 2019. "Compositional data: the sample space and its structure," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 28(3), pages 599-638, September.
    4. Jiajia Chen & Xiaoqin Zhang & Shengjia Li, 2017. "Multiple linear regression with compositional response and covariates," Journal of Applied Statistics, Taylor & Francis Journals, vol. 44(12), pages 2270-2285, September.
    5. Tsagris, Michail, 2014. "The k-NN algorithm for compositional data: a revised approach with and without zero values present," MPRA Paper 65866, University Library of Munich, Germany.
    6. Tsagris, Michail, 2015. "A novel, divergence based, regression for compositional data," MPRA Paper 72769, University Library of Munich, Germany.
    7. M. Templ & K. Hron & P. Filzmoser, 2017. "Exploratory tools for outlier detection in compositional data with structural zeros," Journal of Applied Statistics, Taylor & Francis Journals, vol. 44(4), pages 734-752, March.
    8. Yoon, Changwon & Choi, Hyunbin & Ahn, Jeongyoun, 2025. "Kernel density estimation for compositional data with zeros via hypersphere mapping," Computational Statistics & Data Analysis, Elsevier, vol. 212(C).
    9. Monique Graf, 2020. "Regression for compositions based on a generalization of the Dirichlet distribution," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 29(4), pages 913-936, December.
    10. Nikola Štefelová & Andreas Alfons & Javier Palarea-Albaladejo & Peter Filzmoser & Karel Hron, 2021. "Robust regression with compositional covariates including cellwise outliers," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 15(4), pages 869-909, December.
    11. Ouimet, Frédéric & Tolosana-Delgado, Raimon, 2022. "Asymptotic properties of Dirichlet kernel density estimators," Journal of Multivariate Analysis, Elsevier, vol. 187(C).
    12. Mauricio Velasquez, 2016. "Compositions vs Gini: A new metric to evaluate the effects of land-income disparities," 2016 Papers pve364, Job Market Papers.
    13. Janina Janurek & Sascha Abdel Hadi & Andreas Mojzisch & Jan Alexander Häusser, 2018. "The Association of the 24 Hour Distribution of Time Spent in Physical Activity, Work, and Sleep with Emotional Exhaustion," IJERPH, MDPI, vol. 15(9), pages 1-14, September.
    14. Xiongtao Dai & Zhenhua Lin & Hans‐Georg Müller, 2021. "Modeling sparse longitudinal data on Riemannian manifolds," Biometrics, The International Biometric Society, vol. 77(4), pages 1328-1341, December.
    15. Theophilus K. Agbenyezi & Gordon Foli & Simon K. Y. Gawu, 2020. "Geochemical Characteristics Of Gold-Bearing Granitoids At Ayanfuri In The Kumasi Basin, Southwestern Ghana: Implications For The Orogenic Related Gold Systems," Earth Sciences Malaysia (ESMY), Zibeline International Publishing, vol. 4(2), pages 127-134, June.
    16. J. A. Martín-Fernández, 2021. "“Compositional Data Analysis in Practice” by Michael Greenacre Universitat Pompeu Fabra (Barcelona, Spain), Chapman and Hall/CRC, 2018," Journal of Classification, Springer;The Classification Society, vol. 38(1), pages 109-111, April.
    17. Andriansyah, Andriansyah & Messinis, George, 2016. "Intended use of IPO proceeds and firm performance: A quantile regression approach," Pacific-Basin Finance Journal, Elsevier, vol. 36(C), pages 14-30.
    18. Germà Coenders & Núria Arimany Serrat, 2023. "Accounting statement analysis at industry level. A gentle introduction to the compositional approach," Papers 2305.16842, arXiv.org, revised Jun 2025.
    19. Napoleón Vargas Jurado & Kent M. Eskridge & Stephen D. Kachman & Ronald M. Lewis, 2018. "Using a Bayesian Hierarchical Linear Mixing Model to Estimate Botanical Mixtures," Journal of Agricultural, Biological and Environmental Statistics, Springer;The International Biometric Society;American Statistical Association, vol. 23(2), pages 190-207, June.
    20. Christian Genest & Frédéric Ouimet, 2025. "Local linear smoothing for regression surfaces on the simplex using Dirichlet kernels," Statistical Papers, Springer, vol. 66(4), pages 1-28, June.

    More about this item

    JEL classification:

    • C18 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Methodological Issues: General

