
Data-driven transformations and survey-weighting for linear mixed models

Author

  • Patricia Dörr
  • Jan Pablo Burgard

Abstract

Many variables that social and economic researchers seek to analyze through regression analysis violate normality assumptions. A standard remedy is the logarithmic transformation. However, taking logarithms is not always sufficient to re-establish the model assumptions. A more general approach is to specify a family of transformations and to estimate the appropriate transformation parameter from the data. This can also be done in mixed effects models, which account for unobserved heterogeneity in grouped data. When the analyzed data are gathered from a complex survey whose design is informative for the model, which is difficult to rule out a priori, the estimates of the transformed linear mixed model can be biased. As the bias also affects the transformation parameter, the distortion of the population parameter estimates is even more problematic than in standard regression. In standard regression, survey weights are used to account for the design. To the best of our knowledge, none of the existing algorithms allows survey weights to be included in these transformed linear mixed models. This paper adapts a recently suggested algorithm to include survey weights in Box-Cox or dual transformed mixed models. A simulation study demonstrates the need to account for informative survey design.
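
To make the idea of a data-driven transformation with survey weights concrete, the following is a minimal sketch and not the algorithm of the paper. It assumes a plain linear model without random effects, uses the textbook Box-Cox form and the dual power form as commonly attributed to Yang (2006), and plugs the survey weights into a weighted pseudo log-likelihood that is profiled over a grid of transformation parameters. All function names (box_cox, dual, weighted_profile_loglik, estimate_lambda) and the grid bounds are hypothetical choices made for illustration.

```python
import numpy as np


def box_cox(y, lam):
    """Box-Cox transform: (y^lam - 1) / lam, with log(y) as the limit at lam = 0."""
    y = np.asarray(y, dtype=float)
    if abs(lam) < 1e-8:
        return np.log(y)
    return (y**lam - 1.0) / lam


def dual(y, lam):
    """Dual power transform: (y^lam - y^(-lam)) / (2 lam), log(y) at lam = 0."""
    y = np.asarray(y, dtype=float)
    if abs(lam) < 1e-8:
        return np.log(y)
    return (y**lam - y ** (-lam)) / (2.0 * lam)


def weighted_profile_loglik(lam, y, X, w):
    """Survey-weighted profile log-likelihood of lam (up to a constant).

    Assumption: fixed effects only. The paper treats mixed models; the
    random effects are deliberately left out of this sketch.
    """
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    z = box_cox(y, lam)
    sw = np.sqrt(w)
    # Weighted least squares for the regression coefficients given lam.
    beta, *_ = np.linalg.lstsq(X * sw[:, None], z * sw, rcond=None)
    resid = z - X @ beta
    sigma2 = np.sum(w * resid**2) / np.sum(w)
    # Weighted Gaussian term plus the weighted Jacobian of the transform.
    return -0.5 * np.sum(w) * np.log(sigma2) + (lam - 1.0) * np.sum(w * np.log(y))


def estimate_lambda(y, X, w, grid=None):
    """Grid search for the lam that maximizes the weighted profile log-likelihood."""
    if grid is None:
        grid = np.linspace(-1.0, 2.0, 301)  # hypothetical search range
    values = [weighted_profile_loglik(lam, y, X, w) for lam in grid]
    return float(grid[int(np.argmax(values))])


if __name__ == "__main__":
    # Simulated log-normal response, so the transformation parameter is 0.
    rng = np.random.default_rng(1)
    n = 1000
    x = rng.normal(size=n)
    X = np.column_stack([np.ones(n), x])
    y = np.exp(1.0 + 0.5 * x + 0.3 * rng.normal(size=n))
    w = rng.uniform(1.0, 3.0, size=n)  # made-up, non-informative weights
    print(estimate_lambda(y, X, w))
```

In this sketch the weights enter both the residual variance and the Jacobian term, so an informative design would shift the estimated transformation parameter as well as the regression coefficients, which is the effect the abstract describes.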

Suggested Citation

  • Patricia Dörr & Jan Pablo Burgard, 2019. "Data-driven transformations and survey-weighting for linear mixed models," Research Papers in Economics 2019-16, University of Trier, Department of Economics.
  • Handle: RePEc:trr:wpaper:201916

    Download full text from publisher

    File URL: http://www.uni-trier.de/fileadmin/fb4/prof/VWL/EWF/Research_Papers/2019-16.pdf
    File Function: First version, 2019
    Download Restriction: no

    References listed on IDEAS

    1. Sophia Rabe‐Hesketh & Anders Skrondal, 2006. "Multilevel modelling of complex survey data," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 169(4), pages 805-827, October.
    2. Spitzer, John J, 1982. "A Primer on Box-Cox Estimation," The Review of Economics and Statistics, MIT Press, vol. 64(2), pages 307-313, May.
    3. J. G. Booth & J. P. Hobert, 1999. "Maximizing generalized linear mixed model likelihoods with an automated Monte Carlo EM algorithm," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 61(1), pages 265-285.
    4. Yang, Zhenlin, 2006. "A modified family of power transformations," Economics Letters, Elsevier, vol. 92(1), pages 14-19, July.
    5. D. Pfeffermann & C. J. Skinner & D. J. Holmes & H. Goldstein & J. Rasbash, 1998. "Weighting for unequal selection probabilities in multilevel models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 60(1), pages 23-40.
    6. Matthew J. Gurka & Lloyd J. Edwards & Keith E. Muller & Lawrence L. Kupper, 2006. "Extending the Box–Cox transformation to the linear mixed model," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 169(2), pages 273-288, March.
    7. R. Sakia, 1990. "Retransformation bias: A look at the Box-Cox transformation to linear balanced mixed ANOVA models," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 37(1), pages 345-351, December.
    8. Jan Pablo Burgard & Patricia Dörr, 2018. "Survey-weighted Generalized Linear Mixed Models," Research Papers in Economics 2018-01, University of Trier, Department of Economics.
    9. Shonosuke Sugasawa & Tatsuya Kubokawa, 2015. "Box-Cox Transformed Linear Mixed Models for Positive-Valued and Clustered Data," CIRJE F-Series CIRJE-F-957, CIRJE, Faculty of Economics, University of Tokyo.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Huapeng Li & Yukun Liu & Riquan Zhang, 2019. "Small area estimation under transformed nested-error regression models," Statistical Papers, Springer, vol. 60(4), pages 1397-1418, August.
    2. Nora Würz & Timo Schmid & Nikos Tzavidis, 2022. "Estimating regional income indicators under transformations and access to limited population auxiliary information," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 185(4), pages 1679-1706, October.
    3. Jan Pablo Burgard & Patricia Dörr, 2018. "Survey-weighted Generalized Linear Mixed Models," Research Papers in Economics 2018-01, University of Trier, Department of Economics.
    4. Jan Pablo Burgard & Patricia Dörr & Ralf Münnich, 2020. "Monte-Carlo Simulation Studies in Survey Statistics – An Appraisal," Research Papers in Economics 2020-04, University of Trier, Department of Economics.
    5. Woojin Chung & Roeul Kim, 2020. "A Reversal of the Association between Education Level and Obesity Risk during Ageing: A Gender-Specific Longitudinal Study in South Korea," IJERPH, MDPI, vol. 17(18), pages 1-19, September.
    6. Joseph L Dieleman & Tara Templin, 2014. "Random-Effects, Fixed-Effects and the within-between Specification for Clustered Data in Observational Health Studies: A Simulation Study," PLOS ONE, Public Library of Science, vol. 9(10), pages 1-17, October.
    7. Woojin Chung & Roeul Kim, 2020. "Differential Risk of Cognitive Impairment across Paid and Unpaid Occupations in the Middle-Age Population: Evidence from the Korean Longitudinal Study of Aging, 2006–2016," IJERPH, MDPI, vol. 17(9), pages 1-14, April.
    8. Laura M. Stapleton & Yoonjeong Kang, 2018. "Design Effects of Multilevel Estimates From National Probability Samples," Sociological Methods & Research, vol. 47(3), pages 430-457, August.
    9. Woojin Chung & Roeul Kim, 2020. "Which Occupation is Highly Associated with Cognitive Impairment? A Gender-Specific Longitudinal Study of Paid and Unpaid Occupations in South Korea," IJERPH, MDPI, vol. 17(21), pages 1-17, October.
    10. Natalia Rojas‐Perilla & Sören Pannier & Timo Schmid & Nikos Tzavidis, 2020. "Data‐driven transformations in small area estimation," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 183(1), pages 121-148, January.
    11. Robert G. Clark & David G. Steel, 2022. "Sample design for analysis using high‐influence probability sampling," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 185(4), pages 1733-1756, October.
    12. Francesco Schirripa Spagnolo & Nicola Salvati & Antonella D’Agostino & Ides Nicaise, 2020. "The use of sampling weights in M‐quantile random‐effects regression: an application to Programme for International Student Assessment mathematics scores," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 69(4), pages 991-1012, August.
    13. Corder Nathan & Yang Shu, 2020. "Estimating Average Treatment Effects Utilizing Fractional Imputation when Confounders are Subject to Missingness," Journal of Causal Inference, De Gruyter, vol. 8(1), pages 249-271, January.
    14. Bowen, Mary Elizabeth, 2009. "Childhood socioeconomic status and racial differences in disability: Evidence from the Health and Retirement Study (1998-2006)," Social Science & Medicine, Elsevier, vol. 69(3), pages 433-441, August.
    15. Ana Maria Osorio & Catalina Bolancé & Nyovane Madise & Katharina Rathmann, 2013. "Social Determinants of Child Health in Colombia: Can Community Education Moderate the Effect of Family Characteristics?," Working Papers XREAP2013-02, Xarxa de Referència en Economia Aplicada (XREAP), revised Mar 2013.
    16. Amini, Chiara & Nivorozhkin, Eugene, 2015. "The urban–rural divide in educational outcomes: Evidence from Russia," International Journal of Educational Development, Elsevier, vol. 44(C), pages 118-133.
    17. Yergeau, Marie-Eve, 2020. "Tourism and local welfare: A multilevel analysis in Nepal’s protected areas," World Development, Elsevier, vol. 127(C).
    18. Glen McGee & Jonathan Schildcrout & Sharon‐Lise Normand & Sebastien Haneuse, 2020. "Outcome‐dependent sampling in cluster‐correlated data settings with application to hospital profiling," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 183(1), pages 379-402, January.
    19. Oǧuz-Alper, Melike & Berger, Yves G., 2020. "Modelling multilevel data under complex sampling designs: An empirical likelihood approach," Computational Statistics & Data Analysis, Elsevier, vol. 145(C).

