
Unbiased regression trees for longitudinal and clustered data

Author

Listed:
  • Fu, Wei
  • Simonoff, Jeffrey S.

Abstract

A new version of the RE–EM regression tree method for longitudinal and clustered data is presented. The RE–EM tree is a methodology that combines the structure of mixed effects models for longitudinal and clustered data with the flexibility of tree-based estimation methods. The RE–EM tree is less sensitive to parametric assumptions and provides improved predictive power compared to linear models with random effects and regression trees without random effects. The previously-suggested methodology used the CART tree algorithm for tree building, and therefore that RE–EM regression tree method inherits the tendency of CART to split on variables with more possible split points at the expense of those with fewer split points. A revised version of the RE–EM regression tree corrects for this bias by using the conditional inference tree as the underlying tree algorithm instead of CART. Simulation studies show that the new version is indeed unbiased, and has several improvements over the original RE–EM regression tree in terms of prediction accuracy and the ability to recover the correct tree structure.
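The split-selection bias mentioned in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code, and it implements neither CART nor the RE–EM tree. It only mimics the greedy exhaustive search that CART-style trees use to choose a split, under the assumption that the response is pure noise and unrelated to either predictor. The function names `best_sse_reduction` and `selection_counts`, and the simulation settings, are hypothetical choices made for illustration.

```python
import random


def best_sse_reduction(x, y):
    """Largest drop in the sum of squared errors over all binary splits x <= cut."""
    n = len(y)
    pairs = sorted(zip(x, y))
    total_sum = sum(y)
    best = 0.0
    left_sum = 0.0
    for i in range(n - 1):
        left_sum += pairs[i][1]
        if pairs[i][0] == pairs[i + 1][0]:
            continue  # not a valid cut: tied x values must stay on the same side
        n_left = i + 1
        n_right = n - n_left
        right_sum = total_sum - left_sum
        # The SSE reduction of a split equals the between-group sum of squares.
        gain = left_sum ** 2 / n_left + right_sum ** 2 / n_right - total_sum ** 2 / n
        best = max(best, gain)
    return best


def selection_counts(n_reps=200, n=100, seed=0):
    """Count how often each noise predictor wins the greedy split search."""
    rng = random.Random(seed)
    counts = {"binary": 0, "continuous": 0}
    for _ in range(n_reps):
        y = [rng.gauss(0, 1) for _ in range(n)]           # response: pure noise
        x_binary = [rng.randint(0, 1) for _ in range(n)]  # one possible split point
        x_cont = [rng.random() for _ in range(n)]         # up to n - 1 split points
        if best_sse_reduction(x_cont, y) > best_sse_reduction(x_binary, y):
            counts["continuous"] += 1
        else:
            counts["binary"] += 1
    return counts


if __name__ == "__main__":
    print(selection_counts())
```

Neither predictor carries any signal, so an unbiased selection rule would pick each about equally often. Because the continuous predictor gets many more chances to find a large reduction by luck, the greedy search should favor it in most replications. Conditional inference trees avoid this by choosing the split variable with a statistical test before searching for the split point.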

Suggested Citation

  • Fu, Wei & Simonoff, Jeffrey S., 2015. "Unbiased regression trees for longitudinal and clustered data," Computational Statistics & Data Analysis, Elsevier, vol. 88(C), pages 53-74.
  • Handle: RePEc:eee:csdana:v:88:y:2015:i:c:p:53-74
    DOI: 10.1016/j.csda.2015.02.004

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947315000432
    Download Restriction: Full text for ScienceDirect subscribers only.


    References listed on IDEAS

    1. Hajjem, Ahlem & Bellavance, François & Larocque, Denis, 2011. "Mixed effects regression trees for clustered data," Statistics & Probability Letters, Elsevier, vol. 81(4), pages 451-459, April.
    2. Torsten Hothorn & Achim Zeileis, 2014. "partykit: A Modular Toolkit for Recursive Partytioning in R," Working Papers 2014-10, Faculty of Economics and Statistics, Universität Innsbruck.
    3. Dee, Thomas S. & Sela, Rebecca J., 2003. "The fatality effects of highway speed limits by gender and age," Economics Letters, Elsevier, vol. 79(3), pages 401-408, June.

    Citations


    Cited by:

    1. Seung Yeoun Choi & Sean Hay Kim, 2022. "Selection of a Transparent Meta-Model Algorithm for Feasibility Analysis Stage of Energy Efficient Building Design: Clustering vs. Tree," Energies, MDPI, vol. 15(18), pages 1-25, September.
    2. Steffen Nestler & Sarah Humberg, 2022. "A Lasso and a Regression Tree Mixed-Effect Model with Random Effects for the Level, the Residual Variance, and the Autocorrelation," Psychometrika, Springer;The Psychometric Society, vol. 87(2), pages 506-532, June.
    3. Shuwen Hu & You-Gan Wang & Christopher Drovandi & Taoyun Cao, 2023. "Predictions of machine learning with mixed-effects in analyzing longitudinal data under model misspecification," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 32(2), pages 681-711, June.
    4. Karolis Matikonis & Matthew Gobey, 2024. "Small Business Property Tax Reductions and Firm Productivity," Small Business Economics, Springer, vol. 62(1), pages 307-324, January.
    5. Kim, Seheon & Rasouli, Soora & Timmermans, Harry & Yang, Dujuan, 2018. "Estimating panel effects in probabilistic representations of dynamic decision trees using bayesian generalized linear mixture models," Transportation Research Part B: Methodological, Elsevier, vol. 111(C), pages 168-184.
    6. Tsionas, Mike, 2022. "Efficiency estimation using probabilistic regression trees with an application to Chilean manufacturing industries," International Journal of Production Economics, Elsevier, vol. 249(C).
    7. Thomas Bassetti & Raul Caruso & Friedrich Schneider, 2018. "The tree of political violence: a GMERT analysis," Empirical Economics, Springer, vol. 54(2), pages 839-850, March.
    8. Anna Gottard & Giulia Vannucci & Leonardo Grilli & Carla Rampichini, 2023. "Mixed-effect models with trees," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 17(2), pages 431-461, June.
    9. Roberta Siciliano & Antonio D’Ambrosio & Massimo Aria & Sonia Amodio, 2017. "Analysis of Web Visit Histories, Part II: Predicting Navigation by Nested STUMP Regression Trees," Journal of Classification, Springer;The Classification Society, vol. 34(3), pages 473-493, October.
    10. Raval, Devesh & Rosenbaum, Ted & Wilson, Nathan E., 2021. "How do machine learning algorithms perform in predicting hospital choices? evidence from changing environments," Journal of Health Economics, Elsevier, vol. 78(C).
    11. Manhal Ali & Reza Salehnejad & Mohaimen Mansur, 2018. "Hospital heterogeneity: what drives the quality of health care," The European Journal of Health Economics, Springer;Deutsche Gesellschaft für Gesundheitsökonomie (DGGÖ), vol. 19(3), pages 385-408, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Patrick Krennmair & Timo Schmid, 2022. "Flexible domain prediction using mixed effects random forests," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 71(5), pages 1865-1894, November.
    2. Zelenkov, Yu. & Solntsev, I., 2022. "Predicting the value of professional sport clubs. A study of European soccer, 2005-2018," Journal of the New Economic Association, New Economic Association, vol. 56(4), pages 28-46.
    3. Steffen Nestler & Sarah Humberg, 2022. "A Lasso and a Regression Tree Mixed-Effect Model with Random Effects for the Level, the Residual Variance, and the Autocorrelation," Psychometrika, Springer;The Psychometric Society, vol. 87(2), pages 506-532, June.
    4. Grubinger, Thomas & Zeileis, Achim & Pfeiffer, Karl-Peter, 2014. "evtree: Evolutionary Learning of Globally Optimal Classification and Regression Trees in R," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 61(i01).
    5. Anderson, D. Mark & Rees, Daniel I., 2015. "Per se drugged driving laws and traffic fatalities," International Review of Law and Economics, Elsevier, vol. 42(C), pages 122-134.
    6. Daniel Albalate, 2008. "Lowering blood alcohol content levels to save lives: The European experience," Journal of Policy Analysis and Management, John Wiley & Sons, Ltd., vol. 27(1), pages 20-39.
    7. Schivinski, Bruno, 2021. "Eliciting brand-related social media engagement: A conditional inference tree framework," Journal of Business Research, Elsevier, vol. 130(C), pages 594-602.
    8. Shuwen Hu & You-Gan Wang & Christopher Drovandi & Taoyun Cao, 2023. "Predictions of machine learning with mixed-effects in analyzing longitudinal data under model misspecification," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 32(2), pages 681-711, June.
    9. Castillo-Manzano, José I. & Castro-Nuño, Mercedes & Pedregal-Tercero, Diego J., 2014. "Temporary speed limit changes: An econometric estimation of the effects of the Spanish Energy Efficiency and Saving Plan," Economic Modelling, Elsevier, vol. 44(S1), pages 68-76.
    10. Jiang, Cuiqing & Wang, Zhao & Zhao, Huimin, 2019. "A prediction-driven mixture cure model and its application in credit scoring," European Journal of Operational Research, Elsevier, vol. 277(1), pages 20-31.
    11. Tsubasa Ito & Shonosuke Sugasawa, 2023. "Grouped generalized estimating equations for longitudinal data analysis," Biometrics, The International Biometric Society, vol. 79(3), pages 1868-1879, September.
    12. Kim, Seheon & Rasouli, Soora & Timmermans, Harry & Yang, Dujuan, 2018. "Estimating panel effects in probabilistic representations of dynamic decision trees using bayesian generalized linear mixture models," Transportation Research Part B: Methodological, Elsevier, vol. 111(C), pages 168-184.
    13. Mercedes Castro-Nuno & José I. Castillo-Manzano & Diego J. Pedregal-Tercero, 2013. "The Speed Limits Debate: Is Effective A Temporary Change? The Case Of Spain," ERSA conference papers ersa13p160, European Regional Science Association.
    14. Wagner Martin & Zeileis Achim, 2019. "Heterogeneity and Spatial Dependence of Regional Growth in the EU: A Recursive Partitioning Approach," German Economic Review, De Gruyter, vol. 20(1), pages 67-82, February.
    15. Gustavsson, Magnus & Osterholm, Par, 2006. "The informational value of unemployment statistics: A note on the time series properties of participation rates," Economics Letters, Elsevier, vol. 92(3), pages 428-433, September.
    16. D. Mark Anderson & Benjamin Hansen & Daniel I. Rees, 2013. "Medical Marijuana Laws, Traffic Fatalities, and Alcohol Consumption," Journal of Law and Economics, University of Chicago Press, vol. 56(2), pages 333-369.
    17. Anderson, D. Mark & Rees, Daniel I., 2012. "Per Se Drugged Driving Laws and Traffic Fatalities," IZA Discussion Papers 7048, Institute of Labor Economics (IZA).
    18. Hajjem, Ahlem & Larocque, Denis & Bellavance, François, 2017. "Generalized mixed effects regression trees," Statistics & Probability Letters, Elsevier, vol. 126(C), pages 114-118.
    19. Peter Calhoun & Richard A. Levine & Juanjuan Fan, 2021. "Repeated measures random forests (RMRF): Identifying factors associated with nocturnal hypoglycemia," Biometrics, The International Biometric Society, vol. 77(1), pages 343-351, March.
    20. Tomasz Melcer & Monika E Danielewska & D Robert Iskander, 2015. "Wavelet Representation of the Corneal Pulse for Detecting Ocular Dicrotism," PLOS ONE, Public Library of Science, vol. 10(4), pages 1-13, April.
