IDEAS home Printed from https://ideas.repec.org/a/spr/compst/v38y2023i4d10.1007_s00180-022-01302-8.html

An evolutionary estimation procedure for generalized semilinear regression trees

Author

Listed:
  • Giulia Vannucci

    (University of Florence)

  • Anna Gottard

    (University of Florence)

Abstract

In many applications, the presence of interactions or even mild non-linearities can affect inference and predictions. For that reason, we suggest the use of a class of models lying between statistics and machine learning, and we propose a learning procedure for them. The models combine a linear part and a tree component that is selected via an evolutionary algorithm, and they can be adopted for any kind of response, such as continuous, categorical, or ordinal responses and survival times. They are inherently interpretable but more flexible than standard regression models, as they easily capture non-linear and interaction effects. The proposed genetic-like learning algorithm avoids a greedy search for the tree component. In a simulation study, we show that the proposed approach performs comparably to other machine learning algorithms, with a substantial gain in interpretability and transparency, and we illustrate the method on a real data set.
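
The idea of a linear part plus a tree component chosen by a non-greedy, evolutionary search can be illustrated with a minimal sketch. This is not the authors' algorithm. It assumes a Gaussian response with identity link, a tree reduced to a single split, and a mutation-and-selection scheme with no crossover. The synthetic data and the names `rss`, `mutate`, and `evolve` are hypothetical and introduced only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (hypothetical): a linear effect of x0 plus a threshold effect of x1.
n, p = 400, 3
X = rng.uniform(0.0, 1.0, size=(n, p))
y = 1.0 + 2.0 * X[:, 0] + 3.0 * (X[:, 1] > 0.5) + rng.normal(0.0, 0.5, size=n)


def rss(split):
    """Residual sum of squares of OLS on [1, X, I(x_j > t)] for split = (j, t)."""
    j, t = split
    design = np.column_stack([np.ones(n), X, (X[:, j] > t).astype(float)])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(resid @ resid)


def mutate(split):
    """Perturb the threshold; occasionally jump to another feature."""
    j, t = split
    if rng.random() < 0.1:
        j = int(rng.integers(p))
    t = float(np.clip(t + rng.normal(0.0, 0.1), 0.0, 1.0))
    return (j, t)


def evolve(pop_size=20, generations=40):
    """Search over single splits by mutation and elitist selection."""
    population = [(int(rng.integers(p)), float(rng.uniform())) for _ in range(pop_size)]
    for _ in range(generations):
        children = [mutate(s) for s in population]
        # Keep the pop_size fittest candidates among parents and children.
        population = sorted(population + children, key=rss)[:pop_size]
    return population[0]


best_feature, best_threshold = evolve()
print(best_feature, round(best_threshold, 3))
```

Each candidate split is scored by refitting the whole model, linear part included, so the tree component is judged jointly with the linear coefficients rather than grown greedily on residuals. Extending the sketch to deeper trees or to non-Gaussian responses would require a different fitness function, for example a penalized likelihood.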

Suggested Citation

  • Giulia Vannucci & Anna Gottard, 2023. "An evolutionary estimation procedure for generalized semilinear regression trees," Computational Statistics, Springer, vol. 38(4), pages 1927-1946, December.
  • Handle: RePEc:spr:compst:v:38:y:2023:i:4:d:10.1007_s00180-022-01302-8
    DOI: 10.1007/s00180-022-01302-8


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Anna Gottard & Giulia Vannucci & Leonardo Grilli & Carla Rampichini, 2023. "Mixed-effect models with trees," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 17(2), pages 431-461, June.
    2. Benítez-Peña, Sandra & Carrizosa, Emilio & Guerrero, Vanesa & Jiménez-Gamero, M. Dolores & Martín-Barragán, Belén & Molero-Río, Cristina & Ramírez-Cobo, Pepa & Romero Morales, Dolores & Sillero-Denami, 2021. "On sparse ensemble methods: An application to short-term predictions of the evolution of COVID-19," European Journal of Operational Research, Elsevier, vol. 295(2), pages 648-663.
    3. Manski, Charles F., 2023. "Probabilistic prediction for binary treatment choice: With focus on personalized medicine," Journal of Econometrics, Elsevier, vol. 234(2), pages 647-663.
    4. Weishampel, Anthony & Staicu, Ana-Maria & Rand, William, 2023. "Classification of social media users with generalized functional data analysis," Computational Statistics & Data Analysis, Elsevier, vol. 179(C).
    5. Nelson P. Rayl & Nitish R. Sinha, 2022. "Integrating Prediction and Attribution to Classify News," Finance and Economics Discussion Series 2022-042, Board of Governors of the Federal Reserve System (U.S.).
    6. Bas Bosma & Arjen Witteloostuijn, 2024. "Machine learning in international business," Journal of International Business Studies, Palgrave Macmillan;Academy of International Business, vol. 55(6), pages 676-702, August.
    7. Denis A Shah & Erick D De Wolf & Pierce A Paul & Laurence V Madden, 2021. "Accuracy in the prediction of disease epidemics when ensembling simple but highly correlated models," PLOS Computational Biology, Public Library of Science, vol. 17(3), pages 1-23, March.
    8. Paolo Libenzio Brignoli & Alessandro Varacca & Cornelis Gardebroek & Paolo Sckokai, 2024. "Machine learning to predict grains futures prices," Agricultural Economics, International Association of Agricultural Economists, vol. 55(3), pages 479-497, May.
    9. Emmanuel Flachaire & Sullivan Hué & Sébastien Laurent & Gilles Hacheme, 2024. "Interpretable Machine Learning Using Partial Linear Models," Oxford Bulletin of Economics and Statistics, Department of Economics, University of Oxford, vol. 86(3), pages 519-540, June.
    10. M. Merz & R. Richman & T. Tsanakas & M. V. Wuthrich, 2021. "Interpreting Deep Learning Models with Marginal Attribution by Conditioning on Quantiles," Papers 2103.11706, arXiv.org.
    11. Rich, Jeppe & Myhrmann, Marcus Skyum & Mabit, Stefan Eriksen, 2023. "Our children cycle less - A Danish pseudo-panel analysis," Journal of Transport Geography, Elsevier, vol. 106(C).
    12. Chun Chieh Fan & Robert Loughnan & Carolina Makowski & Diliana Pecheva & Chi-Hua Chen & Donald J. Hagler & Wesley K. Thompson & Nadine Parker & Dennis van der Meer & Oleksandr Frei & Ole A. Andreassen, 2022. "Multivariate genome-wide association study on tissue-sensitive diffusion metrics highlights pathways that shape the human brain," Nature Communications, Nature, vol. 13(1), pages 1-10, December.
    13. Jack Jewson & David Rossell, 2022. "General Bayesian loss function selection and the use of improper models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 84(5), pages 1640-1665, November.
    14. Alessio Baldassarre & Elise Dusseldorp & Antonio D’Ambrosio & Mark de Rooij & Claudio Conversano, 2023. "The Bradley–Terry Regression Trunk approach for Modeling Preference Data with Small Trees," Psychometrika, Springer;The Psychometric Society, vol. 88(4), pages 1443-1465, December.
    15. COJOCARIU Irina-Cristina, 2023. "Analysis Of Sports Performances Using Machine Learning And Statistical Models - A General Analysis Of The Literature," Revista Economica, Lucian Blaga University of Sibiu, Faculty of Economic Sciences, vol. 75(2), pages 34-39, June.
    16. Ord, J. Keith, 2022. "The uncertainty track: Machine learning, statistical modeling, synthesis," International Journal of Forecasting, Elsevier, vol. 38(4), pages 1526-1530.
    17. Jeff Dominitz & Charles F. Manski, 2024. "Comprehensive OOS Evaluation of Predictive Algorithms with Statistical Decision Theory," Papers 2403.11016, arXiv.org, revised Apr 2025.
    18. Takahiro Yoshida & Daisuke Murakami & Hajime Seya, 2024. "Spatial Prediction of Apartment Rent using Regression-Based and Machine Learning-Based Approaches with a Large Dataset," The Journal of Real Estate Finance and Economics, Springer, vol. 69(1), pages 1-28, July.
    19. Victor Quintas-Martinez & Mohammad Taha Bahadori & Eduardo Santiago & Jeff Mu & Dominik Janzing & David Heckerman, 2024. "Multiply-Robust Causal Change Attribution," Papers 2404.08839, arXiv.org, revised Sep 2024.
    20. Gerhard Tutz & Moritz Berger, 2018. "Tree-structured modelling of categorical predictors in generalized additive regression," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 12(3), pages 737-758, September.
