Printed from https://ideas.repec.org/a/spr/aodasc/v12y2025i3d10.1007_s40745-024-00541-4.html

Combining LASSO-type Methods with a Smooth Transition Random Forest

Author

Listed:
  • Alexandre L. D. Gandini

    (Universidade Federal do Rio Grande do Sul)

  • Flavio A. Ziegelmann

    (Universidade Federal do Rio Grande do Sul)

Abstract

In this work, we propose a novel hybrid method for estimating regression models that combines LASSO-type methods with smooth transition (STR) random forests. Tree-based regression models are known for their flexibility and their ability to learn even highly nonlinear patterns. The STR-Tree model introduces smoothness into the traditional splitting nodes, leading to a non-binary labeling that can be interpreted as a degree of group membership for each observation. Our approach involves two steps. First, we fit a penalized linear regression using LASSO-type methods. Then, we estimate an STR random forest on the residuals from the first step, using the original covariates. This two-step process captures any significant linear relationships in the data generating process through a parametric approach and then addresses the remaining nonlinearities with a flexible model. We conducted numerical studies with both simulated and real data to demonstrate our method's effectiveness. Our findings indicate that our proposal offers superior predictive power compared to traditional benchmarks, particularly in datasets with both linear and nonlinear characteristics.
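
The two-step idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. As assumptions made here for illustration only, scikit-learn's standard RandomForestRegressor stands in for the STR random forest (which replaces hard splits with smooth ones), LassoCV stands in for the family of LASSO-type methods, and the synthetic data are invented. The soft_membership helper shows one common way to smooth a split, a logistic transition with hypothetical location c and slope gamma; whether the paper uses exactly this form is not stated in the abstract.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LassoCV


def soft_membership(x, c, gamma):
    """Degree of membership (between 0 and 1) in the right-hand child of a
    split at c. A logistic transition is assumed; large gamma approaches a
    hard binary split."""
    return 1.0 / (1.0 + np.exp(-gamma * (x - c)))


def fit_two_step(X, y, seed=0):
    """Step 1: penalized linear fit. Step 2: forest on the step-1 residuals,
    using the original covariates."""
    lasso = LassoCV(cv=5).fit(X, y)
    residuals = y - lasso.predict(X)
    # Stand-in for the STR random forest: an ordinary (hard-split) forest.
    forest = RandomForestRegressor(n_estimators=200, random_state=seed)
    forest.fit(X, residuals)
    return lasso, forest


def predict_two_step(lasso, forest, X):
    """Final prediction = linear part + nonlinear correction."""
    return lasso.predict(X) + forest.predict(X)


# Invented data with a linear part (columns 0 and 1) and a nonlinear part
# (column 2); columns 3 and 4 are pure noise covariates.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 5))
y = 2.0 * X[:, 0] - X[:, 1] + np.sin(3.0 * X[:, 2]) + 0.1 * rng.normal(size=400)
X_train, X_test = X[:300], X[300:]
y_train, y_test = y[:300], y[300:]

lasso, forest = fit_two_step(X_train, y_train)
pred_hybrid = predict_two_step(lasso, forest, X_test)
mse_lasso = float(np.mean((y_test - lasso.predict(X_test)) ** 2))
mse_hybrid = float(np.mean((y_test - pred_hybrid) ** 2))
print(f"LASSO only MSE: {mse_lasso:.3f}  two-step MSE: {mse_hybrid:.3f}")
```

Fitting the forest on residuals rather than on the raw response leaves the linear signal to the parametric step, so the flexible model only has to explain what the linear fit missed.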

Suggested Citation

  • Alexandre L. D. Gandini & Flavio A. Ziegelmann, 2025. "Combining LASSO-type Methods with a Smooth Transition Random Forest," Annals of Data Science, Springer, vol. 12(3), pages 899-928, June.
  • Handle: RePEc:spr:aodasc:v:12:y:2025:i:3:d:10.1007_s40745-024-00541-4
    DOI: 10.1007/s40745-024-00541-4

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s40745-024-00541-4
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



