Printed from https://ideas.repec.org/a/eee/ejores/v328y2026i2p607-619.html

Soft regression trees: A model variant and a decomposition training algorithm

Author

Listed:
  • Consolo, Antonio
  • Amaldi, Edoardo
  • Manno, Andrea

Abstract

Decision trees are widely used for classification and regression tasks in a variety of application fields due to their interpretability and good accuracy. During the past decade, growing attention has been devoted to globally optimized decision trees with deterministic or soft splitting rules at branch nodes, which are trained by optimizing the error function over all the tree parameters. In this work, we propose a new variant of soft multivariate regression trees (SRTs) where, for every input vector, the prediction is defined as the linear regression associated with a single leaf node, namely, the leaf node obtained by routing the input vector from the root along the branches with higher probability. SRTs exhibit the conditional computational property, i.e., each prediction depends on a small number of nodes (parameters), and our nonlinear optimization formulation for training them is amenable to decomposition. After showing a universal approximation result for SRTs, we present a decomposition training algorithm including a clustering-based initialization procedure and a heuristic for rerouting the input vectors along the tree. Under mild assumptions, we establish asymptotic convergence guarantees. Experiments on 15 well-known datasets indicate that our SRTs and decomposition algorithm yield higher accuracy and robustness compared with traditional soft regression trees trained using the nonlinear optimization formulation of Blanquero et al. (2021), and a significant reduction in training times as well as a slightly better average accuracy compared with the mixed-integer optimization approach of Bertsimas and Dunn (2019). We also report a comparison with the Random Forest ensemble method.
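
The prediction rule described in the abstract (follow the higher-probability branch at each node, then apply the reached leaf's linear regression) can be illustrated with a minimal sketch. This is not the authors' implementation. The logistic (sigmoid) form of the branching probability, the heap-style node indexing, and all names (`srt_predict`, `branch_w`, `branch_b`, `leaf_beta`, `leaf_beta0`) and parameter values are assumptions made for illustration only.

```python
import numpy as np


def sigmoid(z):
    """Logistic function, assumed here as the soft splitting rule."""
    return 1.0 / (1.0 + np.exp(-z))


def srt_predict(x, branch_w, branch_b, leaf_beta, leaf_beta0, depth):
    """Route x from the root along the higher-probability branch at each
    branch node, then return (leaf_index, linear prediction of that leaf).

    Assumed layout: a complete binary tree stored heap-style, so the
    children of branch node t are 2t+1 (left) and 2t+2 (right).
    """
    node = 0  # start at the root
    for _ in range(depth):
        # Probability of taking the left branch at this node (assumed form).
        p_left = sigmoid(branch_w[node] @ x + branch_b[node])
        node = 2 * node + 1 if p_left >= 0.5 else 2 * node + 2
    leaf = node - (2 ** depth - 1)  # position among the 2**depth leaves
    # Only `depth` branch nodes and one leaf were touched: this is the
    # "conditional computation" idea mentioned in the abstract.
    return leaf, float(leaf_beta[leaf] @ x + leaf_beta0[leaf])


# Hypothetical depth-2 tree on 2 features: 3 branch nodes, 4 leaves.
branch_w = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
branch_b = np.zeros(3)
leaf_beta = np.array([[1.0, 1.0], [2.0, 1.0], [0.0, -1.0], [3.0, 0.0]])
leaf_beta0 = np.array([0.0, 0.5, 1.0, -1.0])

leaf_a, pred_a = srt_predict(np.array([1.0, -2.0]),
                             branch_w, branch_b, leaf_beta, leaf_beta0, 2)
leaf_b, pred_b = srt_predict(np.array([-1.0, 3.0]),
                             branch_w, branch_b, leaf_beta, leaf_beta0, 2)
print(leaf_a, pred_a)
print(leaf_b, pred_b)
```

In this sketch each prediction evaluates only the branch nodes on one root-to-leaf path plus a single leaf regression, which is how the sketch reflects the abstract's statement that each prediction depends on a small number of nodes.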

Suggested Citation

  • Consolo, Antonio & Amaldi, Edoardo & Manno, Andrea, 2026. "Soft regression trees: A model variant and a decomposition training algorithm," European Journal of Operational Research, Elsevier, vol. 328(2), pages 607-619.
  • Handle: RePEc:eee:ejores:v:328:y:2026:i:2:p:607-619
    DOI: 10.1016/j.ejor.2025.08.050

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0377221725006873
    Download Restriction: Full text for ScienceDirect subscribers only

    References listed on IDEAS

    1. Blanquero, Rafael & Carrizosa, Emilio & Molero-Río, Cristina & Romero Morales, Dolores, 2020. "Sparsity in optimal randomized classification trees," European Journal of Operational Research, Elsevier, vol. 284(1), pages 255-272.
    2. Ali Aouad & Adam N. Elmachtoub & Kris J. Ferreira & Ryan McNellis, 2023. "Market Segmentation Trees," Manufacturing & Service Operations Management, INFORMS, vol. 25(2), pages 648-667, March.
    3. Dragos Florin Ciocan & Velibor V. Mišić, 2022. "Interpretable Optimal Stopping," Management Science, INFORMS, vol. 68(3), pages 1616-1638, March.
    4. Yu, Hao & Cooper, Arthur R. & Infante, Dana M., 2020. "Improving species distribution model predictive accuracy using species abundance: Application with boosted regression trees," Ecological Modelling, Elsevier, vol. 432(C).
    5. Blanquero, Rafael & Carrizosa, Emilio & Molero-Río, Cristina & Romero Morales, Dolores, 2022. "On sparse optimal regression trees," European Journal of Operational Research, Elsevier, vol. 299(3), pages 1045-1054.
    6. Léon Bottou, 2010. "Large-Scale Machine Learning with Stochastic Gradient Descent," Springer Books, in: Yves Lechevallier & Gilbert Saporta (ed.), Proceedings of COMPSTAT'2010, pages 177-186, Springer.
    7. Andrea Manno & Laura Palagi & Simone Sagratella, 2018. "Parallel decomposition methods for linearly constrained problems subject to simple bound with application to the SVMs training," Computational Optimization and Applications, Springer, vol. 71(1), pages 115-145, September.
    8. Asil Oztekin, 2018. "Creating a marketing strategy in healthcare industry: a holistic data analytic approach," Annals of Operations Research, Springer, vol. 270(1), pages 361-382, November.
    9. Sina Aghaei & Andrés Gómez & Phebe Vayanos, 2025. "Strong Optimal Classification Trees," Operations Research, INFORMS, vol. 73(4), pages 2223-2241, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Tu, Jiancheng & Wu, Zhibin, 2025. "Inherently interpretable machine learning for credit scoring: Optimal classification tree with hyperplane splits," European Journal of Operational Research, Elsevier, vol. 322(2), pages 647-664.
    2. Tian, Xuecheng & Wang, Shuaian & Zhen, Lu & Shen, Zuo-Jun (Max), 2025. "k-Tree: Crossing sharp boundaries in regression trees to find neighbors," European Journal of Operational Research, Elsevier, vol. 324(2), pages 567-579.
    3. Tommaso Aldinucci & Matteo Lapucci, 2024. "Loss-optimal classification trees: a generalized framework and the logistic case," TOP: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 32(2), pages 323-350, July.
    4. N. Bora Keskin & Yuexing Li & Nur Sunar, 2025. "Data-Driven Clustering and Feature-Based Retail Electricity Pricing with Smart Meters," Operations Research, INFORMS, vol. 73(5), pages 2636-2660, September.
    5. Xiaohong Chen & Elie Tamer & Qingsong Yao, 2026. "Online Learning in Semiparametric Econometric Models," Papers 2603.08614, arXiv.org.
    6. Ferdinand Bollwein, 2024. "A pivot-based simulated annealing algorithm to determine oblique splits for decision tree induction," Computational Statistics, Springer, vol. 39(2), pages 803-834, April.
    7. Carrizosa, Emilio & Halskov, Thomas & Romero Morales, Dolores, 2026. "Wasserstein support vector machine: Support vector machines made fair," European Journal of Operational Research, Elsevier, vol. 329(2), pages 641-652.
    8. Huang, Di & Wang, Haotian & Zhang, Jinyu & Wang, Hao & Liu, Zhiyuan, 2025. "Prescriptive analytics of electric bus battery allocation optimization based on the Plackett-Luce model," Transportation Research Part E: Logistics and Transportation Review, Elsevier, vol. 203(C).
    9. Jiameng Lyu & Jinxing Xie & Shilin Yuan & Yuan Zhou, 2025. "A Minibatch Stochastic Gradient Descent-Based Learning Metapolicy for Inventory Systems with Myopic Optimal Policy," Management Science, INFORMS, vol. 71(7), pages 5572-5588, July.
    10. Zhang, Yunfei & Li, Jian & Yu, Mingzhe & Chen, Xu & Chen, Xingying & Shen, Jun, 2025. "Dominant factor identification and fast optimization of carnot battery by integrating SHAP and physics-guided neural network," Applied Energy, Elsevier, vol. 401(PA).
    11. Blanquero, Rafael & Carrizosa, Emilio & Molero-Río, Cristina & Romero Morales, Dolores, 2022. "On sparse optimal regression trees," European Journal of Operational Research, Elsevier, vol. 299(3), pages 1045-1054.
    12. Xue, Gang & Gong, Daqing & Ren, Long & Cui, Ziruo, 2026. "Modeling expert risk assessments in utility tunnels with deep learning," Reliability Engineering and System Safety, Elsevier, vol. 265(PA).
    13. Kraus, Mathias & Tschernutter, Daniel & Weinzierl, Sven & Zschech, Patrick, 2024. "Interpretable generalized additive neural networks," European Journal of Operational Research, Elsevier, vol. 317(2), pages 303-316.
    14. Zineb El Filali Ech-Chafiq & Pierre Henry Labordère & Jérôme Lelong, 2023. "Pricing Bermudan options using regression trees/random forests," Post-Print hal-03436046, HAL.
    15. Jorge Sicacha-Parada & Diego Pavon-Jordan & Ingelin Steinsland & Roel May & Bård Stokke & Ingar Jostein Øien, 2022. "A Spatial Modeling Framework for Monitoring Surveys with Different Sampling Protocols with a Case Study for Bird Abundance in Mid-Scandinavia," Journal of Agricultural, Biological and Environmental Statistics, Springer;The International Biometric Society;American Statistical Association, vol. 27(3), pages 562-591, September.
    16. Diana Koldasbayeva & Polina Tregubova & Mikhail Gasanov & Alexey Zaytsev & Anna Petrovskaia & Evgeny Burnaev, 2024. "Challenges in data-driven geospatial modeling for environmental research and practice," Nature Communications, Nature, vol. 15(1), pages 1-16, December.
    17. Zehetner, Dominik & Gansterer, Margaretha, 2025. "Effective job reassignments in large scale collaborative additive manufacturing networks," International Journal of Production Economics, Elsevier, vol. 289(C).
    18. Sina Aghaei & Andrés Gómez & Phebe Vayanos, 2025. "Strong Optimal Classification Trees," Operations Research, INFORMS, vol. 73(4), pages 2223-2241, July.
    19. Tsionas, Mike, 2022. "Efficiency estimation using probabilistic regression trees with an application to Chilean manufacturing industries," International Journal of Production Economics, Elsevier, vol. 249(C).
    20. De la Cruz, Andrés & Numa, Catherine, 2024. "Habitat availability decline for waterbirds in a sensitive wetland: Climate change impact on the Ebro Delta," Ecological Modelling, Elsevier, vol. 498(C).

