
Scalable probabilistic forecasting in retail with gradient boosted trees: A practitioner’s approach

Authors
  • Long, Xueying
  • Bui, Quang
  • Oktavian, Grady
  • Schmidt, Daniel F.
  • Bergmeir, Christoph
  • Godahewa, Rakshitha
  • Lee, Seong Per
  • Zhao, Kaifeng
  • Condylis, Paul

Abstract

The recent M5 competition has advanced the state-of-the-art in retail forecasting. However, there are important differences between the competition challenge and the challenges we face in a large e-commerce company. The datasets in our scenario are larger (hundreds of thousands of time series), and e-commerce can afford to have a larger stock assortment than brick-and-mortar retailers, leading to more intermittent data. To scale to larger dataset sizes with feasible computational effort, we investigate a two-layer hierarchy, namely the decision level with product unit sales and an aggregated level, e.g., through warehouse-product aggregation, which reduces both the number of series and the degree of intermittency. We propose a top-down approach: forecast at the aggregated level, then disaggregate to obtain decision-level forecasts. Probabilistic forecasts are generated under distributional assumptions. The proposed scalable method is evaluated on a large proprietary dataset as well as the publicly available Corporación Favorita and M5 datasets. We show how the characteristics of the e-commerce and brick-and-mortar retail datasets differ. Notably, our top-down forecasting framework enters the top 50 of the original M5 competition, even with models trained at a higher level under a much simpler setting.
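
To make the top-down idea concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not say how the aggregated forecast is split or which distribution is assumed. The sketch assumes that the split uses historical sales shares and that decision-level demand is Poisson, and all function names and the toy data are hypothetical. The aggregated-level point forecast (which the paper obtains from gradient boosted trees) is simply passed in as a number.

```python
import math


def historical_shares(history):
    """Share of each decision-level series in the aggregated total.

    history: dict mapping a series id to its list of past unit sales.
    Assumption: shares are proportional to total historical sales.
    """
    totals = {key: sum(values) for key, values in history.items()}
    grand_total = sum(totals.values())
    if grand_total == 0:
        # No sales at all: fall back to an equal split.
        return {key: 1.0 / len(totals) for key in totals}
    return {key: total / grand_total for key, total in totals.items()}


def disaggregate(aggregate_forecast, shares):
    """Split an aggregated-level point forecast into decision-level means."""
    return {key: aggregate_forecast * share for key, share in shares.items()}


def poisson_quantile(mean, q):
    """Smallest integer k with P(X <= k) >= q for X ~ Poisson(mean).

    Assumption: Poisson demand. Intended for small means and 0 < q < 1;
    exp(-mean) underflows for very large means.
    """
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    if mean <= 0:
        return 0
    k = 0
    pmf = math.exp(-mean)  # P(X = 0)
    cdf = pmf
    while cdf < q:
        k += 1
        pmf *= mean / k  # recurrence: P(X = k) = P(X = k - 1) * mean / k
        cdf += pmf
    return k


# Toy data: two products aggregated at one warehouse (hypothetical values).
history = {"sku_a": [0, 1, 0, 3], "sku_b": [2, 2, 1, 1]}
warehouse_forecast = 5.0  # stand-in for the aggregated-level model output

shares = historical_shares(history)
means = disaggregate(warehouse_forecast, shares)
quantiles = {
    key: {q: poisson_quantile(mean, q) for q in (0.5, 0.95)}
    for key, mean in means.items()
}
print(quantiles)
```

Only one model is fitted per aggregated series here, and the decision-level quantiles follow from the disaggregated mean and the assumed distribution. A count distribution with an extra dispersion parameter could replace the Poisson assumption without changing the structure.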

Suggested Citation

  • Long, Xueying & Bui, Quang & Oktavian, Grady & Schmidt, Daniel F. & Bergmeir, Christoph & Godahewa, Rakshitha & Lee, Seong Per & Zhao, Kaifeng & Condylis, Paul, 2025. "Scalable probabilistic forecasting in retail with gradient boosted trees: A practitioner’s approach," International Journal of Production Economics, Elsevier, vol. 279(C).
  • Handle: RePEc:eee:proeco:v:279:y:2025:i:c:s0925527324003062
    DOI: 10.1016/j.ijpe.2024.109449

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0925527324003062
    Download Restriction: Full text for ScienceDirect subscribers only



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Wang, Shengjie & Kang, Yanfei & Petropoulos, Fotios, 2024. "Combining probabilistic forecasts of intermittent demand," European Journal of Operational Research, Elsevier, vol. 315(3), pages 1038-1048.
    2. Petropoulos, Fotios & Apiletti, Daniele & Assimakopoulos, Vassilios & Babai, Mohamed Zied & Barrow, Devon K. & Ben Taieb, Souhaib & Bergmeir, Christoph & Bessa, Ricardo J. & Bijak, Jakub & Boylan, Joh, 2022. "Forecasting: theory and practice," International Journal of Forecasting, Elsevier, vol. 38(3), pages 705-871.
      • Fotios Petropoulos & Daniele Apiletti & Vassilios Assimakopoulos & Mohamed Zied Babai & Devon K. Barrow & Souhaib Ben Taieb & Christoph Bergmeir & Ricardo J. Bessa & Jakub Bijak & John E. Boylan & Jet, 2020. "Forecasting: theory and practice," Papers 2012.03854, arXiv.org, revised Jan 2022.
    3. Kourentzes, Nikolaos & Athanasopoulos, George, 2021. "Elucidate structure in intermittent demand series," European Journal of Operational Research, Elsevier, vol. 288(1), pages 141-152.
    4. Makridakis, Spyros & Spiliotis, Evangelos & Assimakopoulos, Vassilios & Chen, Zhi & Gaba, Anil & Tsetlin, Ilia & Winkler, Robert L., 2022. "The M5 uncertainty competition: Results, findings and conclusions," International Journal of Forecasting, Elsevier, vol. 38(4), pages 1365-1385.
    5. Fildes, Robert & Ma, Shaohui & Kolassa, Stephan, 2022. "Retail forecasting: Research and practice," International Journal of Forecasting, Elsevier, vol. 38(4), pages 1283-1318.
    6. Pinçe, Çerağ & Turrini, Laura & Meissner, Joern, 2021. "Intermittent demand forecasting for spare parts: A Critical review," Omega, Elsevier, vol. 105(C).
    7. Makridakis, Spyros & Spiliotis, Evangelos & Assimakopoulos, Vassilios, 2022. "M5 accuracy competition: Results, findings, and conclusions," International Journal of Forecasting, Elsevier, vol. 38(4), pages 1346-1364.
    8. Schlaich, Tim & Hoberg, Kai, 2024. "When is the next order? Nowcasting channel inventories with Point-of-Sales data to predict the timing of retail orders," European Journal of Operational Research, Elsevier, vol. 315(1), pages 35-49.
    9. Wellens, Arnoud P. & Boute, Robert N. & Udenio, Maximiliano, 2024. "Simplifying tree-based methods for retail sales forecasting with explanatory variables," European Journal of Operational Research, Elsevier, vol. 314(2), pages 523-539.
    10. Theodorou, Evangelos & Spiliotis, Evangelos & Assimakopoulos, Vassilios, 2025. "Forecast accuracy and inventory performance: Insights on their relationship from the M5 competition data," European Journal of Operational Research, Elsevier, vol. 322(2), pages 414-426.
    11. Mariusz Doszyn, 2020. "Accuracy of Intermittent Demand Forecasting Systems in the Enterprise," European Research Studies Journal, European Research Studies Journal, vol. 0(4), pages 912-930.
    12. Evangelos Spiliotis & Spyros Makridakis & Artemios-Anargyros Semenoglou & Vassilios Assimakopoulos, 2022. "Comparison of statistical and machine learning methods for daily SKU demand forecasting," Operational Research, Springer, vol. 22(3), pages 3037-3061, July.
    13. Marco Zanotti, 2025. "Do global forecasting models require frequent retraining?," Working Papers 551, University of Milano-Bicocca, Department of Economics.
    14. Kolassa, Stephan, 2016. "Evaluating predictive count data distributions in retail sales forecasting," International Journal of Forecasting, Elsevier, vol. 32(3), pages 788-803.
    15. Fildes, Robert & Ma, Shaohui & Kolassa, Stephan, 2019. "Retail forecasting: research and practice," MPRA Paper 89356, University Library of Munich, Germany.
    16. Jože Martin Rožanec & Blaž Fortuna & Dunja Mladenić, 2022. "Reframing Demand Forecasting: A Two-Fold Approach for Lumpy and Intermittent Demand," Sustainability, MDPI, vol. 14(15), pages 1-21, July.
    17. Costantino, Francesco & Di Gravio, Giulio & Patriarca, Riccardo & Petrella, Lea, 2018. "Spare parts management for irregular demand items," Omega, Elsevier, vol. 81(C), pages 57-66.
    18. Sarlo, Rodrigo & Fernandes, Cristiano & Borenstein, Denis, 2023. "Lumpy and intermittent retail demand forecasts with score-driven models," European Journal of Operational Research, Elsevier, vol. 307(3), pages 1146-1160.
    19. Ducharme, Corey & Agard, Bruno & Trépanier, Martin, 2021. "Forecasting a customer's Next Time Under Safety Stock," International Journal of Production Economics, Elsevier, vol. 234(C).
    20. Hu, Qiwei & Boylan, John E. & Chen, Huijing & Labib, Ashraf, 2018. "OR in spare parts management: A review," European Journal of Operational Research, Elsevier, vol. 266(2), pages 395-414.
