
Clustering, Forecasting and Cluster Forecasting: using k-medoids, k-NNs and random forests for cluster selection

Author

Listed:
  • Dinesh Reddy Vangumalli

    (Oracle America Inc)

  • Konstantinos Nikolopoulos

    (Bangor University)

  • Konstantia Litsiou

    (Manchester Metropolitan University)

Abstract

When facing a forecasting task involving a large number of time series, data analysts regularly employ one of two methodological approaches: either select a single forecasting method for the entire dataset (aggregate selection), or use the best forecasting method for each time series (individual selection). There is evidence in the predictive analytics literature that the former is more robust than the latter, because individual selection tends to overfit models to the data. A third approach is to first identify homogeneous clusters within the dataset and then select a single forecasting method for each cluster (cluster selection). This research examines the performance of three well-known machine learning methods used here for clustering: k-medoids, k-NN and random forests. Every cluster is then forecast with the best possible method, and the performance is compared to that of aggregate selection. These methods are very often used for classification tasks, but since in our case there is no set of predefined classes, they are used for pure clustering. The evaluation is performed on the 645 yearly series of the M3 competition. The empirical evidence suggests that: a) random forests provide the best clusters for the sequential forecasting task, and b) cluster selection has the potential to outperform aggregate selection.
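
The three selection strategies named in the abstract (aggregate, individual and cluster selection) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy pool of methods (`naive`, `drift`, `hist_mean`), the two scale-free features used for clustering, the hold-out length `H`, the synthetic series, and the basic alternating k-medoids routine with a fixed initialisation. The paper's actual method pool, features and clustering setup are not stated in the abstract.

```python
from statistics import mean, pstdev

# Toy pool of forecasting methods (assumed for illustration, not the paper's pool).
def naive(history, h):
    return [history[-1]] * h

def drift(history, h):
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + slope * (step + 1) for step in range(h)]

def hist_mean(history, h):
    return [mean(history)] * h

METHODS = {"naive": naive, "drift": drift, "mean": hist_mean}

def validation_mae(series, name, h):
    """MAE of method `name` on the last h points, using the earlier points as history."""
    train, valid = series[:-h], series[-h:]
    forecast = METHODS[name](train, h)
    return mean(abs(a - f) for a, f in zip(valid, forecast))

def select_method(group, h):
    """Method with the lowest average validation MAE across all series in `group`."""
    return min(METHODS, key=lambda m: mean(validation_mae(s, m, h) for s in group))

def features(series):
    """Hypothetical scale-free features: mean and spread of first differences over the mean level."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    level = mean(series)
    return (mean(diffs) / level, pstdev(diffs) / level)

def distance(p, q):
    return sum((x - y) ** 2 for x, y in zip(p, q)) ** 0.5

def assign(points, medoids):
    """Label each point with the index of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: distance(p, points[medoids[c]]))
            for p in points]

def k_medoids(points, k, max_iter=20):
    """Basic alternating k-medoids: assign points, then re-pick each cluster's medoid."""
    medoids = list(range(k))  # deterministic start: the first k points
    for _ in range(max_iter):
        labels = assign(points, medoids)
        updated = []
        for c in range(k):
            # Keep the old medoid if a cluster happens to be empty.
            members = [i for i, lab in enumerate(labels) if lab == c] or [medoids[c]]
            updated.append(min(members, key=lambda i: sum(distance(points[i], points[j])
                                                          for j in members)))
        if updated == medoids:
            break
        medoids = updated
    return assign(points, medoids)

# Synthetic data: three trending series and three flat, oscillating series.
trend = [[10 * (i + 1) + (i + 2) * t for t in range(10)] for i in range(3)]
flat = [[50 + 10 * j + (1 if t % 2 == 0 else -1) for t in range(10)] for j in range(3)]
dataset = trend + flat
H = 3  # hold-out length used for method selection

aggregate = select_method(dataset, H)                  # one method for the whole dataset
individual = [select_method([s], H) for s in dataset]  # one method per series
labels = k_medoids([features(s) for s in dataset], k=2)
cluster = {c: select_method([s for s, lab in zip(dataset, labels) if lab == c], H)
           for c in set(labels)}                       # one method per cluster

print("aggregate: ", aggregate)
print("individual:", individual)
print("clusters:  ", labels, cluster)
```

On this toy data, aggregate selection settles on a single method for all six series, while cluster selection is free to choose a different method for the flat group than for the trending group. With real data, the selected methods would be compared on a separate out-of-sample period rather than on the hold-out used for selection.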

Suggested Citation

  • Dinesh Reddy Vangumalli & Konstantinos Nikolopoulos & Konstantia Litsiou, 2019. "Clustering, Forecasting and Cluster Forecasting: using k-medoids, k-NNs and random forests for cluster selection," Working Papers 19016, Bangor Business School, Prifysgol Bangor University (Cymru / Wales).
  • Handle: RePEc:bng:wpaper:19016

    Download full text from publisher

    File URL: https://www.bangor.ac.uk/business/research/documents/BBSWP-19-16.pdf
    Download Restriction: no

    References listed on IDEAS

    1. Fildes, Robert & Petropoulos, Fotios, 2015. "Simple versus complex selection rules for forecasting many time series," Journal of Business Research, Elsevier, vol. 68(8), pages 1692-1701.
    2. Taylor, James W., 2003. "Exponential smoothing with a damped multiplicative trend," International Journal of Forecasting, Elsevier, vol. 19(4), pages 715-725.
    3. Fred Collopy & J. Scott Armstrong, 1992. "Rule-Based Forecasting: Development and Validation of an Expert Systems Approach to Combining Time Series Extrapolations," Management Science, INFORMS, vol. 38(10), pages 1394-1414, October.
    4. Makridakis, Spyros & Chatfield, Chris & Hibon, Michele & Lawrence, Michael & Mills, Terence & Ord, Keith & Simmons, LeRoy F., 1993. "The M2-competition: A real-time judgmentally based forecasting study," International Journal of Forecasting, Elsevier, vol. 9(1), pages 5-22, April.
    5. Hyndman, Rob J. & Koehler, Anne B., 2006. "Another look at measures of forecast accuracy," International Journal of Forecasting, Elsevier, vol. 22(4), pages 679-688.
    6. Félix Iglesias & Wolfgang Kastner, 2013. "Analysis of Similarity Measures in Times Series Clustering for the Discovery of Building Energy Patterns," Energies, MDPI, vol. 6(2), pages 1-19, January.
    7. Makridakis, Spyros & Hibon, Michele, 2000. "The M3-Competition: results, conclusions and implications," International Journal of Forecasting, Elsevier, vol. 16(4), pages 451-476.
    8. Holt, Charles C., 2004. "Author's retrospective on 'Forecasting seasonals and trends by exponentially weighted moving averages'," International Journal of Forecasting, Elsevier, vol. 20(1), pages 11-13.
    9. Holt, Charles C., 2004. "Forecasting seasonals and trends by exponentially weighted moving averages," International Journal of Forecasting, Elsevier, vol. 20(1), pages 5-10.
    10. Nikolopoulos, K. & Goodwin, P. & Patelis, A. & Assimakopoulos, V., 2007. "Forecasting with cue information: A comparison of multiple regression with alternative forecasting approaches," European Journal of Operational Research, Elsevier, vol. 180(1), pages 354-368, July.
    11. Reilly, David, 2000. "The AUTOBOX system," International Journal of Forecasting, Elsevier, vol. 16(4), pages 531-533.
    12. Goldstein Benjamin A & Polley Eric C & Briggs Farren B. S., 2011. "Random Forests for Genetic Association Studies," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 10(1), pages 1-34, July.

    Citations



    Cited by:

    1. Nikolopoulos, Konstantinos & Punia, Sushil & Schäfers, Andreas & Tsinopoulos, Christos & Vasilakis, Chrysovalantis, 2021. "Forecasting and planning during a pandemic: COVID-19 growth rates, supply chain disruptions, and governmental decisions," European Journal of Operational Research, Elsevier, vol. 290(1), pages 99-115.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Petropoulos, Fotios & Apiletti, Daniele & Assimakopoulos, Vassilios & Babai, Mohamed Zied & Barrow, Devon K. & Ben Taieb, Souhaib & Bergmeir, Christoph & Bessa, Ricardo J. & Bijak, Jakub & Boylan, Joh, 2022. "Forecasting: theory and practice," International Journal of Forecasting, Elsevier, vol. 38(3), pages 705-871.
      • Fotios Petropoulos & Daniele Apiletti & Vassilios Assimakopoulos & Mohamed Zied Babai & Devon K. Barrow & Souhaib Ben Taieb & Christoph Bergmeir & Ricardo J. Bessa & Jakub Bijak & John E. Boylan & Jet, 2020. "Forecasting: theory and practice," Papers 2012.03854, arXiv.org, revised Jan 2022.
    2. Gardner, Everette Jr., 2006. "Exponential smoothing: The state of the art--Part II," International Journal of Forecasting, Elsevier, vol. 22(4), pages 637-666.
    3. Petropoulos, Fotios & Makridakis, Spyros & Assimakopoulos, Vassilios & Nikolopoulos, Konstantinos, 2014. "‘Horses for Courses’ in demand forecasting," European Journal of Operational Research, Elsevier, vol. 237(1), pages 152-163.
    4. Jan G. De Gooijer & Rob J. Hyndman, 2005. "25 Years of IIF Time Series Forecasting: A Selective Review," Monash Econometrics and Business Statistics Working Papers 12/05, Monash University, Department of Econometrics and Business Statistics.
    5. Svetunkov, Ivan & Kourentzes, Nikolaos, 2015. "Complex Exponential Smoothing," MPRA Paper 69394, University Library of Munich, Germany.
    6. Armstrong, J. Scott & Green, Kesten C. & Graefe, Andreas, 2015. "Golden rule of forecasting: Be conservative," Journal of Business Research, Elsevier, vol. 68(8), pages 1717-1731.
    7. Hyndman, Rob J. & Khandakar, Yeasmin, 2008. "Automatic Time Series Forecasting: The forecast Package for R," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 27(i03).
    8. Meira, Erick & Cyrino Oliveira, Fernando Luiz & de Menezes, Lilian M., 2022. "Forecasting natural gas consumption using Bagging and modified regularization techniques," Energy Economics, Elsevier, vol. 106(C).
    9. Han, Weiwei & Wang, Xun & Petropoulos, Fotios & Wang, Jing, 2019. "Brain imaging and forecasting: Insights from judgmental model selection," Omega, Elsevier, vol. 87(C), pages 1-9.
    10. Green, Kesten C. & Armstrong, J. Scott, 2015. "Simple versus complex forecasting: The evidence," Journal of Business Research, Elsevier, vol. 68(8), pages 1678-1685.
    11. R Fildes & K Nikolopoulos & S F Crone & A A Syntetos, 2008. "Forecasting and operational research: a review," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 59(9), pages 1150-1172, September.
    12. Makridakis, Spyros & Hyndman, Rob J. & Petropoulos, Fotios, 2020. "Forecasting in social settings: The state of the art," International Journal of Forecasting, Elsevier, vol. 36(1), pages 15-28.
    13. Hill, Arthur V. & Zhang, Weiyong & Burch, Gerald F., 2015. "Forecasting the forecastability quotient for inventory management," International Journal of Forecasting, Elsevier, vol. 31(3), pages 651-663.
    14. repec:jss:jstsof:27:i03 is not listed on IDEAS
    15. Pantelis Agathangelou & Demetris Trihinas & Ioannis Katakis, 2020. "A Multi-Factor Analysis of Forecasting Methods: A Study on the M4 Competition," Data, MDPI, vol. 5(2), pages 1-24, April.
    16. Theodosiou, Marina, 2011. "Forecasting monthly and quarterly time series using STL decomposition," International Journal of Forecasting, Elsevier, vol. 27(4), pages 1178-1195, October.
    17. Gardner, Everette Shaw & Acar, Yavuz, 2016. "The forecastability quotient reconsidered," International Journal of Forecasting, Elsevier, vol. 32(4), pages 1208-1211.
    18. Spiliotis, Evangelos & Kouloumos, Andreas & Assimakopoulos, Vassilios & Makridakis, Spyros, 2020. "Are forecasting competitions data representative of the reality?," International Journal of Forecasting, Elsevier, vol. 36(1), pages 37-53.
    19. Erjiang E & Ming Yu & Xin Tian & Ye Tao, 2022. "Dynamic Model Selection Based on Demand Pattern Classification in Retail Sales Forecasting," Mathematics, MDPI, vol. 10(17), pages 1-16, September.
    20. Abolghasemi, Mahdi & Hurley, Jason & Eshragh, Ali & Fahimnia, Behnam, 2020. "Demand forecasting in the presence of systematic events: Cases in capturing sales promotions," International Journal of Production Economics, Elsevier, vol. 230(C).
    21. Kourentzes, Nikolaos & Petropoulos, Fotios & Trapero, Juan R., 2014. "Improving forecasting by estimating time series structural components across multiple frequencies," International Journal of Forecasting, Elsevier, vol. 30(2), pages 291-302.

    More about this item

    Keywords

    Clustering; k-medoids; Nearest Neighbors; Random Forests; Forecasting;
