A random forest-based approach to identifying the most informative seasonality tests

Author

Listed:
  • Ollech, Daniel
  • Webel, Karsten

Abstract

Virtually every seasonal adjustment software package includes an ensemble of seasonality tests for assessing whether a given time series is in fact a candidate for seasonal adjustment. However, such tests are certain to produce either the same result or conflicting results, raising the question of whether there is a method capable of identifying the most informative tests in order (1) to eliminate the seemingly non-informative ones in the former case and (2) to reach a final decision in the more severe latter case. We argue that identifying the seasonal status of a given time series is essentially a classification problem and can thus be solved with machine learning methods. Using simulated seasonal and non-seasonal ARIMA processes that are representative of the Bundesbank's time series database, we compare certain popular methods with respect to accuracy, interpretability and availability of unbiased variable importance measures and find random forests of conditional inference trees to be the method that best balances these key requirements. Applying this method to the seasonality tests implemented in the seasonal adjustment software JDemetra+ finally reveals that the modified QS and Friedman tests yield by far the most informative results.
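
To make the classification framing concrete, the following is a minimal sketch and not the authors' implementation. It assumes a much simpler setup than the paper: series are AR(1) noise with an optional fixed seasonal pattern (instead of the ARIMA design calibrated to the Bundesbank database), only two test statistics are computed (a Friedman statistic and a QS-type statistic in its commonly documented form, which is not necessarily the modified QS of the paper), and a single-threshold decision stump stands in for the random forest of conditional inference trees. All function names are hypothetical.

```python
import random


def simulate(n_years, period, seasonal, rng):
    """Assumed toy process: AR(1) noise plus a fixed seasonal pattern if seasonal."""
    pattern = [rng.gauss(0, 1) if seasonal else 0.0 for _ in range(period)]
    x, out = 0.0, []
    for t in range(n_years * period):
        x = 0.5 * x + rng.gauss(0, 1)
        out.append(x + pattern[t % period])
    return out


def friedman_stat(x, period):
    """Friedman statistic with years as blocks and periods as treatments.

    Ties are not mid-ranked here, which is harmless for continuous data.
    """
    n_years = len(x) // period
    rank_sums = [0.0] * period
    for y in range(n_years):
        block = x[y * period:(y + 1) * period]
        order = sorted(range(period), key=lambda j: block[j])
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    k = period
    total = sum(r * r for r in rank_sums)
    return 12.0 / (n_years * k * (k + 1)) * total - 3.0 * n_years * (k + 1)


def autocorr(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    den = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return num / den


def qs_stat(x, period):
    """QS-type statistic from autocorrelations at the first two seasonal lags.

    Assumes x is already stationary; it is zero if the lag-`period`
    autocorrelation is not positive.
    """
    n = len(x)
    r1 = autocorr(x, period)
    if r1 <= 0:
        return 0.0
    r2 = max(0.0, autocorr(x, 2 * period))
    return n * (n + 2) * (r1 ** 2 / (n - period) + r2 ** 2 / (n - 2 * period))


def best_stump_accuracy(values, labels):
    """Best accuracy of the rule 'seasonal if value > threshold' over all thresholds.

    A crude stand-in for a variable importance measure, not a random forest.
    """
    best = 0.0
    for thr in sorted(set(values)):
        hits = sum((v > thr) == y for v, y in zip(values, labels))
        best = max(best, hits / len(labels))
    return best


def rank_tests(n_series=200, n_years=10, period=12, seed=0):
    """Simulate labelled series and score each test statistic as a predictor."""
    rng = random.Random(seed)
    labels = [i % 2 == 0 for i in range(n_series)]  # True = seasonal
    data = [simulate(n_years, period, lab, rng) for lab in labels]
    features = {
        "friedman": [friedman_stat(s, period) for s in data],
        "qs": [qs_stat(s, period) for s in data],
    }
    return {name: best_stump_accuracy(v, labels) for name, v in features.items()}


print(rank_tests())
```

Calling `rank_tests()` returns one in-sample accuracy per test statistic. In this toy setting, a higher value only indicates that the statistic separates the simulated seasonal and non-seasonal series better on its own; unlike the forest-based importance measures discussed in the abstract, it does not account for correlation between the tests.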

Suggested Citation

  • Ollech, Daniel & Webel, Karsten, 2020. "A random forest-based approach to identifying the most informative seasonality tests," Discussion Papers 55/2020, Deutsche Bundesbank.
  • Handle: RePEc:zbw:bubdps:552020

    Download full text from publisher

    File URL: https://www.econstor.eu/bitstream/10419/225323/1/1736582380.pdf
    Download Restriction: no

    Citations

    Cited by:

    1. Dean Fantazzini & Julia Pushchelenko & Alexey Mironenkov & Alexey Kurbatskii, 2021. "Forecasting Internal Migration in Russia Using Google Trends: Evidence from Moscow and Saint Petersburg," Forecasting, MDPI, vol. 3(4), pages 1-30, October.
    2. Daniel Ollech & Deutsche Bundesbank, 2023. "Economic analysis using higher-frequency time series: challenges for seasonal adjustment," Empirical Economics, Springer, vol. 64(3), pages 1375-1398, March.
    3. Xiandeng Jiang & Le Chang & Yanlin Shi, 2023. "Housing price diffusions in mainland China: evidence from a spatially penalized graphical VAR model," Empirical Economics, Springer, vol. 64(2), pages 765-795, February.
    4. Twumasi, Clement & Twumasi, Juliet, 2022. "Machine learning algorithms for forecasting and backcasting blood demand data with missing values and outliers: A study of Tema General Hospital of Ghana," International Journal of Forecasting, Elsevier, vol. 38(3), pages 1258-1277.
    5. Panja, Madhurima & Chakraborty, Tanujit & Nadim, Sk Shahid & Ghosh, Indrajit & Kumar, Uttam & Liu, Nan, 2023. "An ensemble neural network approach to forecast Dengue outbreak based on climatic condition," Chaos, Solitons & Fractals, Elsevier, vol. 167(C).
    6. Bogdan Oancea & Richard Pospíšil & Marius Nicolae Jula & Cosmin-Ionuț Imbrișcă, 2021. "Experiments with Fuzzy Methods for Forecasting Time Series as Alternatives to Classical Methods," Mathematics, MDPI, vol. 9(19), pages 1-17, October.
    7. Ollech, Daniel, 2021. "Economic analysis using higher frequency time series: Challenges for seasonal adjustment," Discussion Papers 53/2021, Deutsche Bundesbank.
    8. Ersin Sünbül, 2023. "Linear and Nonlinear Relationship Between Real Exchange Rate, Real Interest Rate and Consumer Price Index: An Empirical Application for Countries with Different Levels of Development," Scientific Annals of Economics and Business (continues Analele Stiintifice), Alexandru Ioan Cuza University, Faculty of Economics and Business Administration, vol. 70(1), pages 57-70, March.

    More about this item

    Keywords

    binary classification; conditional inference trees; correlated predictors; JDemetra+; simulation study; supervised machine learning;

    JEL classification:

    • C12 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Hypothesis Testing: General
    • C14 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Semiparametric and Nonparametric Methods: General
    • C22 - Mathematical and Quantitative Methods - - Single Equation Models; Single Variables - - - Time-Series Models; Dynamic Quantile Regressions; Dynamic Treatment Effect Models; Diffusion Processes
    • C45 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods: Special Topics - - - Neural Networks and Related Topics
    • C63 - Mathematical and Quantitative Methods - - Mathematical Methods; Programming Models; Mathematical and Simulation Modeling - - - Computational Techniques
