Printed from https://ideas.repec.org/a/gam/jstats/v8y2025i2p36-d1652391.html

Unraveling Meteorological Dynamics: A Two-Level Clustering Algorithm for Time Series Pattern Recognition with Missing Data Handling

Author

Listed:
  • Ekaterini Skamnia

    (Department of Civil Engineering, University of Patras, 265 04 Patras, Greece
    These authors contributed equally to this work.)

  • Eleni S. Bekri

    (Department of Civil Engineering, University of Patras, 265 04 Patras, Greece
    These authors contributed equally to this work.)

  • Polychronis Economou

    (Department of Civil Engineering, University of Patras, 265 04 Patras, Greece)

Abstract

Identifying regions with similar meteorological features is of both socioeconomic and ecological importance. To that end, useful information can be drawn from meteorological stations and extended to the broader surrounding area. This work proposes a two-level time series clustering procedure that clusters spatial units (meteorological stations) based on their temporal patterns, rather than clustering time periods. It can handle univariate or multivariate time series that have missing data or different lengths, provided they share a common seasonal period. The first level clusters the dominant features of the time series (e.g., similar seasonal patterns) using K-means, while the second level produces clusters based on secondary features. The second level uses hierarchical clustering with Dynamic Time Warping in the univariate case and with multivariate Dynamic Time Warping in the multivariate case. Principal component analysis or classical multidimensional scaling is applied before the first level, while an imputation technique is applied to the raw data at the second level to address missing values. This step is particularly important because missing data are frequent in measurements obtained from meteorological stations. The method is applied first to precipitation time series and then to mean temperature time series obtained from the automated weather station network in Greece. Finally, both variables are used together to cover the multivariate case.
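
The two-level idea described in the abstract can be illustrated with a minimal, univariate sketch. It is not the authors' implementation, and several parts are assumptions made only for illustration: a per-station seasonal mean profile stands in for the PCA or multidimensional scaling features used before the first level, seasonal-mean filling stands in for the unspecified imputation technique, the second level is run separately inside each first-level cluster, and the hierarchical tree is cut with a hypothetical distance threshold `max_dist`. All function names and the toy `STATIONS` data are hypothetical.

```python
import math

def seasonal_profile(series, period):
    """Mean at each seasonal position, skipping missing values (None)."""
    sums, counts = [0.0] * period, [0] * period
    for t, x in enumerate(series):
        if x is not None:
            sums[t % period] += x
            counts[t % period] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]

def impute(series, period):
    """Stand-in imputation (assumption): fill None with the seasonal mean."""
    profile = seasonal_profile(series, period)
    return [profile[t % period] if x is None else x for t, x in enumerate(series)]

def kmeans(points, k, iters=50):
    """Lloyd's algorithm with deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def dtw_distance(a, b):
    """Dynamic Time Warping with absolute-difference local cost."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
            cost[i][j] = abs(a[i - 1] - b[j - 1]) + step
    return cost[n][m]

def agglomerate(items, max_dist, dist):
    """Average-linkage agglomerative clustering, stopped at max_dist."""
    d = [[dist(a, b) for b in items] for a in items]
    groups = [[i] for i in range(len(items))]

    def link(pq):
        g, h = groups[pq[0]], groups[pq[1]]
        return sum(d[i][j] for i in g for j in h) / (len(g) * len(h))

    while len(groups) > 1:
        pairs = [(p, q) for p in range(len(groups)) for q in range(p + 1, len(groups))]
        best = min(pairs, key=link)
        if link(best) > max_dist:
            break
        groups[best[0]] += groups.pop(best[1])
    return groups

def two_level(stations, period, k1, max_dist):
    """Map each station name to (level-1 label, level-2 label)."""
    names = list(stations)
    level1 = kmeans([seasonal_profile(stations[n], period) for n in names], k1)
    result = {}
    for c in range(k1):
        members = [n for n, lab in zip(names, level1) if lab == c]
        if not members:
            continue
        filled = [impute(stations[n], period) for n in members]
        for g_id, group in enumerate(agglomerate(filled, max_dist, dtw_distance)):
            for i in group:
                result[members[i]] = (c, g_id)
    return result

# Toy data: seasonal period of 4, None marks a missing observation.
STATIONS = {
    "A": [10, 2, 0, 6] * 3,
    "B": [10, 2, None, 6, 10, 2, 0, 6, 10, 2, 0, 6],   # one gap
    "C": [10, 2, 0, 6, 12, 4, 2, 8, 14, 6, 4, 10],     # same season, upward drift
    "D": [1, 1, 8, 1] * 3,
    "E": [1, 1, 8, None, 1, 1, 8, 1],                  # shorter, one gap
}
labels = two_level(STATIONS, period=4, k1=2, max_dist=10.0)
for name, pair in labels.items():
    print(name, pair)
```

In this toy run the first level separates stations by their dominant seasonal shape, and the second level then splits off the drifting station within its seasonal group. The threshold value is in the units of the data and would need tuning on real series.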

Suggested Citation

  • Ekaterini Skamnia & Eleni S. Bekri & Polychronis Economou, 2025. "Unraveling Meteorological Dynamics: A Two-Level Clustering Algorithm for Time Series Pattern Recognition with Missing Data Handling," Stats, MDPI, vol. 8(2), pages 1-39, May.
  • Handle: RePEc:gam:jstats:v:8:y:2025:i:2:p:36-:d:1652391

    Download full text from publisher

    File URL: https://www.mdpi.com/2571-905X/8/2/36/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2571-905X/8/2/36/
    Download Restriction: no


