
The Hierarchical Spectral Merger Algorithm: A New Time Series Clustering Procedure

Author

Listed:
  • Carolina Euán

    (Centro de Investigación en Matemáticas
    King Abdullah University of Science and Technology
    University of California)

  • Hernando Ombao

    (King Abdullah University of Science and Technology
    University of California
    University of California, Irvine)

  • Joaquín Ortega

    (Centro de Investigación en Matemáticas)

Abstract

We present a new method for time series clustering, which we call the Hierarchical Spectral Merger (HSM) method. This procedure is based on the spectral theory of time series and identifies series that share similar oscillations or waveforms. The similarity between a pair of time series is measured using the total variation distance between their estimated spectral densities. Each time two clusters merge, a new spectral density is estimated using all the information present in both clusters, so that it is representative of all the series in the new cluster. The method is implemented in the R package HSMClust. We present two applications of the HSM method: one to wave-height measurements in oceanography and the other to electroencephalogram (EEG) data.
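
The procedure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' HSMClust implementation (which is an R package) and should be read only as an illustration under stated assumptions: all series have equal length, the spectral density is estimated with a simple moving-average smoothed periodogram, and the spectrum of a merged cluster is taken as the average of its members' normalized spectra. The function names (`normalized_spectrum`, `tv_distance`, `hsm_sketch`) and the `bandwidth` parameter are hypothetical.

```python
import numpy as np

def normalized_spectrum(x, bandwidth=5):
    """Smoothed periodogram, normalized to sum to 1 (assumed estimator)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    pgram = np.abs(np.fft.rfft(x)) ** 2
    pgram = pgram[1:]  # drop the zero frequency
    kernel = np.ones(bandwidth) / bandwidth
    smooth = np.convolve(pgram, kernel, mode="same")
    return smooth / smooth.sum()

def tv_distance(f, g):
    """Total variation distance between two normalized spectra on one grid."""
    return 1.0 - np.minimum(f, g).sum()

def hsm_sketch(series, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    base = [normalized_spectrum(s) for s in series]
    clusters = [[i] for i in range(len(series))]
    spectra = list(base)
    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest TV distance.
        pairs = [(tv_distance(spectra[a], spectra[b]), a, b)
                 for a in range(len(clusters))
                 for b in range(a + 1, len(clusters))]
        _, a, b = min(pairs)
        merged = clusters[a] + clusters[b]
        # Assumption: re-estimate the cluster spectrum by averaging
        # the normalized spectra of every series in the merged cluster.
        merged_spec = np.mean([base[i] for i in merged], axis=0)
        for idx in (b, a):  # delete the larger index first
            del clusters[idx]
            del spectra[idx]
        clusters.append(merged)
        spectra.append(merged_spec)
    return [sorted(c) for c in clusters]

# Synthetic example: three slow and three fast noisy oscillations.
rng = np.random.default_rng(0)
t = np.arange(512)
slow = [2 * np.sin(2 * np.pi * 0.05 * t + rng.uniform(0, 2 * np.pi))
        + rng.normal(size=t.size) for _ in range(3)]
fast = [2 * np.sin(2 * np.pi * 0.20 * t + rng.uniform(0, 2 * np.pi))
        + rng.normal(size=t.size) for _ in range(3)]
clusters = hsm_sketch(slow + fast, n_clusters=2)
```

Because each spectrum is normalized to sum to one, the total variation distance lies between 0 and 1, and the average of several normalized spectra is itself normalized, so a merged cluster can be compared with the remaining clusters on the same scale.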

Suggested Citation

  • Carolina Euán & Hernando Ombao & Joaquín Ortega, 2018. "The Hierarchical Spectral Merger Algorithm: A New Time Series Clustering Procedure," Journal of Classification, Springer;The Classification Society, vol. 35(1), pages 71-99, April.
  • Handle: RePEc:spr:jclass:v:35:y:2018:i:1:d:10.1007_s00357-018-9250-5
    DOI: 10.1007/s00357-018-9250-5

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s00357-018-9250-5
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.


    References listed on IDEAS

    1. Maharaj, Elizabeth Ann, 2002. "Comparison of non-stationary time series in the frequency domain," Computational Statistics & Data Analysis, Elsevier, vol. 40(1), pages 131-141, July.
    2. Caiado, Jorge & Crato, Nuno & Pena, Daniel, 2006. "A periodogram-based metric for time series classification," Computational Statistics & Data Analysis, Elsevier, vol. 50(10), pages 2668-2684, June.
    3. Robert T. Krafty, 2016. "Discriminant Analysis of Time Series in the Presence of Within-Group Spectral Variability," Journal of Time Series Analysis, Wiley Blackwell, vol. 37(4), pages 435-450, July.
    4. Caiado, Jorge & Crato, Nuno & Peña, Daniel, 2009. "Comparison of time series with unequal length in the frequency domain," MPRA Paper 15310, University Library of Munich, Germany.
    5. Montero, Pablo & Vilar, José A., 2014. "TSclust: An R Package for Time Series Clustering," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 62(i01).
    6. Sonia Díaz & José Vilar, 2010. "Comparing Several Parametric and Nonparametric Approaches to Time Series Clustering: A Simulation Study," Journal of Classification, Springer;The Classification Society, vol. 27(3), pages 333-362, November.
    7. Robert Tibshirani & Guenther Walther & Trevor Hastie, 2001. "Estimating the number of clusters in a data set via the gap statistic," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 63(2), pages 411-423.
    8. Jens-Peter Kreiss & Efstathios Paparoditis, 2015. "Bootstrapping locally stationary processes," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 77(1), pages 267-290, January.
    9. Maharaj, Elizabeth A. & Alonso, Andres M., 2007. "Discrimination of locally stationary time series using wavelets," Computational Statistics & Data Analysis, Elsevier, vol. 52(2), pages 879-895, October.
    10. Elizabeth Ann Maharaj & Pierpaolo D’Urso & Don Galagedera, 2010. "Wavelet-based Fuzzy Clustering of Time Series," Journal of Classification, Springer;The Classification Society, vol. 27(2), pages 231-275, September.
    11. Maharaj, Elizabeth Ann & Alonso, Andrés M., 2014. "Discriminant analysis of multivariate time series: Application to diagnosis based on ECG signals," Computational Statistics & Data Analysis, Elsevier, vol. 70(C), pages 67-87.
    12. Pedro C. Alvarez‐Esteban & Carolina Euán & Joaquín Ortega, 2016. "Time series clustering using the total variation distance with applications in oceanography," Environmetrics, John Wiley & Sons, Ltd., vol. 27(6), pages 355-369, September.

    Citations



    Cited by:

    1. Marc G. Genton & Ying Sun, 2019. "Comments on: Data science, big data and statistics," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 28(2), pages 338-341, June.
    2. Xu Gao & Weining Shen & Liwen Zhang & Jianhua Hu & Norbert J. Fortin & Ron D. Frostig & Hernando Ombao, 2021. "Regularized matrix data clustering and its application to image analysis," Biometrics, The International Biometric Society, vol. 77(3), pages 890-902, September.
    3. Benny Ren & Ian Barnett, 2022. "Autoregressive mixture models for clustering time series," Journal of Time Series Analysis, Wiley Blackwell, vol. 43(6), pages 918-937, November.
    4. Terrazas-Santamaria Diana & Mendoza-Palacios Saul & Berasaluce-Iza Julen, 2023. "An Alternative Approach to Frequency of Patent Technology Codes: The Case of Renewable Energy Generation," Economics - The Open-Access, Open-Assessment Journal, De Gruyter, vol. 17(1), pages 1-14, January.
    5. Tianbo Chen & Ying Sun & Carolina Euan & Hernando Ombao, 2021. "Clustering Brain Signals: a Robust Approach Using Functional Data Ranking," Journal of Classification, Springer;The Classification Society, vol. 38(3), pages 425-442, October.
    6. Embleton, Jonathan & Knight, Marina I. & Ombao, Hernando, 2022. "Wavelet testing for a replicate-effect within an ordered multiple-trial experiment," Computational Statistics & Data Analysis, Elsevier, vol. 174(C).
    7. Dai, Ning & Jones, Galin L. & Fiecas, Mark, 2020. "Bayesian longitudinal spectral estimation with application to resting-state fMRI data analysis," Econometrics and Statistics, Elsevier, vol. 15(C), pages 104-116.
    8. Douglas L. Steinley, 2018. "Editorial," Journal of Classification, Springer;The Classification Society, vol. 35(2), pages 195-197, July.
    9. Douglas L. Steinley, 2018. "Editorial," Journal of Classification, Springer;The Classification Society, vol. 35(3), pages 391-393, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Pierpaolo D’Urso & Livia Giovanni & Riccardo Massari & Dario Lallo, 2013. "Noise fuzzy clustering of time series by autoregressive metric," METRON, Springer;Sapienza Università di Roma, vol. 71(3), pages 217-243, November.
    2. Sipan Aslan & Ceylan Yozgatligil & Cem Iyigun, 2018. "Temporal clustering of time series via threshold autoregressive models: application to commodity prices," Annals of Operations Research, Springer, vol. 260(1), pages 51-77, January.
    3. B. Lafuente-Rego & P. D’Urso & J. A. Vilar, 2020. "Robust fuzzy clustering based on quantile autocovariances," Statistical Papers, Springer, vol. 61(6), pages 2393-2448, December.
    4. João A. Bastos & Jorge Caiado, 2014. "Clustering financial time series with variance ratio statistics," Quantitative Finance, Taylor & Francis Journals, vol. 14(12), pages 2121-2133, December.
    5. Beibei Zhang & Rong Chen, 2018. "Nonlinear Time Series Clustering Based on Kolmogorov-Smirnov 2D Statistic," Journal of Classification, Springer;The Classification Society, vol. 35(3), pages 394-421, October.
    6. Mahmoudi, Mohammad Reza, 2021. "A computational technique to classify several fractional Brownian motion processes," Chaos, Solitons & Fractals, Elsevier, vol. 150(C).
    7. Jentsch, Carsten & Pauly, Markus, 2012. "A note on using periodogram-based distances for comparing spectral densities," Statistics & Probability Letters, Elsevier, vol. 82(1), pages 158-164.
    8. Xu Gao & Babak Shahbaba & Hernando Ombao, 2018. "Modeling Binary Time Series Using Gaussian Processes with Application to Predicting Sleep States," Journal of Classification, Springer;The Classification Society, vol. 35(3), pages 549-579, October.
    9. Tianbo Chen & Ying Sun & Carolina Euan & Hernando Ombao, 2021. "Clustering Brain Signals: a Robust Approach Using Functional Data Ranking," Journal of Classification, Springer;The Classification Society, vol. 38(3), pages 425-442, October.
    10. Mahdi Massahi & Masoud Mahootchi & Alireza Arshadi Khamseh, 2020. "Development of an efficient cluster-based portfolio optimization model under realistic market conditions," Empirical Economics, Springer, vol. 59(5), pages 2423-2442, November.
    11. Elizabeth Ann Maharaj & Pierpaolo D’Urso & Don Galagedera, 2010. "Wavelet-based Fuzzy Clustering of Time Series," Journal of Classification, Springer;The Classification Society, vol. 27(2), pages 231-275, September.
    12. Lei Jin & Suojin Wang, 2016. "A New Test for Checking the Equality of the Correlation Structures of two time Series," Journal of Time Series Analysis, Wiley Blackwell, vol. 37(3), pages 355-368, May.
    13. Caiado, Jorge & Crato, Nuno & Peña, Daniel, 2006. "An interpolated periodogram-based metric for comparison of time series with unequal lengths," MPRA Paper 2075, University Library of Munich, Germany.
    14. Jorge Caiado & Nuno Crato, 2010. "Identifying common dynamic features in stock returns," Quantitative Finance, Taylor & Francis Journals, vol. 10(7), pages 797-807.
    15. Zhao, Xin & Barber, Stuart & Taylor, Charles C. & Milan, Zoka, 2018. "Classification tree methods for panel data using wavelet-transformed time series," Computational Statistics & Data Analysis, Elsevier, vol. 127(C), pages 204-216.
    16. João A. Bastos & Jorge Caiado, 2021. "On the classification of financial data with domain agnostic features," Working Papers REM 2021/0185, ISEG - Lisbon School of Economics and Management, REM, Universidade de Lisboa.
    17. Aykroyd, Robert G. & Barber, Stuart & Miller, Luke R., 2016. "Classification of multiple time signals using localized frequency characteristics applied to industrial process monitoring," Computational Statistics & Data Analysis, Elsevier, vol. 94(C), pages 351-362.
    18. Liu, Shen & Maharaj, Elizabeth Ann, 2013. "A hypothesis test using bias-adjusted AR estimators for classifying time series in small samples," Computational Statistics & Data Analysis, Elsevier, vol. 60(C), pages 32-49.
    19. Preuß, Philip & Hildebrandt, Thimo, 2013. "Comparing spectral densities of stationary time series with unequal sample sizes," Statistics & Probability Letters, Elsevier, vol. 83(4), pages 1174-1183.
    20. Harvill, Jane L. & Ravishanker, Nalini & Ray, Bonnie K., 2013. "Bispectral-based methods for clustering time series," Computational Statistics & Data Analysis, Elsevier, vol. 64(C), pages 113-131.
