
Robust fuzzy clustering based on quantile autocovariances

Authors

  • B. Lafuente-Rego (University of A Coruña)
  • P. D’Urso (Sapienza University of Rome)
  • J. A. Vilar (University of A Coruña)

Abstract

Robustness to the presence of outliers in time series clustering is addressed. Assuming that the clustering principle is to group realizations of series generated from similar dependence structures, three robust versions of a fuzzy C-medoids model based on comparing sample quantile autocovariances are proposed by considering, respectively, the so-called metric, noise, and trimmed approaches. Each method achieves robustness against outliers in a different manner. The metric approach applies a suitable transformation of the distance aimed at smoothing the effect of the outliers, the noise approach gathers the outliers into a separate artificial cluster, and the trimmed approach removes a fraction of the time series. All the proposed approaches take advantage of the high capability of the quantile autocovariances to discriminate between independent realizations from a broad range of stationary processes, including linear, non-linear and conditional heteroskedastic models. An extensive simulation study is performed, involving scenarios with different generating models contaminated with outliers. Robustness is evaluated against (i) outliers generated from different generating patterns, and (ii) outliers characterized by isolated, temporary or persistent level changes. The influence of the input parameters required by the different algorithms is analyzed. Regardless of the models considered, the results show that the proposed robust procedures are able to neutralize the effect of the anomalous series while preserving the true clustering structure, and that they outperform other robust algorithms based on alternative metrics. Two applications to financial data sets illustrate the usefulness of the proposed models.
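
As a rough illustration of the feature the abstract relies on, the following Python sketch estimates sample quantile autocovariances and compares two series through them. It assumes the usual definition from the quantile autocovariance literature, namely the covariance between the indicators I(X_t ≤ q_τ) and I(X_{t+l} ≤ q_τ'), which this abstract does not spell out. The function names, the default lags and probability levels, and the 1/(T − l) normalization are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def quantile_autocovariances(x, lags=(1,), probs=(0.1, 0.5, 0.9)):
    """Sample quantile autocovariances of a single series (illustrative sketch).

    For each lag l and each pair of levels (tau, tau'), estimates
    cov(I(X_t <= q_tau), I(X_{t+l} <= q_tau')) with empirical quantiles.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    p = np.asarray(probs, dtype=float)
    q = np.quantile(x, p)  # empirical quantiles of the series
    # ind[i, t] = 1.0 if x[t] <= q[i], else 0.0
    ind = (x[None, :] <= q[:, None]).astype(float)
    feats = []
    for lag in lags:
        # joint[i, j] = average of I(x_t <= q_i) * I(x_{t+lag} <= q_j)
        # Assumption: average over the n - lag available pairs.
        joint = ind[:, : n - lag] @ ind[:, lag:].T / (n - lag)
        feats.append((joint - np.outer(p, p)).ravel())
    return np.concatenate(feats)


def qaf_distance(x, y, **kwargs):
    """Squared Euclidean distance between the two feature vectors."""
    diff = quantile_autocovariances(x, **kwargs) - quantile_autocovariances(y, **kwargs)
    return float(np.sum(diff ** 2))
```

A pairwise matrix of such distances could then be passed to any fuzzy C-medoids routine; the metric, noise, and trimmed variants described above would differ in how that routine handles the distances, not in how the features are computed.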

Suggested Citation

  • B. Lafuente-Rego & P. D’Urso & J. A. Vilar, 2020. "Robust fuzzy clustering based on quantile autocovariances," Statistical Papers, Springer, vol. 61(6), pages 2393-2448, December.
  • Handle: RePEc:spr:stpapr:v:61:y:2020:i:6:d:10.1007_s00362-018-1053-6
    DOI: 10.1007/s00362-018-1053-6

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s00362-018-1053-6
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



