IDEAS home Printed from https://ideas.repec.org/p/ise/remwps/wp01852021.html

On the classification of financial data with domain agnostic features

Author

Listed:
  • João A. Bastos
  • Jorge Caiado

Abstract

We compare a data-driven, domain-agnostic set of canonical features with a smaller collection of features that capture well-known stylized facts about financial asset returns. We show that these stylized facts discriminate between different asset types better than general-purpose features do. Financial time series analysis is therefore a domain where well-informed expert knowledge should not be disregarded in favor of agnostic representations of the data.
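
As a rough illustration of what "features that capture stylized facts" can look like in practice, the sketch below computes four summary statistics commonly associated with asset returns: skewness, excess kurtosis (heavy tails), and the lag-1 autocorrelation of returns and of squared returns (volatility clustering). This is a minimal sketch under stated assumptions, not the authors' feature set: the function names `autocorr` and `stylized_features`, the choice of these four statistics, and the simulated series are all hypothetical and chosen only for illustration.

```python
import random
from statistics import fmean


def autocorr(x, lag=1):
    """Sample autocorrelation of x at the given lag (0.0 for a constant series)."""
    m = fmean(x)
    d = [v - m for v in x]
    denom = sum(v * v for v in d)
    if denom == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(d[:-lag], d[lag:])) / denom


def stylized_features(returns):
    """Illustrative (hypothetical) feature vector for one return series."""
    r = [float(v) for v in returns]
    m = fmean(r)
    sd = fmean([(v - m) ** 2 for v in r]) ** 0.5  # population standard deviation
    if sd == 0.0:
        raise ValueError("return series must not be constant")
    z = [(v - m) / sd for v in r]
    return {
        "skewness": fmean([v ** 3 for v in z]),
        "excess_kurtosis": fmean([v ** 4 for v in z]) - 3.0,
        "acf1_returns": autocorr(r, 1),
        # Autocorrelation of squared returns is a simple proxy for volatility clustering.
        "acf1_squared": autocorr([v * v for v in r], 1),
    }


if __name__ == "__main__":
    # Simulated data only: a Gaussian series and a mixture with occasional large shocks.
    rng = random.Random(0)
    gaussian = [rng.gauss(0.0, 1.0) for _ in range(5000)]
    mixture = [rng.gauss(0.0, 5.0 if rng.random() < 0.05 else 1.0) for _ in range(5000)]
    print(stylized_features(gaussian))
    print(stylized_features(mixture))
```

Feature vectors of this kind, one per asset, could then be passed to any standard classifier or clustering routine; which statistics to include is exactly the kind of domain-informed choice the abstract argues for.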

Suggested Citation

  • João A. Bastos & Jorge Caiado, 2021. "On the classification of financial data with domain agnostic features," Working Papers REM 2021/0185, ISEG - Lisbon School of Economics and Management, REM, Universidade de Lisboa.
  • Handle: RePEc:ise:remwps:wp01852021

    Download full text from publisher

    File URL: https://rem.rc.iseg.ulisboa.pt/wps/pdf/REM_WP_0185_2021.pdf
    Download Restriction: no


    Citations



    Cited by:

    1. Lúcio, Francisco & Caiado, Jorge, 2022. "COVID-19 and Stock Market Volatility: A Clustering Approach for S&P 500 Industry Indices," Finance Research Letters, Elsevier, vol. 49(C).
    2. Roy Cerqueti & Pierpaolo D’Urso & Livia Giovanni & Raffaele Mattera & Vincenzina Vitale, 2024. "Fuzzy clustering of time series based on weighted conditional higher moments," Computational Statistics, Springer, vol. 39(6), pages 3091-3114, September.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Lúcio, Francisco & Caiado, Jorge, 2022. "COVID-19 and Stock Market Volatility: A Clustering Approach for S&P 500 Industry Indices," Finance Research Letters, Elsevier, vol. 49(C).
    2. D’Urso, Pierpaolo & Cappelli, Carmela & Di Lallo, Dario & Massari, Riccardo, 2013. "Clustering of financial time series," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 392(9), pages 2114-2129.
    3. B. Lafuente-Rego & P. D’Urso & J. A. Vilar, 2020. "Robust fuzzy clustering based on quantile autocovariances," Statistical Papers, Springer, vol. 61(6), pages 2393-2448, December.
    4. Chebbi, Ali & Hedhli, Amel, 2022. "Revisiting the accuracy of standard VaR methods for risk assessment: Using the Copula–EVT multidimensional approach for stock markets in the MENA region," The Quarterly Review of Economics and Finance, Elsevier, vol. 84(C), pages 430-445.
    5. Assaf, Ata, 2015. "Value-at-Risk analysis in the MENA equity markets: Fat tails and conditional asymmetries in return distributions," Journal of Multinational Financial Management, Elsevier, vol. 29(C), pages 30-45.
    6. Roy Cerqueti & Pierpaolo D’Urso & Livia Giovanni & Raffaele Mattera & Vincenzina Vitale, 2024. "Fuzzy clustering of time series based on weighted conditional higher moments," Computational Statistics, Springer, vol. 39(6), pages 3091-3114, September.
    7. Charles, Amélie, 2010. "The day-of-the-week effects on the volatility: The role of the asymmetry," European Journal of Operational Research, Elsevier, vol. 202(1), pages 143-152, April.
    8. Kumar Arya & Sahoo Jyotirmayee & Sahoo Jyotsnarani & Nanda Subhashree & Debyani Devi, 2024. "Exploring Asymmetric GARCH Models for Predicting Indian Base Metal Price Volatility," Folia Oeconomica Stetinensia, Sciendo, vol. 24(1), pages 105-123.
    9. De Santis, Giorgio & imrohoroglu, Selahattin, 1997. "Stock returns and volatility in emerging financial markets," Journal of International Money and Finance, Elsevier, vol. 16(4), pages 561-579, August.
    10. Amare Wubishet Ayele & Emmanuel Gabreyohannes & Yohannes Yebabe Tesfay, 2017. "Macroeconomic Determinants of Volatility for the Gold Price in Ethiopia: The Application of GARCH and EWMA Volatility Models," Global Business Review, International Management Institute, vol. 18(2), pages 308-326, April.
    11. Liu, Shen & Maharaj, Elizabeth Ann, 2013. "A hypothesis test using bias-adjusted AR estimators for classifying time series in small samples," Computational Statistics & Data Analysis, Elsevier, vol. 60(C), pages 32-49.
    12. El Jebari, Ouael & Hakmaoui, Abdelati, 2018. "GARCH Family Models vs EWMA: Which is the Best Model to Forecast Volatility of the Moroccan Stock Exchange Market? || Modelos de la familia GARCH vs EWMA: ¿cuál es el mejor modelo para pronosticar la ," Revista de Métodos Cuantitativos para la Economía y la Empresa = Journal of Quantitative Methods for Economics and Business Administration, Universidad Pablo de Olavide, Department of Quantitative Methods for Economics and Business Administration, vol. 26(1), pages 237-249, Diciembre.
    13. Gabriel, Vítor, 2015. "Sensitivity, Persistence and Asymmetric Effects in International Stock Market Volatility during the Global Financial Crisis || Efectos de sensibilidad, persistencia y asimetría en la volatilidad de lo," Revista de Métodos Cuantitativos para la Economía y la Empresa = Journal of Quantitative Methods for Economics and Business Administration, Universidad Pablo de Olavide, Department of Quantitative Methods for Economics and Business Administration, vol. 19(1), pages 42-65, June.
    14. Vacca, Gianmarco & Zoia, Maria Grazia & Bagnato, Luca, 2022. "Forecasting in GARCH models with polynomially modified innovations," International Journal of Forecasting, Elsevier, vol. 38(1), pages 117-141.
    15. Roy Cerqueti & Massimiliano Giacalone & Raffaele Mattera, 2020. "Skewed non-Gaussian GARCH models for cryptocurrencies volatility modelling," Papers 2004.11674, arXiv.org.
    16. Hatemi-J, Abdulnasser, 2013. "A New Asymmetric GARCH Model: Testing, Estimation and Application," MPRA Paper 45170, University Library of Munich, Germany.
    17. Dominique Guegan, 2005. "How can we Define the Concept of Long Memory? An Econometric Survey," Econometric Reviews, Taylor & Francis Journals, vol. 24(2), pages 113-149.
    18. Ekong, Christopher N. & Onye, Kenneth U., 2017. "Application of Garch Models to Estimate and Predict Financial Volatility of Daily Stock Returns in Nigeria," MPRA Paper 88309, University Library of Munich, Germany.
    19. Samet Gunay & Audil Rashid Khaki, 2018. "Best Fitting Fat Tail Distribution for the Volatilities of Energy Futures: Gev, Gat and Stable Distributions in GARCH and APARCH Models," JRFM, MDPI, vol. 11(2), pages 1-19, June.
    20. Trino-Manuel Ñíguez, 2003. "Volatility And Var Forecasting For The Ibex-35 Stock-Return Index Using Figarch-Type Processes And Different Evaluation Criteria," Working Papers. Serie AD 2003-33, Instituto Valenciano de Investigaciones Económicas, S.A. (Ivie).

    More about this item

    Keywords

    Financial economics; Time series; Clustering; Classification; Machine learning;
