
Toward a generic representation of random variables for machine learning

Author

Listed:
  • Gautier Marti

    (X - École polytechnique, Hellebore Capital Management)

  • Philippe Very

    (Hellebore Capital Management)

  • Philippe Donnat

    (Hellebore Capital Management)

Abstract

This paper presents a pre-processing step and a distance that improve the performance of machine learning algorithms working on independent and identically distributed stochastic processes. We introduce a novel non-parametric approach to representing random variables that separates dependency from distribution without losing any information. We also propose an associated metric that leverages this representation, along with its statistical estimate. Besides experiments on synthetic datasets, the benefits of our contribution are illustrated through the example of clustering financial time series, for instance prices from the credit default swaps market. Results are available on the website www.datagrapple.com and an IPython Notebook tutorial is available at www.datagrapple.com/Tech for reproducible research.
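
The idea of separating dependency from distribution can be illustrated with a minimal sketch. It is not the authors' implementation, and every name and parameter in it is an assumption made for illustration: dependence is summarized by normalized ranks (an empirical-CDF transform), distribution by a histogram estimate of the marginal compared with a squared Hellinger distance, and the two parts are mixed with a hypothetical weight `theta`. The scaling factor 3 and the bin count are likewise illustrative choices.

```python
import numpy as np


def rank_transform(x):
    """Map samples to normalized ranks in (0, 1] (assumes no ties)."""
    x = np.asarray(x, dtype=float)
    ranks = np.argsort(np.argsort(x)) + 1  # ranks 1..n
    return ranks / len(x)


def dependence_distance_sq(x, y):
    """Scaled mean squared difference of normalized ranks.

    Depends only on the ordering of the samples, not on their values.
    The factor 3 keeps the result in [0, 1) for equal-length samples.
    """
    u, v = rank_transform(x), rank_transform(y)
    return 3.0 * np.mean((u - v) ** 2)


def distribution_distance_sq(x, y, bins=20):
    """Squared Hellinger distance between histogram estimates of the marginals.

    Depends only on the values of the samples, not on their ordering.
    """
    lo = min(np.min(x), np.min(y))
    hi = max(np.max(x), np.max(y))
    p, _ = np.histogram(x, bins=bins, range=(lo, hi))
    q, _ = np.histogram(y, bins=bins, range=(lo, hi))
    p = p / p.sum()
    q = q / q.sum()
    return 0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)


def combined_distance_sq(x, y, theta=0.5, bins=20):
    """Convex mix of the two parts; theta=1 uses dependence only."""
    dep = dependence_distance_sq(x, y)
    dist = distribution_distance_sq(x, y, bins=bins)
    return theta * dep + (1.0 - theta) * dist


# Usage: y is a monotone function of x, so the two series are perfectly
# dependent but have different marginal distributions.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = 2.0 * x + 5.0
print(dependence_distance_sq(x, y))
print(distribution_distance_sq(x, y))
print(combined_distance_sq(x, y, theta=0.5))
```

In this sketch, a pairwise matrix of such distances could be passed to any clustering algorithm that accepts precomputed distances; the choice of `theta` decides how much weight dependence receives relative to distribution.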

Suggested Citation

  • Gautier Marti & Philippe Very & Philippe Donnat, 2015. "Toward a generic representation of random variables for machine learning," Working Papers hal-01196883, HAL.
  • Handle: RePEc:hal:wpaper:hal-01196883
    Note: View the original document on HAL open archive server: https://hal.science/hal-01196883

    Download full text from publisher

    File URL: https://hal.science/hal-01196883/document
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Nicoló Musmeci & Tomaso Aste & T Di Matteo, 2015. "Relation between Financial Market Structure and the Real Economy: Comparison between Clustering Methods," PLOS ONE, Public Library of Science, vol. 10(3), pages 1-24, March.
    2. Gautier Marti & Frank Nielsen & Philippe Donnat & Sébastien Andler, 2016. "On clustering financial time series: a need for distances between dependent random variables," Papers 1603.07822, arXiv.org.
    3. Nicolò Musmeci & Tomaso Aste & Tiziana Di Matteo, 2014. "Risk diversification: a study of persistence with a filtered correlation-network approach," Papers 1410.5621, arXiv.org.
    4. Nicolo Musmeci & Tomaso Aste & Tiziana Di Matteo, 2014. "Relation between Financial Market Structure and the Real Economy: Comparison between Clustering Methods," Papers 1406.0496, arXiv.org, revised Jan 2015.
    5. Musmeci, Nicoló & Aste, Tomaso & Di Matteo, T., 2015. "Relation between financial market structure and the real economy: comparison between clustering methods," LSE Research Online Documents on Economics 61644, London School of Economics and Political Science, LSE Library.
    6. Gautier Marti & Philippe Very & Philippe Donnat & Frank Nielsen, 2015. "A proposal of a methodological framework with experimental guidelines to investigate clustering stability on financial time series," Papers 1509.05475, arXiv.org.
    7. Sensoy, Ahmet & Tabak, Benjamin M., 2014. "Dynamic spanning trees in stock market networks: The case of Asia-Pacific," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 414(C), pages 387-402.
    8. Gautier Marti & Frank Nielsen & Mikołaj Bińkowski & Philippe Donnat, 2017. "A review of two decades of correlations, hierarchies, networks and clustering in financial markets," Papers 1703.00485, arXiv.org, revised Nov 2020.
    9. Andrea Di Iura, 2022. "Comparison of empirical and shrinkage correlation algorithm for clustering methods in the futures market," SN Business & Economics, Springer, vol. 2(8), pages 1-17, August.
    10. Teh, Boon Kin & Goo, Yik Wen & Lian, Tong Wei & Ong, Wei Guang & Choi, Wen Ting & Damodaran, Mridula & Cheong, Siew Ann, 2015. "The Chinese Correction of February 2007: How financial hierarchies change in a market crash," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 424(C), pages 225-241.
    11. Kumar, Sudarshan & Bansal, Avijit & Chakrabarti, Anindya S., 2019. "Ripples on financial networks," IIMA Working Papers WP 2019-10-01, Indian Institute of Management Ahmedabad, Research and Publication Department.
    12. Sudarshan Kumar & Tiziana Di Matteo & Anindya S. Chakrabarti, 2020. "Disentangling shock diffusion on complex networks: Identification through graph planarity," Papers 2001.01518, arXiv.org.
    13. Chen, Yanhua & Li, Youwei & Pantelous, Athanasios A. & Stanley, H. Eugene, 2022. "Short-run disequilibrium adjustment and long-run equilibrium in the international stock markets: A network-based approach," International Review of Financial Analysis, Elsevier, vol. 79(C).
    14. Nicolò Musmeci & Vincenzo Nicosia & Tomaso Aste & Tiziana Di Matteo & Vito Latora, 2017. "The Multiplex Dependency Structure of Financial Markets," Complexity, Hindawi, vol. 2017, pages 1-13, September.
    15. Li, Yan & Jiang, Xiong-Fei & Tian, Yue & Li, Sai-Ping & Zheng, Bo, 2019. "Portfolio optimization based on network topology," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 515(C), pages 671-681.
    16. Peter N. Posch & Daniel Ullmann & Dominik Wied, 2019. "Detecting structural changes in large portfolios," Empirical Economics, Springer, vol. 56(4), pages 1341-1357, April.
    17. Zhao, Longfeng & Wang, Gang-Jin & Wang, Mingang & Bao, Weiqi & Li, Wei & Stanley, H. Eugene, 2018. "Stock market as temporal network," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 506(C), pages 1104-1112.
    18. Paolo Giudici & Gloria Polinesi & Alessandro Spelta, 2022. "Network models to improve robot advisory portfolios," Annals of Operations Research, Springer, vol. 313(2), pages 965-989, June.
    19. Jochen Papenbrock & Peter Schwendner, 2015. "Handling risk-on/risk-off dynamics with correlation regimes and correlation networks," Financial Markets and Portfolio Management, Springer;Swiss Society for Financial Market Research, vol. 29(2), pages 125-147, May.
    20. Musciotto, F. & Marotta, L. & Miccichè, S. & Mantegna, R.N., 2018. "Bootstrap validation of links of a minimum spanning tree," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 512(C), pages 1032-1043.
