Printed from https://ideas.repec.org/p/arx/papers/1610.08104.html

Cleaning large correlation matrices: tools from random matrix theory

Authors

  • Joel Bun
  • Jean-Philippe Bouchaud
  • Marc Potters

Abstract

This review covers recent results concerning the estimation of large covariance matrices using tools from Random Matrix Theory (RMT). We introduce several RMT methods and analytical techniques, such as the Replica formalism and Free Probability, with an emphasis on the Marchenko-Pastur equation that provides information on the resolvent of multiplicatively corrupted noisy matrices. Special care is devoted to the statistics of the eigenvectors of the empirical correlation matrix, which turn out to be crucial for many applications. We show in particular how these results can be used to build consistent "Rotationally Invariant" estimators (RIE) for large correlation matrices when there is no prior on the structure of the underlying process. The last part of this review is dedicated to some real-world applications within financial markets as a case in point. We establish empirically the efficacy of the RIE framework, which is found to be superior in this case to all previously proposed methods. The case of additively (rather than multiplicatively) corrupted noisy matrices is also dealt with in a special Appendix. Several open problems and interesting technical developments are discussed throughout the paper.
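
As a purely illustrative aside (not taken from the paper), the Marchenko-Pastur law mentioned in the abstract can be checked numerically in its simplest setting. For T independent observations of N uncorrelated, unit-variance variables, the eigenvalues of the sample covariance matrix concentrate in the interval [(1 - sqrt(q))^2, (1 + sqrt(q))^2] with q = N/T when N and T are both large. The sketch below assumes Gaussian noise and NumPy; the function names `mp_edges` and `sample_eigenvalues` are hypothetical and chosen only for this example.

```python
import numpy as np


def mp_edges(q):
    """Edges of the Marchenko-Pastur bulk for unit-variance noise, q = N/T."""
    root = np.sqrt(q)
    return (1.0 - root) ** 2, (1.0 + root) ** 2


def sample_eigenvalues(n_vars, n_obs, seed=0):
    """Eigenvalues of the sample covariance matrix of pure Gaussian noise."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n_obs, n_vars))  # rows are observations
    cov = x.T @ x / n_obs                     # true covariance is the identity
    return np.linalg.eigvalsh(cov)            # eigenvalues in ascending order


if __name__ == "__main__":
    n_vars, n_obs = 200, 800                  # assumed sizes, q = 0.25
    lo, hi = mp_edges(n_vars / n_obs)
    eig = sample_eigenvalues(n_vars, n_obs)
    print(f"Marchenko-Pastur bulk: [{lo:.3f}, {hi:.3f}]")
    print(f"empirical range:       [{eig.min():.3f}, {eig.max():.3f}]")
```

Although the true covariance here is the identity (all eigenvalues equal to 1), the empirical eigenvalues spread over a wide interval. This is the kind of noise-induced distortion that cleaning schemes such as those reviewed in the paper aim to correct.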

Suggested Citation

  • Joel Bun & Jean-Philippe Bouchaud & Marc Potters, 2016. "Cleaning large correlation matrices: tools from random matrix theory," Papers 1610.08104, arXiv.org.
  • Handle: RePEc:arx:papers:1610.08104

    Download full text from publisher

    File URL: http://arxiv.org/pdf/1610.08104
    File Function: Latest version
    Download Restriction: no

    Citations

    Cited by:

    1. Soufiane Hayou, 2017. "On the overestimation of the largest eigenvalue of a covariance matrix," Papers 1708.03551, arXiv.org.
    2. Longfeng Zhao & Wei Li & Andrea Fenu & Boris Podobnik & Yougui Wang & H. Eugene Stanley, 2017. "The q-dependent detrended cross-correlation analysis of stock market," Papers 1705.01406, arXiv.org, revised Jun 2017.
