
Cleaning large correlation matrices: tools from random matrix theory

Authors

  • Joel Bun
  • Jean-Philippe Bouchaud
  • Marc Potters

Abstract

This review covers recent results concerning the estimation of large covariance matrices using tools from Random Matrix Theory (RMT). We introduce several RMT methods and analytical techniques, such as the Replica formalism and Free Probability, with an emphasis on the Marchenko-Pastur equation that provides information on the resolvent of multiplicatively corrupted noisy matrices. Special care is devoted to the statistics of the eigenvectors of the empirical correlation matrix, which turn out to be crucial for many applications. We show in particular how these results can be used to build consistent "Rotationally Invariant" estimators (RIE) for large correlation matrices when there is no prior on the structure of the underlying process. The last part of this review is dedicated to some real-world applications within financial markets as a case in point. We establish empirically the efficacy of the RIE framework, which is found to be superior in this case to all previously proposed methods. The case of additively (rather than multiplicatively) corrupted noisy matrices is also dealt with in a special Appendix. Several open problems and interesting technical developments are discussed throughout the paper.
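As a rough illustration of the pure-noise benchmark that underlies the Marchenko-Pastur results mentioned in the abstract, the sketch below computes the edges of the Marchenko-Pastur spectrum for the special case where the true correlation matrix is the identity, and compares them with the eigenvalues of an empirical correlation matrix built from simulated i.i.d. Gaussian data. It is not taken from the paper: the function names, the matrix dimensions and the Gaussian data are assumptions made for illustration only, and the sketch does not implement the RIE estimator itself.

```python
import numpy as np


def marchenko_pastur_edges(n_variables, n_samples):
    """Edges of the Marchenko-Pastur spectrum for an identity true correlation.

    With q = N / T < 1, the limiting eigenvalue density of the empirical
    correlation matrix is supported on [(1 - sqrt(q))^2, (1 + sqrt(q))^2].
    """
    q = n_variables / n_samples
    lower = (1.0 - np.sqrt(q)) ** 2
    upper = (1.0 + np.sqrt(q)) ** 2
    return lower, upper


def sample_correlation_eigenvalues(n_variables, n_samples, seed=0):
    """Eigenvalues of the empirical correlation matrix of pure-noise data.

    The data are i.i.d. standard Gaussian (an assumption of this sketch),
    arranged as n_samples rows by n_variables columns.
    """
    rng = np.random.default_rng(seed)
    data = rng.standard_normal((n_samples, n_variables))
    corr = np.corrcoef(data, rowvar=False)  # columns are the variables
    return np.linalg.eigvalsh(corr)  # sorted in ascending order


if __name__ == "__main__":
    n_variables, n_samples = 100, 400  # hypothetical sizes, q = 0.25
    lower, upper = marchenko_pastur_edges(n_variables, n_samples)
    eigenvalues = sample_correlation_eigenvalues(n_variables, n_samples)
    print("Marchenko-Pastur edges:", lower, upper)
    print("smallest / largest sample eigenvalue:", eigenvalues[0], eigenvalues[-1])
```

Although the true correlations are all zero here, the sample eigenvalues spread over a wide band around 1. At finite size the extreme eigenvalues only approximately respect the edges, so the comparison should be read as indicative rather than exact.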

Suggested Citation

  • Joel Bun & Jean-Philippe Bouchaud & Marc Potters, 2016. "Cleaning large correlation matrices: tools from random matrix theory," Papers 1610.08104, arXiv.org.
  • Handle: RePEc:arx:papers:1610.08104

    Download full text from publisher

    File URL: http://arxiv.org/pdf/1610.08104
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Rémy Chicheportiche & J-P Bouchaud, 2015. "A nested factor model for non-linear dependencies in stock returns," Post-Print hal-01339978, HAL.
    2. Ledoit, Olivier & Wolf, Michael, 2004. "A well-conditioned estimator for large-dimensional covariance matrices," Journal of Multivariate Analysis, Elsevier, vol. 88(2), pages 365-411, February.
    3. Hansen, Lars Peter, 1982. "Large Sample Properties of Generalized Method of Moments Estimators," Econometrica, Econometric Society, vol. 50(4), pages 1029-1054, July.
    4. Ledoit, Olivier & Wolf, Michael, 2017. "Numerical implementation of the QuEST function," Computational Statistics & Data Analysis, Elsevier, vol. 115(C), pages 199-223.
    5. R. Chicheportiche & J.-P. Bouchaud, 2015. "A nested factor model for non-linear dependencies in stock returns," Quantitative Finance, Taylor & Francis Journals, vol. 15(11), pages 1789-1804, November.
    6. Ivailo I. Dimov & Petter N. Kolm & Lee Maclin & Dan Y. C. Shiber, 2012. "Hidden noise structure and random matrix models of stock correlations," Quantitative Finance, Taylor & Francis Journals, vol. 12(4), pages 567-572, November.
    7. Stefano Ciliberti & Imre Kondor & Marc Mezard, 2007. "On the feasibility of portfolio optimization under expected shortfall," Quantitative Finance, Taylor & Francis Journals, vol. 7(4), pages 389-396.
    8. Pafka, Szilárd & Kondor, Imre, 2003. "Noisy covariance matrices and portfolio optimization II," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 319(C), pages 487-494.
    9. J.-P. Bouchaud & L. Laloux & M. A. Miceli & M. Potters, 2007. "Large dimension forecasting models and random singular value spectra," The European Physical Journal B: Condensed Matter and Complex Systems, Springer;EDP Sciences, vol. 55(2), pages 201-207, January.
    10. Laurent Laloux & Pierre Cizeau & Marc Potters & Jean-Philippe Bouchaud, 2000. "Random Matrix Theory And Financial Correlations," International Journal of Theoretical and Applied Finance (IJTAF), World Scientific Publishing Co. Pte. Ltd., vol. 3(03), pages 391-397.
    11. Silverstein, Jack W., 1989. "On the eigenvectors of large dimensional sample covariance matrices," Journal of Multivariate Analysis, Elsevier, vol. 30(1), pages 1-16, July.
    12. Harry Markowitz, 1952. "Portfolio Selection," Journal of Finance, American Finance Association, vol. 7(1), pages 77-91, March.
    13. Chamberlain, Gary & Rothschild, Michael, 1983. "Arbitrage, Factor Structure, and Mean-Variance Analysis on Large Asset Markets," Econometrica, Econometric Society, vol. 51(5), pages 1281-1304, September.
    14. Alexei Onatski, 2010. "Determining the Number of Factors from Empirical Distribution of Eigenvalues," The Review of Economics and Statistics, MIT Press, vol. 92(4), pages 1004-1016, November.
    15. Tumminello, Michele & Lillo, Fabrizio & Mantegna, Rosario N., 2010. "Correlation, hierarchies, and networks in financial markets," Journal of Economic Behavior & Organization, Elsevier, vol. 75(1), pages 40-58, July.
    16. Ledoit, Olivier & Wolf, Michael, 2003. "Improved estimation of the covariance matrix of stock returns with an application to portfolio selection," Journal of Empirical Finance, Elsevier, vol. 10(5), pages 603-621, December.
    17. Benaych-Georges, Florent & Nadakuditi, Raj Rao, 2012. "The singular values and vectors of low rank perturbations of large rectangular random matrices," Journal of Multivariate Analysis, Elsevier, vol. 111(C), pages 120-135.
    18. Romain Allez & Jean-Philippe Bouchaud, 2012. "Eigenvector dynamics: general theory and some applications," Papers 1203.6228, arXiv.org, revised Jul 2012.
    19. Ester Pantaleo & Michele Tumminello & Fabrizio Lillo & Rosario Mantegna, 2011. "When do improved covariance matrix estimators enhance portfolio optimization? An empirical comparative study of nine estimators," Quantitative Finance, Taylor & Francis Journals, vol. 11(7), pages 1067-1080.
    20. George Kapetanios, 2004. "A New Method for Determining the Number of Factors in Factor Models with Large Datasets," Working Papers 525, Queen Mary University of London, School of Economics and Finance.
    21. Haff, L. R., 1979. "An identity for the Wishart distribution with applications," Journal of Multivariate Analysis, Elsevier, vol. 9(4), pages 531-544, December.
    22. Burda, Z. & Görlich, A. & Jarosz, A. & Jurkiewicz, J., 2004. "Signal and noise in correlation matrix," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 343(C), pages 295-310.
    23. Couillet, Romain & Pascal, Frédéric & Silverstein, Jack W., 2015. "The random matrix regime of Maronna’s M-estimator with elliptically distributed samples," Journal of Multivariate Analysis, Elsevier, vol. 139(C), pages 56-78.
    24. Ledoit, Olivier & Wolf, Michael, 2015. "Spectrum estimation: A unified framework for covariance matrix estimation and PCA in large dimensions," Journal of Multivariate Analysis, Elsevier, vol. 139(C), pages 360-384.
    25. Matteo Marsili, 2002. "Dissecting financial markets: Sectors and states," Papers cond-mat/0207156, arXiv.org.
    26. Yin, Y. Q., 1986. "Limiting spectral distribution for a class of random matrices," Journal of Multivariate Analysis, Elsevier, vol. 20(1), pages 50-68, October.
    27. Matteo Marsili, 2002. "Dissecting financial markets: sectors and states," Quantitative Finance, Taylor & Francis Journals, vol. 2(4), pages 297-302.
    28. Thilo A. Schmitt & Desislava Chetalova & Rudi Schafer & Thomas Guhr, 2013. "Non-Stationarity in Financial Time Series and Generic Features," Papers 1304.5130, arXiv.org, revised May 2013.
    29. Haff, L. R., 1977. "Minimax estimators for a multinormal precision matrix," Journal of Multivariate Analysis, Elsevier, vol. 7(3), pages 374-385, September.
    30. Merton, Robert C, 1973. "An Intertemporal Capital Asset Pricing Model," Econometrica, Econometric Society, vol. 41(5), pages 867-887, September.
    31. Couillet, Romain & Kammoun, Abla & Pascal, Frédéric, 2016. "Second order statistics of robust estimators of scatter. Application to GLRT detection for elliptical signals," Journal of Multivariate Analysis, Elsevier, vol. 143(C), pages 249-274.
    32. Silverstein, J. W., 1995. "Strong Convergence of the Empirical Distribution of Eigenvalues of Large Dimensional Random Matrices," Journal of Multivariate Analysis, Elsevier, vol. 55(2), pages 331-339, November.
    33. Silverstein, J. W. & Choi, S. I., 1995. "Analysis of the Limiting Spectral Distribution of Large Dimensional Random Matrices," Journal of Multivariate Analysis, Elsevier, vol. 54(2), pages 295-309, August.
    34. Fama, Eugene F. & French, Kenneth R., 1993. "Common risk factors in the returns on stocks and bonds," Journal of Financial Economics, Elsevier, vol. 33(1), pages 3-56, February.
    35. Bouchaud,Jean-Philippe & Potters,Marc, 2003. "Theory of Financial Risk and Derivative Pricing," Cambridge Books, Cambridge University Press, number 9780521819169.
    36. G. Pan & J. Gao & Y. Yang & M. Guo, 2012. "Independence Test for High Dimensional Random Vectors," Monash Econometrics and Business Statistics Working Papers 1/12, Monash University, Department of Econometrics and Business Statistics.
    37. Michele Tumminello & Fabrizio Lillo & Rosario Nunzio Mantegna, 2007. "Kullback-Leibler distance as a measure of the information filtered from multivariate data," Papers 0706.0168, arXiv.org.
    38. Paul, Debashis & Silverstein, Jack W., 2009. "No eigenvalues outside the support of the limiting empirical spectral distribution of a separable covariance matrix," Journal of Multivariate Analysis, Elsevier, vol. 100(1), pages 37-57, January.
    39. Silverstein, J. W. & Bai, Z. D., 1995. "On the Empirical Distribution of Eigenvalues of a Class of Large Dimensional Random Matrices," Journal of Multivariate Analysis, Elsevier, vol. 54(2), pages 175-192, August.
    40. Silverstein, Jack W., 1989. "On the weak limit of the largest eigenvalue of a large dimensional sample covariance matrix," Journal of Multivariate Analysis, Elsevier, vol. 30(2), pages 307-311, August.

    Citations


    Cited by:

    1. Goldberg, Lisa R & Papanicolaou, Alex & Shkolnik, Alex, 2022. "The Dispersion Bias," Department of Economics, Working Paper Series qt4kt5g2x3, Department of Economics, Institute for Business and Economic Research, UC Berkeley.
    2. Longfeng Zhao & Wei Li & Andrea Fenu & Boris Podobnik & Yougui Wang & H. Eugene Stanley, 2017. "The q-dependent detrended cross-correlation analysis of stock market," Papers 1705.01406, arXiv.org, revised Jun 2017.
    3. Sebastien Valeyre, 2022. "Optimal trend following portfolios," Papers 2201.06635, arXiv.org.
    4. Soufiane Hayou, 2017. "On the overestimation of the largest eigenvalue of a covariance matrix," Papers 1708.03551, arXiv.org.
    5. Jean-Philippe Bouchaud, 2021. "Radical Complexity," Papers 2103.09692, arXiv.org.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Robert F. Engle & Olivier Ledoit & Michael Wolf, 2019. "Large Dynamic Covariance Matrices," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 37(2), pages 363-375, April.
    2. Ledoit, Olivier & Wolf, Michael, 2021. "Shrinkage estimation of large covariance matrices: Keep it simple, statistician?," Journal of Multivariate Analysis, Elsevier, vol. 186(C).
    3. Bodnar, Taras & Parolya, Nestor & Schmid, Wolfgang, 2018. "Estimation of the global minimum variance portfolio in high dimensions," European Journal of Operational Research, Elsevier, vol. 266(1), pages 371-390.
    4. Jushan Bai & Shuzhong Shi, 2011. "Estimating High Dimensional Covariance Matrices and its Applications," Annals of Economics and Finance, Society for AEF, vol. 12(2), pages 199-215, November.
    5. Ledoit, Olivier & Wolf, Michael, 2017. "Numerical implementation of the QuEST function," Computational Statistics & Data Analysis, Elsevier, vol. 115(C), pages 199-223.
    6. Bodnar, Olha & Bodnar, Taras & Parolya, Nestor, 2022. "Recent advances in shrinkage-based high-dimensional inference," Journal of Multivariate Analysis, Elsevier, vol. 188(C).
    7. Olivier Ledoit & Sandrine Péché, 2009. "Eigenvectors of some large sample covariance matrices ensembles," IEW - Working Papers 407, Institute for Empirical Research in Economics - University of Zurich.
    8. Olivier Ledoit & Michael Wolf, 2017. "Analytical nonlinear shrinkage of large-dimensional covariance matrices," ECON - Working Papers 264, Department of Economics - University of Zurich, revised Nov 2018.
    9. Aït-Sahalia, Yacine & Xiu, Dacheng, 2017. "Using principal component analysis to estimate a high dimensional factor model with high-frequency data," Journal of Econometrics, Elsevier, vol. 201(2), pages 384-399.
    10. Zura Kakushadze & Willie Yu, 2016. "Statistical Risk Models," Papers 1602.08070, arXiv.org, revised Jan 2017.
    11. Varga-Haszonits, Istvan & Caccioli, Fabio & Kondor, Imre, 2016. "Replica approach to mean-variance portfolio optimization," LSE Research Online Documents on Economics 68955, London School of Economics and Political Science, LSE Library.
    12. Plachel, Lukas, 2019. "A unified model for regularized and robust portfolio optimization," Journal of Economic Dynamics and Control, Elsevier, vol. 109(C).
    13. Olivier Ledoit & Michael Wolf, 2019. "Quadratic shrinkage for large covariance matrices," ECON - Working Papers 335, Department of Economics - University of Zurich, revised Dec 2020.
    14. Taras Bodnar & Arjun K. Gupta & Nestor Parolya, 2013. "Optimal Linear Shrinkage Estimator for Large Dimensional Precision Matrix," Papers 1308.0931, arXiv.org, revised Mar 2014.
    15. Ikeda, Yuki & Kubokawa, Tatsuya, 2016. "Linear shrinkage estimation of large covariance matrices using factor models," Journal of Multivariate Analysis, Elsevier, vol. 152(C), pages 61-81.
    16. Olivier Ledoit & Michael Wolf, 2019. "Shrinkage estimation of large covariance matrices: keep it simple, statistician?," ECON - Working Papers 327, Department of Economics - University of Zurich, revised Jun 2021.
    17. Couillet, Romain, 2015. "Robust spiked random matrices and a robust G-MUSIC estimator," Journal of Multivariate Analysis, Elsevier, vol. 140(C), pages 139-161.
    18. Istvan Varga-Haszonits & Fabio Caccioli & Imre Kondor, 2016. "Replica approach to mean-variance portfolio optimization," Papers 1606.08679, arXiv.org.
    19. Zura Kakushadze & Willie Yu, 2016. "Multifactor Risk Models and Heterotic CAPM," Papers 1602.04902, arXiv.org, revised Mar 2016.
    20. Couillet, Romain & Kammoun, Abla & Pascal, Frédéric, 2016. "Second order statistics of robust estimators of scatter. Application to GLRT detection for elliptical signals," Journal of Multivariate Analysis, Elsevier, vol. 143(C), pages 249-274.
