Quantifying the information lost in optimal covariance matrix cleaning

Author

Listed:
  • Bongiorno, Christian
  • Lamrani, Lamia

Abstract

Obtaining an accurate estimate of the underlying covariance matrix from finite-sample data is challenging because of sampling noise. In recent years, sophisticated covariance-cleaning techniques based on random matrix theory have been proposed to address this issue. Most of these methods seek an optimal covariance matrix estimator by minimizing the Frobenius norm distance between the true covariance matrix and the estimator. However, this practice offers limited interpretability in terms of information theory. To better understand this relationship, we focus on the Kullback–Leibler divergence to quantify the information lost by the estimator. Our analysis centers on rotationally invariant estimators, which are state of the art in random matrix theory, and we derive an analytical expression for their Kullback–Leibler divergence. Because the calculations are intricate, we use genetic programming regressors paired with human intuition. Using this approach, we formulate a conjecture, validated through extensive simulations, that the Frobenius distance corresponds to a first-order expansion term of the Kullback–Leibler divergence, thus establishing a more precise link between the two measures.
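
The two discrepancy measures named in the abstract can be compared numerically. The sketch below is an illustration only and is not taken from the paper: it assumes zero-mean Gaussian data, uses the standard closed-form Kullback–Leibler divergence between two Gaussians, and builds the oracle rotationally invariant estimator (sample eigenvectors, eigenvalues replaced by v_i' Σ v_i, which minimizes the Frobenius distance for fixed eigenvectors). The function names, the dimensions n and t, the synthetic spectrum, and the direction of the divergence (true distribution first) are all assumptions made for the example.

```python
import numpy as np


def gaussian_kl(sigma_p, sigma_q):
    """KL( N(0, sigma_p) || N(0, sigma_q) ) in nats (standard closed form)."""
    n = sigma_p.shape[0]
    trace_term = np.trace(np.linalg.solve(sigma_q, sigma_p))
    logdet_p = np.linalg.slogdet(sigma_p)[1]
    logdet_q = np.linalg.slogdet(sigma_q)[1]
    return 0.5 * (trace_term - n + logdet_q - logdet_p)


def frobenius_sq(a, b):
    """Squared Frobenius distance between two matrices."""
    return float(np.sum((a - b) ** 2))


def oracle_rie(sample_cov, true_cov):
    """Keep the sample eigenvectors; replace eigenvalue i by v_i' Sigma v_i."""
    _, vecs = np.linalg.eigh(sample_cov)
    oracle_vals = np.einsum("ji,jk,ki->i", vecs, true_cov, vecs)
    return (vecs * oracle_vals) @ vecs.T


def demo(n=50, t=200, seed=0):
    """Compare sample covariance and oracle RIE on synthetic data (assumed setup)."""
    rng = np.random.default_rng(seed)
    # Hypothetical true covariance: random orthogonal basis, uniform spectrum.
    true_vals = rng.uniform(0.5, 2.0, size=n)
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    true_cov = (q * true_vals) @ q.T
    # Draw t zero-mean Gaussian observations with covariance true_cov.
    x = (rng.standard_normal((t, n)) * np.sqrt(true_vals)) @ q.T
    sample_cov = x.T @ x / t
    cleaned = oracle_rie(sample_cov, true_cov)
    return {
        "kl_sample": gaussian_kl(true_cov, sample_cov),
        "kl_oracle": gaussian_kl(true_cov, cleaned),
        "frob_sample": frobenius_sq(true_cov, sample_cov),
        "frob_oracle": frobenius_sq(true_cov, cleaned),
    }


if __name__ == "__main__":
    for name, value in demo().items():
        print(f"{name}: {value:.4f}")
```

The oracle eigenvalues require the true covariance, so this estimator is a benchmark rather than something computable from data alone; the sample covariance must also be invertible (t > n here) for its divergence to be finite.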

Suggested Citation

  • Bongiorno, Christian & Lamrani, Lamia, 2025. "Quantifying the information lost in optimal covariance matrix cleaning," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 657(C).
  • Handle: RePEc:eee:phsmap:v:657:y:2025:i:c:s0378437124007349
    DOI: 10.1016/j.physa.2024.130225

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0378437124007349
    Download Restriction: Full text for ScienceDirect subscribers only. The journal offers the option of making the article available online on ScienceDirect for a fee of $3,000.


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Mörstedt, Torsten & Lutz, Bernhard & Neumann, Dirk, 2024. "Cross validation based transfer learning for cross-sectional non-linear shrinkage: A data-driven approach in portfolio optimization," European Journal of Operational Research, Elsevier, vol. 318(2), pages 670-685.
    2. Jin Yuan & Xianghui Yuan, 2023. "A Best Linear Empirical Bayes Method for High-Dimensional Covariance Matrix Estimation," SAGE Open, vol. 13(2), pages 21582440231, June.
    3. Anatolyev, Stanislav & Pyrlik, Vladimir, 2022. "Copula shrinkage and portfolio allocation in ultra-high dimensions," Journal of Economic Dynamics and Control, Elsevier, vol. 143(C).
    4. De Nard, Gianluca & Zhao, Zhao, 2023. "Using, taming or avoiding the factor zoo? A double-shrinkage estimator for covariance matrices," Journal of Empirical Finance, Elsevier, vol. 72(C), pages 23-35.
    5. Jianqing Fan & Donggyu Kim & Minseok Shin & Yazhen Wang, 2024. "Factor and Idiosyncratic VAR-Ito Volatility Models for Heavy-Tailed High-Frequency Financial Data," Working Papers 202415, University of California at Riverside, Department of Economics.
    6. Lamia Lamrani & Christian Bongiorno & Marc Potters, 2025. "Optimal Data Splitting for Holdout Cross-Validation in Large Covariance Matrix Estimation," Papers 2503.15186, arXiv.org.
    7. Esra Ulasan & A. Özlem Önder, 2023. "Large portfolio optimisation approaches," Journal of Asset Management, Palgrave Macmillan, vol. 24(6), pages 485-497, October.
    8. Bongiorno, Christian & Challet, Damien, 2023. "Non-linear shrinkage of the price return covariance matrix is far from optimal for portfolio optimization," Finance Research Letters, Elsevier, vol. 52(C).
    9. Ikeda, Yuki & Kubokawa, Tatsuya, 2016. "Linear shrinkage estimation of large covariance matrices using factor models," Journal of Multivariate Analysis, Elsevier, vol. 152(C), pages 61-81.
    10. Wu, Yunlin & Huang, Lei & Jiang, Hui, 2023. "Optimization of large portfolio allocation for new-energy stocks: Evidence from China," Energy, Elsevier, vol. 285(C).
    11. Yan Zhang & Jiyuan Tao & Zhixiang Yin & Guoqiang Wang, 2022. "Improved Large Covariance Matrix Estimation Based on Efficient Convex Combination and Its Application in Portfolio Optimization," Mathematics, MDPI, vol. 10(22), pages 1-15, November.
    12. Rafael Alves & Diego S. de Brito & Marcelo C. Medeiros & Ruy M. Ribeiro, 2023. "Forecasting Large Realized Covariance Matrices: The Benefits of Factor Models and Shrinkage," Papers 2303.16151, arXiv.org.
    13. De Nard, Gianluca & Zhao, Zhao, 2022. "A large-dimensional test for cross-sectional anomalies: Efficient sorting revisited," International Review of Economics & Finance, Elsevier, vol. 80(C), pages 654-676.
    14. Fan, Qingliang & Wu, Ruike & Yang, Yanrong & Zhong, Wei, 2024. "Time-varying minimum variance portfolio," Journal of Econometrics, Elsevier, vol. 239(2).
    15. Joel Bun & Jean-Philippe Bouchaud & Marc Potters, 2016. "Cleaning large correlation matrices: tools from random matrix theory," Papers 1610.08104, arXiv.org.
    16. Fan, Jianqing & Liao, Yuan & Shi, Xiaofeng, 2015. "Risks of large portfolios," Journal of Econometrics, Elsevier, vol. 186(2), pages 367-387.
    17. Olivier Ledoit & Michael Wolf, 2022. "Markowitz portfolios under transaction costs," ECON - Working Papers 420, Department of Economics - University of Zurich, revised Sep 2024.
    18. Tae-Hwy Lee & Ekaterina Seregina, 2024. "Optimal Portfolio Using Factor Graphical Lasso," Journal of Financial Econometrics, Oxford University Press, vol. 22(3), pages 670-695.
    19. Gianluca De Nard & Robert F. Engle & Bryan Kelly, 2024. "Factor-Mimicking Portfolios for Climate Risk," Financial Analysts Journal, Taylor & Francis Journals, vol. 80(3), pages 37-58, July.
    20. Yilie Huang & Yanwei Jia & Xun Yu Zhou, 2024. "Mean-Variance Portfolio Selection by Continuous-Time Reinforcement Learning: Algorithms, Regret Analysis, and Empirical Study," Papers 2412.16175, arXiv.org.