Printed from https://ideas.repec.org/a/spr/indpam/v54y2023i3d10.1007_s13226-022-00313-x.html

What is the gradient of a scalar function of a symmetric matrix?

Author

Listed:
  • Shriram Srinivasan

    (Los Alamos National Laboratory)

  • Nishant Panda

    (Los Alamos National Laboratory)

Abstract

For a real-valued function $\phi$ of a matrix argument, the gradient $\nabla \phi$ is calculated using a standard approach that follows from the definition of a Fréchet derivative for matrix functionals. In cases where the matrix argument is restricted to the space of symmetric matrices, the approach is easily modified to determine that the gradient ought to be $(\nabla \phi + \nabla \phi^T)/2$. However, perusal of research articles in the statistics and electrical engineering communities that deal with the topic of matrix calculus reveals a different approach that leads to a spurious result. In this approach, the gradient of $\phi$ is evaluated by explicitly taking into account the symmetry of the matrix, and this “symmetric gradient” $\nabla \phi_{sym}$ is reported to be related to the gradient $\nabla \phi$, which is computed by ignoring symmetry, as $\nabla \phi_{sym} = \nabla \phi + \nabla \phi^T - \nabla \phi \circ I$, where $\circ$ denotes the elementwise Hadamard product of the two matrices and $I$ the identity matrix of the same size as $\nabla \phi$. The idea of the “symmetric gradient” has now appeared in several publications, as well as in textbooks and handbooks on matrix calculus which are often cited in this context. One of our important contributions has been to wade through the vague and confusing proofs of the result based on matrix calculus and cast the calculation of the “symmetric gradient” in a rigorous and concrete mathematical setting. After setting up the problem in a finite-dimensional inner-product space, we demonstrate rigorously that $\nabla \phi_{sym} = (\nabla \phi + \nabla \phi^T)/2$ is the correct relationship.
Moreover, our derivation exposes that it is an incorrect lifting from the Euclidean space to the space of symmetric matrices, inconsistent with the underlying inner product, that leads to the spurious result. We also discuss the implications of using the spurious gradient in different classes of problems, such as those where the gradient itself may be the quantity sought, or where it is used as part of an optimization algorithm such as gradient descent. We show that the spurious gradient has a relative error of 100% in the off-diagonal components, which makes it an egregious error if the gradient is a quantity of interest, but fortuitously, it proves to be an ascent direction, so that its use in gradient descent may not lead to major issues.
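
The relationships stated in the abstract can be checked numerically on a small example. The sketch below is not taken from the paper: it assumes the test function $\phi(X) = a^T X b$ (whose unconstrained gradient is $a b^T$), the Frobenius inner product $\langle A, B \rangle = \mathrm{tr}(A^T B)$ as the underlying inner product, and NumPy; all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
a = rng.standard_normal(n)
b = rng.standard_normal(n)

def phi(X):
    # Assumed test function (not from the paper): phi(X) = a^T X b
    return a @ X @ b

# Gradient computed while ignoring symmetry: d(phi)/dX_ij = a_i * b_j
G = np.outer(a, b)

# Gradient on the space of symmetric matrices, as stated in the abstract
G_sym = 0.5 * (G + G.T)

# The "symmetric gradient" formula that the abstract calls spurious
G_spurious = G + G.T - G * np.eye(n)

# A symmetric base point X and a symmetric perturbation H
X = rng.standard_normal((n, n))
X = 0.5 * (X + X.T)
H = rng.standard_normal((n, n))
H = 0.5 * (H + H.T)

# phi is linear, so the directional derivative along H is phi(X + H) - phi(X)
dphi = phi(X + H) - phi(X)

# Predictions <gradient, H> under the (assumed) Frobenius inner product
pred_sym = np.sum(G_sym * H)
pred_spurious = np.sum(G_spurious * H)

# Relative error of the spurious formula in the off-diagonal components
off = ~np.eye(n, dtype=bool)
rel_err_off = np.abs(G_spurious[off] - G_sym[off]) / np.abs(G_sym[off])

# A positive inner product with G_sym means the spurious formula still
# points in an ascent direction
inner = np.sum(G_spurious * G_sym)

print("directional derivative:", dphi)
print("predicted by (G + G^T)/2:", pred_sym)
print("predicted by spurious formula:", pred_spurious)
print("off-diagonal relative errors:", rel_err_off)
print("<G_spurious, G_sym>:", inner)
```

In this example the two formulas agree on the diagonal, while every off-diagonal entry of the spurious formula is twice the corresponding entry of $(\nabla \phi + \nabla \phi^T)/2$, which is the 100% relative error mentioned in the abstract. The inner product of the two is a sum of squares with positive weights, so it is positive whenever the gradient is nonzero.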

Suggested Citation

  • Shriram Srinivasan & Nishant Panda, 2023. "What is the gradient of a scalar function of a symmetric matrix?," Indian Journal of Pure and Applied Mathematics, Springer, vol. 54(3), pages 907-919, September.
  • Handle: RePEc:spr:indpam:v:54:y:2023:i:3:d:10.1007_s13226-022-00313-x
    DOI: 10.1007/s13226-022-00313-x

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s13226-022-00313-x
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

