
What is the gradient of a scalar function of a symmetric matrix?

Author

Listed:
  • Shriram Srinivasan

    (Los Alamos National Laboratory)

  • Nishant Panda

    (Los Alamos National Laboratory)

Abstract

For a real-valued function $\phi$ of a matrix argument, the gradient $\nabla \phi$ is calculated using a standard approach that follows from the definition of a Fréchet derivative for matrix functionals. In cases where the matrix argument is restricted to the space of symmetric matrices, the approach is easily modified to determine that the gradient ought to be $(\nabla \phi + \nabla \phi^T)/2$. However, perusal of research articles in the statistics and electrical engineering communities that deal with the topic of matrix calculus reveals a different approach that leads to a spurious result. In this approach, the gradient of $\phi$ is evaluated by explicitly taking into account the symmetry of the matrix, and this “symmetric gradient” $\nabla \phi_{sym}$ is reported to be related to the gradient $\nabla \phi$, which is computed by ignoring symmetry, as $\nabla \phi_{sym} = \nabla \phi + \nabla \phi^T - \nabla \phi \circ I$, where $\circ$ denotes the elementwise Hadamard product of the two matrices and $I$ the identity matrix of the same size as $\nabla \phi$. The idea of the “symmetric gradient” has now appeared in several publications, as well as in textbooks and handbooks on matrix calculus which are often cited in this context. One of our important contributions has been to wade through the vague and confusing proofs of the result based on matrix calculus and cast the calculation of the “symmetric gradient” in a rigorous and concrete mathematical setting. After setting up the problem in a finite-dimensional inner-product space, we demonstrate rigorously that $\nabla \phi_{sym} = (\nabla \phi + \nabla \phi^T)/2$ is the correct relationship.
Moreover, our derivation exposes that it is an incorrect lifting from the Euclidean space to the space of symmetric matrices, inconsistent with the underlying inner product, that leads to the spurious result. We also discuss the implications of using the spurious gradient in different classes of problems, such as those where the gradient itself may be the quantity sought, or where it is used as part of an optimization algorithm such as gradient descent. We show that the spurious gradient has a relative error of 100% in the off-diagonal components, which makes it an egregious error if the gradient is a quantity of interest, but fortuitously, it proves to be an ascent direction, so that its use in gradient descent may not lead to major issues.
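
The relationships stated in the abstract can be checked numerically. The following is a minimal sketch and is not taken from the paper: the test function phi(A) = sum_ij C_ij A_ij^2, the coefficient matrix C, and all variable names are assumptions chosen purely for illustration. It compares the two candidate gradients against a central finite difference of phi along a symmetric direction, using the Frobenius inner product.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# Hypothetical fixed, non-symmetric coefficient matrix (an assumption for this sketch).
C = rng.standard_normal((n, n))


def phi(A):
    """Illustrative scalar function phi(A) = sum_ij C_ij * A_ij**2."""
    return np.sum(C * A * A)


def grad_ignoring_symmetry(A):
    """Gradient of phi with respect to A when the entries are treated as independent."""
    return 2.0 * C * A


def sym(M):
    """Symmetric part (M + M^T) / 2."""
    return 0.5 * (M + M.T)


def frob(X, Y):
    """Frobenius inner product <X, Y> = sum_ij X_ij * Y_ij."""
    return np.sum(X * Y)


A = sym(rng.standard_normal((n, n)))  # symmetric evaluation point
H = sym(rng.standard_normal((n, n)))  # symmetric perturbation direction

G = grad_ignoring_symmetry(A)
G_sym = sym(G)                        # (G + G^T) / 2
G_spurious = G + G.T - G * np.eye(n)  # G + G^T - G o I (Hadamard product with I)

# Directional derivative of phi along H, estimated by a central difference.
eps = 1e-6
dd = (phi(A + eps * H) - phi(A - eps * H)) / (2.0 * eps)

# Mismatch between each candidate's prediction <gradient, H> and the estimate.
err_sym = abs(frob(G_sym, H) - dd)
err_spurious = abs(frob(G_spurious, H) - dd)

# Relative error of the spurious gradient in the off-diagonal entries.
off_diag = ~np.eye(n, dtype=bool)
rel_err_off = np.abs(G_spurious - G_sym)[off_diag] / np.abs(G_sym)[off_diag]

# A positive inner product with the correct gradient indicates an ascent direction.
ascent = frob(G_spurious, G_sym)

print("mismatch, (G + G^T)/2     :", err_sym)
print("mismatch, G + G^T - G o I :", err_spurious)
print("off-diagonal relative errors:", rel_err_off)
print("<G_spurious, G_sym> =", ascent)
```

Under these assumptions, `err_sym` sits at the level of rounding error while `err_spurious` does not. Every off-diagonal entry of the spurious gradient is exactly twice the correct entry, which is the 100% relative error mentioned in the abstract. The positive inner product is consistent with the remark that the spurious gradient is still an ascent direction.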

Suggested Citation

  • Shriram Srinivasan & Nishant Panda, 2023. "What is the gradient of a scalar function of a symmetric matrix?," Indian Journal of Pure and Applied Mathematics, Springer, vol. 54(3), pages 907-919, September.
  • Handle: RePEc:spr:indpam:v:54:y:2023:i:3:d:10.1007_s13226-022-00313-x
    DOI: 10.1007/s13226-022-00313-x

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s13226-022-00313-x
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s13226-022-00313-x?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Magnus, Jan R., 2007. "The Asymptotic Variance Of The Pseudo Maximum Likelihood Estimator," Econometric Theory, Cambridge University Press, vol. 23(5), pages 1022-1032, October.
    2. Ronald S. Burt, 1973. "Confirmatory Factor-Analytic Structures and the Theory Construction Process," Sociological Methods & Research, vol. 2(2), pages 131-190, November.
    3. D.A. Turkington, 1997. "Some results in matrix calculus and an example of their application to econometrics," Economics Discussion / Working Papers 97-07, The University of Western Australia, Department of Economics.
    4. Seok Young Hong & Oliver Linton & Hui Jun Zhang, 2014. "Multivariate variance ratio statistics," CeMMAP working papers 29/14, Institute for Fiscal Studies.
    5. Liu, Shuangzhe & Leiva, Víctor & Zhuang, Dan & Ma, Tiefeng & Figueroa-Zúñiga, Jorge I., 2022. "Matrix differential calculus with applications in the multivariate linear model and its diagnostics," Journal of Multivariate Analysis, Elsevier, vol. 188(C).
    6. Bollerslev, Tim & Patton, Andrew J. & Quaedvlieg, Rogier, 2018. "Modeling and forecasting (un)reliable realized covariances for more reliable financial decisions," Journal of Econometrics, Elsevier, vol. 207(1), pages 71-91.
    7. Savas Papadopoulos, 2010. "Theory and methodology for dynamic panel data: tested by simulations based on financial data," International Journal of Computational Economics and Econometrics, Inderscience Enterprises Ltd, vol. 1(3/4), pages 239-253.
    8. Seok Young Hong & Oliver Linton & Hui Jun Zhang, 2014. "Multivariate Variance Ratio Statistics," Cambridge Working Papers in Economics 1459, Faculty of Economics, University of Cambridge.
    9. Attfield, C. L. F., 1995. "A Bartlett adjustment to the likelihood ratio test for a system of equations," Journal of Econometrics, Elsevier, vol. 66(1-2), pages 207-223.
    10. P. C. B. Phillips & S. N. Durlauf, 1986. "Multiple Time Series Regression with Integrated Processes," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 53(4), pages 473-495.
    11. Seok Young Hong & Oliver Linton & Hui Jun Zhang, 2015. "An investigation into Multivariate Variance Ratio Statistics and their application to Stock Market Predictability," Cambridge Working Papers in Economics 1552, Faculty of Economics, University of Cambridge.
    12. Eduardo Abi Jaber & Bruno Bouchard & Camille Illand, 2018. "Stochastic invariance of closed sets with non-Lipschitz coefficients," Working Papers hal-01349639, HAL.
    13. Monfort, Alain & Renne, Jean-Paul & Roussellet, Guillaume, 2015. "A Quadratic Kalman Filter," Journal of Econometrics, Elsevier, vol. 187(1), pages 43-56.
    14. Eduardo Abi Jaber & Bruno Bouchard & Camille Illand, 2018. "Stochastic invariance of closed sets with non-Lipschitz coefficients," Post-Print hal-01349639, HAL.
    15. Shi, Jianhong & Bai, Xiuqin & Song, Weixing, 2020. "Nonparametric regression estimate with Berkson Laplace measurement error," Statistics & Probability Letters, Elsevier, vol. 166(C).
    16. Kolesár, Michal, 2018. "Minimum distance approach to inference with many instruments," Journal of Econometrics, Elsevier, vol. 204(1), pages 86-100.
    17. Lo, Andrew W. & Mackinlay, A. Craig, 1997. "Maximizing Predictability In The Stock And Bond Markets," Macroeconomic Dynamics, Cambridge University Press, vol. 1(1), pages 102-134, January.
    18. Seok Young Hong & Oliver Linton & Hui Jun Zhang, 2015. "An investigation into multivariate variance ratio statistics and their application to stock market predictability," CeMMAP working papers 13/15, Institute for Fiscal Studies.
    19. Chun-Lung Su, 2021. "Bayesian multi-way balanced nested MANOVA models with random effects and a large number of the main factor levels," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 84(5), pages 663-692, July.
    20. Phillips, P. C. B., 1987. "Asymptotic Expansions in Nonstationary Vector Autoregressions," Econometric Theory, Cambridge University Press, vol. 3(1), pages 45-68, February.
