Printed from https://ideas.repec.org/a/spr/sankha/v79y2017i2d10.1007_s13171-017-0106-6.html

New Asymptotic Results in Principal Component Analysis

Author

Listed:
  • Vladimir Koltchinskii

    (Georgia Institute of Technology)

  • Karim Lounici

    (Georgia Institute of Technology
    Université Côte d’Azur)

Abstract

Let $X$ be a mean zero Gaussian random vector in a separable Hilbert space ${\mathbb H}$ with covariance operator ${\Sigma}:={\mathbb E}(X\otimes X).$ Let ${\Sigma}=\sum_{r\geq 1}\mu_{r} P_{r}$ be the spectral decomposition of $\Sigma$ with distinct eigenvalues $\mu_{1}>\mu_{2}>\dots$ and the corresponding spectral projectors $P_{1}, P_{2}, \dots.$ Given a sample $X_{1},\dots,X_{n}$ of size $n$ of i.i.d. copies of $X$, the sample covariance operator is defined as $\hat{\Sigma}_{n} := n^{-1}\sum_{j=1}^{n} X_{j}\otimes X_{j}.$ The main goal of principal component analysis is to estimate spectral projectors $P_{1}, P_{2}, \dots$ by their empirical counterparts $\hat P_{1}, \hat P_{2}, \dots$ properly defined in terms of spectral decomposition of the sample covariance operator $\hat{\Sigma}_{n}.$ The aim of this paper is to study asymptotic distributions of important statistics related to this problem, in particular, of statistic $\|\hat P_{r}-P_{r}\|_{2}^{2},$ where $\|\cdot\|_{2}^{2}$ is the squared Hilbert–Schmidt norm. This is done in a “high-complexity” asymptotic framework in which the so called effective rank $\textbf{r}({\Sigma}):=\frac{\text{tr}({\Sigma})}{\|{\Sigma}\|_{\infty}}$ ($\text{tr}(\cdot)$ being the trace and $\|\cdot\|_{\infty}$ being the operator norm) of the true covariance $\Sigma$ is becoming large simultaneously with the sample size $n$, but $\textbf{r}({\Sigma}) = o(n)$ as $n\to\infty.$ In this setting, we prove that, in the case of one-dimensional spectral projector $P_{r}$, the properly centered and normalized statistic $\|\hat P_{r}-P_{r}\|_{2}^{2}$ with data-dependent centering and normalization converges in distribution to a Cauchy type limit.
The proofs of this and other related results rely on perturbation analysis and Gaussian concentration.
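
The quantities defined in the abstract can be illustrated numerically in a small finite-dimensional case. The following is a minimal sketch, not the authors' method: it assumes a hypothetical diagonal covariance with a chosen spectrum `mu`, takes r = 1, and uses power iteration as a stand-in for a full eigendecomposition. It relies on the identity $\|u\otimes u - v\otimes v\|_{2}^{2} = 2(1-\langle u,v\rangle^{2})$ for unit vectors, which holds for one-dimensional projectors. It does not reproduce the paper's centering, normalization, or limit theorem, and the toy dimension is far from the high-complexity regime the paper studies.

```python
import math
import random


def effective_rank(eigenvalues):
    """r(Sigma) = tr(Sigma) / ||Sigma||_inf, given the eigenvalues of Sigma."""
    return sum(eigenvalues) / max(eigenvalues)


def sample_covariance(samples):
    """Sigma_hat_n = n^{-1} * sum_j X_j (x) X_j (no centering: X has mean zero)."""
    n, p = len(samples), len(samples[0])
    cov = [[0.0] * p for _ in range(p)]
    for x in samples:
        for a in range(p):
            row, xa = cov[a], x[a]
            for b in range(p):
                row[b] += xa * x[b] / n
    return cov


def top_eigenvector(matrix, iterations=500):
    """Unit top eigenvector via power iteration (assumed stand-in for an eigensolver)."""
    p = len(matrix)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(matrix[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v


def projector_loss(u, v):
    """Squared Hilbert-Schmidt distance between the rank-one projectors onto u and v."""
    inner = sum(a * b for a, b in zip(u, v))
    return 2.0 * (1.0 - inner * inner)


random.seed(0)
p, n = 20, 500                                  # hypothetical dimension and sample size
mu = [4.0] + [1.0 / k for k in range(1, p)]     # hypothetical distinct eigenvalues, mu_1 > mu_2 > ...

# Sigma is diagonal in this toy model, so P_1 projects onto the first coordinate axis.
samples = [[math.sqrt(m) * random.gauss(0.0, 1.0) for m in mu] for _ in range(n)]
sigma_hat = sample_covariance(samples)
theta_hat = top_eigenvector(sigma_hat)
theta = [1.0] + [0.0] * (p - 1)

loss = projector_loss(theta_hat, theta)
print("effective rank r(Sigma):", effective_rank(mu))
print("||P_hat_1 - P_1||_2^2:", loss)
```

Repeating the simulation over many independent samples gives an empirical distribution of the statistic, which is the raw object whose centered and normalized version the paper analyzes.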

Suggested Citation

  • Vladimir Koltchinskii & Karim Lounici, 2017. "New Asymptotic Results in Principal Component Analysis," Sankhya A: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 79(2), pages 254-297, August.
  • Handle: RePEc:spr:sankha:v:79:y:2017:i:2:d:10.1007_s13171-017-0106-6
    DOI: 10.1007/s13171-017-0106-6

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s13171-017-0106-6
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.


    References listed on IDEAS

    1. Johnstone, Iain M. & Lu, Arthur Yu, 2009. "On Consistency and Sparsity for Principal Components Analysis in High Dimensions," Journal of the American Statistical Association, American Statistical Association, vol. 104(486), pages 682-693.
    2. Dauxois, J. & Pousse, A. & Romain, Y., 1982. "Asymptotic theory for the principal component analysis of a vector random function: Some applications to statistical inference," Journal of Multivariate Analysis, Elsevier, vol. 12(1), pages 136-154, March.

    Citations


    Cited by:

    1. Mike Ludkovski & Glen Swindle & Eric Grannan, 2022. "Large Scale Probabilistic Simulation of Renewables Production," Papers 2205.04736, arXiv.org.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Puyi Fang & Zhaoxing Gao & Ruey S. Tsay, 2023. "Determination of the effective cointegration rank in high-dimensional time-series predictive regressions," Papers 2304.12134, arXiv.org, revised Apr 2023.
    2. Candelon, B. & Hurlin, C. & Tokpavi, S., 2012. "Sampling error and double shrinkage estimation of minimum variance portfolios," Journal of Empirical Finance, Elsevier, vol. 19(4), pages 511-527.
    3. Fan, Jianqing & Jiang, Bai & Sun, Qiang, 2022. "Bayesian factor-adjusted sparse regression," Journal of Econometrics, Elsevier, vol. 230(1), pages 3-19.
    4. Mingotti, Nicola & Lillo Rodríguez, Rosa Elvira & Romo, Juan, 2015. "A Random Walk Test for Functional Time Series," DES - Working Papers. Statistics and Econometrics. WS ws1506, Universidad Carlos III de Madrid. Departamento de Estadística.
    5. Yata, Kazuyoshi & Aoshima, Makoto, 2013. "PCA consistency for the power spiked model in high-dimensional settings," Journal of Multivariate Analysis, Elsevier, vol. 122(C), pages 334-354.
    6. Asai, Manabu & McAleer, Michael, 2015. "Forecasting co-volatilities via factor models with asymmetry and long memory in realized covariance," Journal of Econometrics, Elsevier, vol. 189(2), pages 251-262.
    7. María Edo & Walter Sosa Escudero & Marcela Svarc, 2021. "A multidimensional approach to measuring the middle class," The Journal of Economic Inequality, Springer;Society for the Study of Economic Inequality, vol. 19(1), pages 139-162, March.
    8. Jarry, Gabriel & Delahaye, Daniel & Nicol, Florence & Feron, Eric, 2020. "Aircraft atypical approach detection using functional principal component analysis," Journal of Air Transport Management, Elsevier, vol. 84(C).
    9. Guangxing Wang & Sisheng Liu & Fang Han & Chong‐Zhi Di, 2023. "Robust functional principal component analysis via a functional pairwise spatial sign operator," Biometrics, The International Biometric Society, vol. 79(2), pages 1239-1253, June.
    10. Michal Benko & Wolfgang Härdle & Alois Kneip, 2006. "Common Functional Principal Components," SFB 649 Discussion Papers SFB649DP2006-010, Sonderforschungsbereich 649, Humboldt University, Berlin, Germany.
    11. Maillet, Bertrand & Tokpavi, Sessi & Vaucher, Benoit, 2015. "Global minimum variance portfolio optimisation under some model risk: A robust regression-based approach," European Journal of Operational Research, Elsevier, vol. 244(1), pages 289-299.
    12. Wang, Shao-Hsuan & Huang, Su-Yun, 2022. "Perturbation theory for cross data matrix-based PCA," Journal of Multivariate Analysis, Elsevier, vol. 190(C).
    13. Namvar, Ethan & Phillips, Blake & Pukthuanthong, Kuntara & Raghavendra Rau, P., 2016. "Do hedge funds dynamically manage systematic risk?," Journal of Banking & Finance, Elsevier, vol. 64(C), pages 1-15.
    14. Li, Weiming & Gao, Jing & Li, Kunpeng & Yao, Qiwei, 2016. "Modelling multivariate volatilities via latent common factors," LSE Research Online Documents on Economics 68121, London School of Economics and Political Science, LSE Library.
    15. Silin, Igor & Spokoiny, Vladimir, 2018. "Bayesian inference for spectral projectors of covariance matrix," IRTG 1792 Discussion Papers 2018-027, Humboldt University of Berlin, International Research Training Group 1792 "High Dimensional Nonstationary Time Series".
    16. van Delft, Anne, 2020. "A note on quadratic forms of stationary functional time series under mild conditions," Stochastic Processes and their Applications, Elsevier, vol. 130(7), pages 4206-4251.
    17. Bali, Juan Lucas & Boente, Graciela, 2015. "Influence function of projection-pursuit principal components for functional data," Journal of Multivariate Analysis, Elsevier, vol. 133(C), pages 173-199.
    18. Barigozzi, Matteo & Trapani, Lorenzo, 2020. "Sequential testing for structural stability in approximate factor models," Stochastic Processes and their Applications, Elsevier, vol. 130(8), pages 5149-5187.
    19. Qi, Xin & Zhao, Hongyu, 2011. "Some theoretical properties of Silverman's method for Smoothed functional principal component analysis," Journal of Multivariate Analysis, Elsevier, vol. 102(4), pages 741-767, April.
    20. Ci-Ren Jiang & John A. D. Aston & Jane-Ling Wang, 2016. "A Functional Approach to Deconvolve Dynamic Neuroimaging Data," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(513), pages 1-13, March.
