Printed from https://ideas.repec.org/p/arx/papers/2508.15675.html

Large-dimensional Factor Analysis with Weighted PCA

Authors

Listed:
  • Zhongyuan Lyu
  • Ming Yuan

Abstract

Principal component analysis (PCA) is arguably the most widely used approach for large-dimensional factor analysis. While it is effective when the factors are sufficiently strong, it can be inconsistent when the factors are weak and/or the noise has a complex dependence structure. We argue that the inconsistency often stems from bias and introduce a general approach to restore consistency. Specifically, we propose a general weighting scheme for PCA and show that, with a suitable choice of weighting matrices, it is possible to obtain consistent and asymptotically normal estimators under much weaker conditions than the usual PCA requires. While the optimal weight matrix may require knowledge of the factors and of the covariance of the idiosyncratic noise, neither of which is known a priori, we develop an agnostic approach to adaptively choose from a large class of weighting matrices that can be viewed as PCA for weighted linear combinations of auto-covariances among the observations. Theoretical and numerical results demonstrate the merits of our methodology over the usual PCA and other recently developed techniques for large-dimensional approximate factor models.
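
The phrase "PCA for weighted linear combinations of auto-covariances" can be illustrated with a minimal sketch. This is not the authors' estimator or their adaptive weight selection: the function name `weighted_autocov_pca`, the symmetrization of each lagged auto-covariance, and the fixed user-supplied lag weights are all assumptions made here for illustration. With weights `[1.0]` the sketch reduces to ordinary PCA of the sample covariance; with weights `[0.0, 1.0]` it uses only the lag-1 auto-covariance, which drops the contemporaneous noise covariance from the matrix being decomposed.

```python
import numpy as np


def weighted_autocov_pca(X, weights, r):
    """Illustrative sketch (hypothetical helper, not the paper's algorithm).

    X       : (T, N) array, T time points of an N-dimensional series.
    weights : sequence (w_0, ..., w_K), w_k being the weight on lag k.
    r       : number of factors to extract.
    Returns (loadings, factors) with shapes (N, r) and (T, r).
    """
    X = np.asarray(X, dtype=float)
    X = X - X.mean(axis=0)  # center each series
    T, N = X.shape

    M = np.zeros((N, N))
    for k, w in enumerate(weights):
        if w == 0:
            continue
        # Lag-k sample auto-covariance: average of x_t x_{t+k}^T.
        gamma_k = X[: T - k].T @ X[k:] / (T - k)
        # Symmetrize so the weighted sum has a real eigendecomposition.
        M += w * (gamma_k + gamma_k.T) / 2.0

    # Eigenvectors with the r largest eigenvalues in absolute value
    # (M need not be positive semidefinite once lags k > 0 enter).
    eigvals, eigvecs = np.linalg.eigh(M)
    order = np.argsort(np.abs(eigvals))[::-1][:r]
    loadings = eigvecs[:, order]

    # Project the centered data onto the estimated loading directions.
    factors = X @ loadings
    return loadings, factors
```

The estimated loadings are identified only up to sign and rotation, so any comparison with a known loading matrix should be made through the spanned subspace (for example, absolute cosine similarity when r = 1).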

Suggested Citation

  • Zhongyuan Lyu & Ming Yuan, 2025. "Large-dimensional Factor Analysis with Weighted PCA," Papers 2508.15675, arXiv.org.
  • Handle: RePEc:arx:papers:2508.15675

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2508.15675
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Michael W. McCracken & Serena Ng, 2016. "FRED-MD: A Monthly Database for Macroeconomic Research," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 34(4), pages 574-589, October.
    2. Chamberlain, Gary & Rothschild, Michael, 1983. "Arbitrage, Factor Structure, and Mean-Variance Analysis on Large Asset Markets," Econometrica, Econometric Society, vol. 51(5), pages 1281-1304, September.
    3. Bai, Jushan & Ng, Serena, 2023. "Approximate factor models with weaker loadings," Journal of Econometrics, Elsevier, vol. 235(2), pages 1893-1916.
    4. Han Shang, 2014. "A survey of functional principal component analysis," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 98(2), pages 121-142, April.
    5. Yoshimasa Uematsu & Takashi Yamagata, 2022. "Estimation of Sparsity-Induced Weak Factor Models," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 41(1), pages 213-227, December.
    6. Chang, Jinyuan & Guo, Bin & Yao, Qiwei, 2018. "Principal component analysis for second-order stationary vector time series," LSE Research Online Documents on Economics 84106, London School of Economics and Political Science, LSE Library.
    7. Connor, Gregory & Korajczyk, Robert A., 1988. "Risk and return in an equilibrium APT : Application of a new test methodology," Journal of Financial Economics, Elsevier, vol. 21(2), pages 255-289, September.
    8. Jianqing Fan & Jianhua Guo & Shurong Zheng, 2022. "Estimating Number of Factors by Adjusted Eigenvalues Thresholding," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 117(538), pages 852-861, April.
    9. Peter Hall & Mohammad Hosseini‐Nasab, 2006. "On properties of functional principal components analysis," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 68(1), pages 109-126, February.
    10. Chang, Jinyuan & Chen, Cheng & Qiao, Xinghao & Yao, Qiwei, 2024. "An autocovariance-based learning framework for high-dimensional functional time series," Journal of Econometrics, Elsevier, vol. 239(2).
    11. Jin, Sainan & Miao, Ke & Su, Liangjun, 2021. "On factor models with random missing: EM estimation, inference, and cross validation," Journal of Econometrics, Elsevier, vol. 222(1), pages 745-777.
    12. Onatski, Alexei, 2012. "Asymptotics of the principal components estimator of large factor models with weakly influential factors," Journal of Econometrics, Elsevier, vol. 168(2), pages 244-258.
    13. Zhaoxing Gao & Ruey S. Tsay, 2022. "Modeling High-Dimensional Time Series: A Factor Model With Dynamically Dependent Factors and Diverging Eigenvalues," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 117(539), pages 1398-1414, September.
    14. Bai, Jushan & Ng, Serena, 2008. "Large Dimensional Factor Analysis," Foundations and Trends(R) in Econometrics, now publishers, vol. 3(2), pages 89-163, June.
    15. Stock J.H. & Watson M.W., 2002. "Forecasting Using Principal Components From a Large Number of Predictors," Journal of the American Statistical Association, American Statistical Association, vol. 97, pages 1167-1179, December.
    16. Wei, Jie & Chen, Hui, 2020. "Determining the number of factors in approximate factor models by twice K-fold cross validation," Economics Letters, Elsevier, vol. 191(C).
    17. Clifford Lam & Qiwei Yao & Neil Bathia, 2011. "Estimation of latent factors for high-dimensional time series," Biometrika, Biometrika Trust, vol. 98(4), pages 901-918.
    18. Lam, Clifford & Yao, Qiwei & Bathia, Neil, 2011. "Estimation of latent factors for high-dimensional time series," LSE Research Online Documents on Economics 31549, London School of Economics and Political Science, LSE Library.
    19. Jushan Bai, 2003. "Inferential Theory for Factor Models of Large Dimensions," Econometrica, Econometric Society, vol. 71(1), pages 135-171, January.
    20. Connor, Gregory & Korajczyk, Robert A., 1986. "Performance measurement with the arbitrage pricing theory : A new framework for analysis," Journal of Financial Economics, Elsevier, vol. 15(3), pages 373-394, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Yoshimasa Uematsu & Takashi Yamagata, 2019. "Estimation of Weak Factor Models," ISER Discussion Paper 1053r, Institute of Social and Economic Research, The University of Osaka, revised Mar 2020.
    2. Christian Brownlees & Guðmundur Stefán Guðmundsson & Yaping Wang, 2024. "Performance of Empirical Risk Minimization For Principal Component Regression," Papers 2409.03606, arXiv.org, revised Sep 2024.
    3. Martin Lettau & Markus Pelger & Stijn Van Nieuwerburgh, 2020. "Factors That Fit the Time Series and Cross-Section of Stock Returns," The Review of Financial Studies, Society for Financial Studies, vol. 33(5), pages 2274-2325.
    4. Barigozzi, Matteo & Trapani, Lorenzo, 2020. "Sequential testing for structural stability in approximate factor models," Stochastic Processes and their Applications, Elsevier, vol. 130(8), pages 5149-5187.
    5. Stock, J.H. & Watson, M.W., 2016. "Dynamic Factor Models, Factor-Augmented Vector Autoregressions, and Structural Vector Autoregressions in Macroeconomics," Handbook of Macroeconomics, in: J. B. Taylor & Harald Uhlig (ed.), Handbook of Macroeconomics, edition 1, volume 2, chapter 0, pages 415-525, Elsevier.
    6. Francisco Corona & Pilar Poncela & Esther Ruiz, 2020. "Estimating Non-stationary Common Factors: Implications for Risk Sharing," Computational Economics, Springer;Society for Computational Economics, vol. 55(1), pages 37-60, January.
    7. Liang Chen & Juan J. Dolado & Jesús Gonzalo, 2021. "Quantile Factor Models," Econometrica, Econometric Society, vol. 89(2), pages 875-910, March.
    8. Yuefeng Han & Rong Chen & Cun-Hui Zhang, 2020. "Rank Determination in Tensor Factor Model," Papers 2011.07131, arXiv.org, revised May 2022.
    9. Matteo Barigozzi & Marc Hallin, 2024. "The Dynamic, the Static, and the Weak Factor Models and the Analysis of High-Dimensional Time Series," Working Papers ECARES 2024-14, ULB -- Universite Libre de Bruxelles.
    10. Bai, Jushan & Ng, Serena, 2019. "Rank regularized estimation of approximate factor models," Journal of Econometrics, Elsevier, vol. 212(1), pages 78-96.
    11. Jungjun Choi & Ming Yuan, 2024. "High Dimensional Factor Analysis with Weak Factors," Papers 2402.05789, arXiv.org.
    12. Jianqing Fan & Yuan Liao & Martina Mincheva, 2013. "Large covariance estimation by thresholding principal orthogonal complements," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 75(4), pages 603-680, September.
    13. Jushan Bai & Serena Ng, 2017. "Principal Components and Regularized Estimation of Factor Models," Papers 1708.08137, arXiv.org, revised Nov 2017.
    14. Tom Boot & Bart Keijsers, 2025. "Diffusion index forecasts under weaker loadings: PCA, ridge regression, and random projections," Papers 2506.09575, arXiv.org.
    15. Catherine Doz & Domenico Giannone & Lucrezia Reichlin, 2012. "A Quasi–Maximum Likelihood Approach for Large, Approximate Dynamic Factor Models," The Review of Economics and Statistics, MIT Press, vol. 94(4), pages 1014-1024, November.
    16. Yuefeng Han & Rong Chen & Dan Yang & Cun-Hui Zhang, 2020. "Tensor Factor Model Estimation by Iterative Projection," Papers 2006.02611, arXiv.org, revised Jul 2024.
    17. Massacci, Daniele, 2017. "Least squares estimation of large dimensional threshold factor models," Journal of Econometrics, Elsevier, vol. 197(1), pages 101-129.
    18. Matteo Barigozzi & Marc Hallin, 2023. "Dynamic Factor Models: a Genealogy," Papers 2310.17278, arXiv.org, revised Jan 2024.
    19. Andreou, E. & Gagliardini, P. & Ghysels, E. & Rubin, M., 2025. "Spanning latent and observable factors," Journal of Econometrics, Elsevier, vol. 248(C).
    20. Tae-Hwy Lee & Ekaterina Seregina, 2020. "Learning from Forecast Errors: A New Approach to Forecast Combination," Working Papers 202024, University of California at Riverside, Department of Economics.
