Principal Components and Regularized Estimation of Factor Models

Author

Listed:
  • Jushan Bai
  • Serena Ng

Abstract

It is known that the common factors in a large panel of data can be consistently estimated by the method of principal components, and principal components can be constructed by iterative least squares regressions. Replacing least squares with ridge regressions turns out to have the effect of shrinking the singular values of the common component and possibly reducing its rank. The method is used in the machine learning literature to recover low-rank matrices. We study the procedure from the perspective of estimating a minimum-rank approximate factor model. We show that the constrained factor estimates are biased but can be more efficient in terms of mean-squared errors. Rank consideration suggests a data-dependent penalty for selecting the number of factors. The new criterion is more conservative in cases when the nominal number of factors is inflated by the presence of weak factors or large measurement noise. The framework is extended to incorporate a priori linear constraints on the loadings. We provide asymptotic results that can be used to test economic hypotheses.
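
The shrinkage effect described in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes the penalized objective (1/2)||Z - F Λ'||² + (γ/2)(||F||² + ||Λ||²) in Frobenius norms, with no rescaling of the data by the panel dimensions. The function names `soft_threshold_svd` and `alternating_ridge`, the toy panel `Z`, and the penalty value `gamma` are all hypothetical choices made for illustration. Under these assumptions, alternating ridge regressions should reproduce the common component obtained by subtracting `gamma` from each singular value of `Z` and truncating at zero.

```python
import numpy as np


def soft_threshold_svd(Z, gamma):
    """Shrink every singular value of Z by gamma, truncating at zero."""
    U, d, Vt = np.linalg.svd(Z, full_matrices=False)
    d_shrunk = np.maximum(d - gamma, 0.0)
    return (U * d_shrunk) @ Vt, d_shrunk


def alternating_ridge(Z, r, gamma, n_iter=500, seed=0):
    """Alternate ridge regressions for factors F (T x r) and loadings Lam (N x r)."""
    rng = np.random.default_rng(seed)
    _, N = Z.shape
    Lam = rng.standard_normal((N, r))  # arbitrary starting loadings (assumption)
    I_r = np.eye(r)
    for _ in range(n_iter):
        # Ridge regression of the rows of Z on the loadings gives the factors.
        F = Z @ Lam @ np.linalg.inv(Lam.T @ Lam + gamma * I_r)
        # Ridge regression of the columns of Z on the factors gives the loadings.
        Lam = Z.T @ F @ np.linalg.inv(F.T @ F + gamma * I_r)
    return F, Lam


# Toy panel (hypothetical): T = 20 periods, N = 15 series,
# exact rank 3 with singular values 10, 5 and 1.
rng = np.random.default_rng(1)
T, N = 20, 15
U, _ = np.linalg.qr(rng.standard_normal((T, 3)))
V, _ = np.linalg.qr(rng.standard_normal((N, 3)))
Z = (U * np.array([10.0, 5.0, 1.0])) @ V.T

gamma = 2.0
C_svt, d_shrunk = soft_threshold_svd(Z, gamma)
F, Lam = alternating_ridge(Z, r=3, gamma=gamma)
C_ridge = F @ Lam.T  # common component implied by the ridge iterations

print("shrunk singular values:", np.round(d_shrunk[:3], 6))
print("rank of shrunk component:", np.linalg.matrix_rank(C_svt, tol=1e-8))
print("max abs difference:", np.max(np.abs(C_ridge - C_svt)))
```

In this toy example the singular values 10, 5 and 1 are shrunk to 8, 3 and 0, so the estimated common component has rank 2 even though three factors were requested. The weakest factor falls below the penalty and is dropped, which is the rank-reducing behaviour the abstract refers to.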

Suggested Citation

  • Jushan Bai & Serena Ng, 2017. "Principal Components and Regularized Estimation of Factor Models," Papers 1708.08137, arXiv.org, revised Nov 2017.
  • Handle: RePEc:arx:papers:1708.08137

    Download full text from publisher

    File URL: http://arxiv.org/pdf/1708.08137
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Bai, Jushan & Wang, Peng, 2014. "Identification theory for high dimensional static and dynamic factor models," Journal of Econometrics, Elsevier, vol. 178(2), pages 794-804.
    2. Louis Guttman, 1958. "To what extent can communalities reduce rank?," Psychometrika, Springer;The Psychometric Society, vol. 23(4), pages 297-308, December.
    3. Forni, Mario & Lippi, Marco, 2001. "The Generalized Dynamic Factor Model: Representation Theory," Econometric Theory, Cambridge University Press, vol. 17(6), pages 1113-1141, December.
    4. Bai, Jushan & Ng, Serena, 2013. "Principal components estimation and identification of static factors," Journal of Econometrics, Elsevier, vol. 176(1), pages 18-29.
    5. Jianqing Fan & Yuan Liao & Martina Mincheva, 2013. "Large covariance estimation by thresholding principal orthogonal complements," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 75(4), pages 603-680, September.
    6. Gorodnichenko, Yuriy & Ng, Serena, 2017. "Level and volatility factors in macroeconomic data," Journal of Monetary Economics, Elsevier, vol. 91(C), pages 52-68.
    7. Jushan Bai & Serena Ng, 2002. "Determining the Number of Factors in Approximate Factor Models," Econometrica, Econometric Society, vol. 70(1), pages 191-221, January.
    8. P. Bentler & J. Woodward, 1980. "Inequalities among lower bounds to reliability: With applications to test construction and factor analysis," Psychometrika, Springer;The Psychometric Society, vol. 45(2), pages 249-267, June.
    9. Michael W. McCracken & Serena Ng, 2016. "FRED-MD: A Monthly Database for Macroeconomic Research," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 34(4), pages 574-589, October.
    10. Shen, Haipeng & Huang, Jianhua Z., 2008. "Sparse principal component analysis via regularized low rank matrix approximation," Journal of Multivariate Analysis, Elsevier, vol. 99(6), pages 1015-1034, July.
    11. Chamberlain, Gary & Rothschild, Michael, 1983. "Arbitrage, Factor Structure, and Mean-Variance Analysis on Large Asset Markets," Econometrica, Econometric Society, vol. 51(5), pages 1281-1304, September.
    12. J. B. Taylor & Harald Uhlig (ed.), 2016. "Handbook of Macroeconomics," Handbook of Macroeconomics, Elsevier, edition 1, volume 2, number 2.
    13. Boivin, Jean & Ng, Serena, 2006. "Are more data always better for factor analysis?," Journal of Econometrics, Elsevier, vol. 132(1), pages 169-194, May.
    14. Alexander Shapiro, 1982. "Rank-reducibility of a symmetric matrix and sampling theory of minimum trace factor analysis," Psychometrika, Springer;The Psychometric Society, vol. 47(2), pages 187-199, June.
    15. Lettau, Martin & Pelger, Markus, 2020. "Estimating latent asset-pricing factors," Journal of Econometrics, Elsevier, vol. 218(1), pages 1-31.
    16. Mario Forni & Marc Hallin & Marco Lippi & Lucrezia Reichlin, 2000. "The Generalized Dynamic-Factor Model: Identification And Estimation," The Review of Economics and Statistics, MIT Press, vol. 82(4), pages 540-554, November.
    17. Stock, James H & Watson, Mark W, 2002. "Macroeconomic Forecasting Using Diffusion Indexes," Journal of Business & Economic Statistics, American Statistical Association, vol. 20(2), pages 147-162, April.
    18. Jos Berge & Henk Kiers, 1991. "A numerical approach to the approximate and the exact minimum rank of a covariance matrix," Psychometrika, Springer;The Psychometric Society, vol. 56(2), pages 309-315, June.
    19. Ma, Yanyuan & Genton, Marc G., 2001. "Highly Robust Estimation of Dispersion Matrices," Journal of Multivariate Analysis, Elsevier, vol. 78(1), pages 11-36, July.
    20. Bai, Jushan & Ng, Serena, 2008. "Large Dimensional Factor Analysis," Foundations and Trends(R) in Econometrics, now publishers, vol. 3(2), pages 89-163, June.
    21. Stock J.H. & Watson M.W., 2002. "Forecasting Using Principal Components From a Large Number of Predictors," Journal of the American Statistical Association, American Statistical Association, vol. 97, pages 1167-1179, December.
    22. Jushan Bai, 2003. "Inferential Theory for Factor Models of Large Dimensions," Econometrica, Econometric Society, vol. 71(1), pages 135-171, January.
    23. Jushan Bai & Serena Ng, 2006. "Confidence Intervals for Diffusion Index Forecasts and Inference for Factor-Augmented Regressions," Econometrica, Econometric Society, vol. 74(4), pages 1133-1150, July.
    24. K. Jöreskog, 1967. "Some contributions to maximum likelihood factor analysis," Psychometrika, Springer;The Psychometric Society, vol. 32(4), pages 443-482, December.
    25. Connor, Gregory & Korajczyk, Robert A., 1986. "Performance measurement with the arbitrage pricing theory : A new framework for analysis," Journal of Financial Economics, Elsevier, vol. 15(3), pages 373-394, March.
    26. Alexander Shapiro & Jos Berge, 2000. "The asymptotic bias of minimum trace factor analysis, with applications to the greatest lower bound to reliability," Psychometrika, Springer;The Psychometric Society, vol. 65(3), pages 413-425, September.

    Citations

    Cited by:

    1. Alexandre Belloni & Mingli Chen & Oscar Hernan Madrid Padilla & Zixuan Wang, 2019. "High Dimensional Latent Panel Quantile Regression with an Application to Asset Pricing," Papers 1912.02151, arXiv.org, revised Aug 2022.
    2. Lettau, Martin & Pelger, Markus, 2020. "Estimating latent asset-pricing factors," Journal of Econometrics, Elsevier, vol. 218(1), pages 1-31.
    3. Susan Athey & Mohsen Bayati & Nikolay Doudchenko & Guido Imbens & Khashayar Khosravi, 2021. "Matrix Completion Methods for Causal Panel Data Models," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 116(536), pages 1716-1730, October.
    4. Jushan Bai & Sung Hoon Choi & Yuan Liao, 2021. "Feasible generalized least squares for panel data with cross-sectional and serial correlations," Empirical Economics, Springer, vol. 60(1), pages 309-326, January.
    5. Yoshimasa Uematsu & Takashi Yamagata, 2019. "Estimation of Weak Factor Models," DSSR Discussion Papers 96, Graduate School of Economics and Management, Tohoku University.
    6. Jiangtao Duan & Wei Gao & Hao Qu & Hon Keung Tony, 2019. "Subspace Clustering for Panel Data with Interactive Effects," Papers 1909.09928, arXiv.org, revised Feb 2021.
    7. Victor Chernozhukov & Christian Hansen & Yuan Liao & Yinchu Zhu, 2019. "Inference for heterogeneous effects using low-rank estimations," CeMMAP working papers CWP31/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    8. Gu, Shihao & Kelly, Bryan & Xiu, Dacheng, 2021. "Autoencoder asset pricing models," Journal of Econometrics, Elsevier, vol. 222(1), pages 429-450.
    9. Philippe Goulet Coulombe, 2020. "Time-Varying Parameters as Ridge Regressions," Papers 2009.00401, arXiv.org, revised Apr 2023.
    10. Jianqing Fan & Kunpeng Li & Yuan Liao, 2020. "Recent Developments on Factor Models and its Applications in Econometric Learning," Papers 2009.10103, arXiv.org.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Bai, Jushan & Ng, Serena, 2019. "Rank regularized estimation of approximate factor models," Journal of Econometrics, Elsevier, vol. 212(1), pages 78-96.
    2. Stock, J.H. & Watson, M.W., 2016. "Dynamic Factor Models, Factor-Augmented Vector Autoregressions, and Structural Vector Autoregressions in Macroeconomics," Handbook of Macroeconomics, in: J. B. Taylor & Harald Uhlig (ed.), Handbook of Macroeconomics, edition 1, volume 2, chapter 0, pages 415-525, Elsevier.
    3. Jushan Bai & Serena Ng, 2020. "Simpler Proofs for Approximate Factor Models of Large Dimensions," Papers 2008.00254, arXiv.org.
    4. Liang Chen & Juan J. Dolado & Jesús Gonzalo, 2021. "Quantile Factor Models," Econometrica, Econometric Society, vol. 89(2), pages 875-910, March.
    5. Kutateladze, Varlam, 2022. "The kernel trick for nonlinear factor modeling," International Journal of Forecasting, Elsevier, vol. 38(1), pages 165-177.
    6. Varlam Kutateladze, 2021. "The Kernel Trick for Nonlinear Factor Modeling," Papers 2103.01266, arXiv.org.
    7. Jianqing Fan & Yuan Liao & Martina Mincheva, 2013. "Large covariance estimation by thresholding principal orthogonal complements," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 75(4), pages 603-680, September.
    8. Smeekes, Stephan & Wijler, Etienne, 2018. "Macroeconomic forecasting using penalized regression methods," International Journal of Forecasting, Elsevier, vol. 34(3), pages 408-430.
    9. Tomohiro Ando & Ruey S. Tsay, 2009. "Model selection for generalized linear models with factor‐augmented predictors," Applied Stochastic Models in Business and Industry, John Wiley & Sons, vol. 25(3), pages 207-235, May.
    10. Pilar Poncela & Esther Ruiz, 2016. "Small- Versus Big-Data Factor Extraction in Dynamic Factor Models: An Empirical Assessment," Advances in Econometrics, in: Dynamic Factor Models, volume 35, pages 401-434, Emerald Group Publishing Limited.
    11. Fan, Jianqing & Ke, Yuan & Liao, Yuan, 2021. "Augmented factor models with applications to validating market risk factors and forecasting bond risk premia," Journal of Econometrics, Elsevier, vol. 222(1), pages 269-294.
    12. Bai, Jushan & Ng, Serena, 2023. "Approximate factor models with weaker loadings," Journal of Econometrics, Elsevier, vol. 235(2), pages 1893-1916.
    13. Poncela, Pilar & Ruiz, Esther & Miranda, Karen, 2021. "Factor extraction using Kalman filter and smoothing: This is not just another survey," International Journal of Forecasting, Elsevier, vol. 37(4), pages 1399-1425.
    14. Groen, Jan J.J. & Kapetanios, George, 2016. "Revisiting useful approaches to data-rich macroeconomic forecasting," Computational Statistics & Data Analysis, Elsevier, vol. 100(C), pages 221-239.
    15. Catherine Doz & Peter Fuleky, 2019. "Dynamic Factor Models," PSE Working Papers halshs-02262202, HAL.
    16. Yoshimasa Uematsu & Takashi Yamagata, 2019. "Estimation of Weak Factor Models," DSSR Discussion Papers 96, Graduate School of Economics and Management, Tohoku University.
    17. Catherine Doz & Peter Fuleky, 2019. "Dynamic Factor Models," Working Papers halshs-02262202, HAL.
    18. Giovannelli, Alessandro & Massacci, Daniele & Soccorsi, Stefano, 2021. "Forecasting stock returns with large dimensional factor models," Journal of Empirical Finance, Elsevier, vol. 63(C), pages 252-269.
    19. Cheng, Xu & Hansen, Bruce E., 2015. "Forecasting with factor-augmented regression: A frequentist model averaging approach," Journal of Econometrics, Elsevier, vol. 186(2), pages 280-293.
    20. Yoshimasa Uematsu & Takashi Yamagata, 2020. "Inference in Weak Factor Models," ISER Discussion Paper 1080, Institute of Social and Economic Research, Osaka University.