IDEAS home Printed from https://ideas.repec.org/p/arx/papers/2301.00292.html

Inference for Large Panel Data with Many Covariates

Authors

  • Markus Pelger
  • Jiacheng Zou

Abstract

This paper proposes a novel testing procedure for selecting a sparse set of covariates that explains a large-dimensional panel. Our selection method provides correct false detection control while achieving higher power than existing approaches. We develop the inferential theory for large panels with many covariates by combining post-selection inference with a novel multiple testing adjustment. Our data-driven hypotheses are conditional on the sparse covariate selection. We control the family-wise error rate for covariate discovery in large cross-sections. As an easy-to-use and practically relevant procedure, we propose Panel-PoSI, which combines the data-driven adjustment for panel multiple testing with valid post-selection p-values of a generalized LASSO that allows us to incorporate priors. In an empirical study, we select a small number of asset pricing factors that explain a large cross-section of investment strategies. Our method dominates the benchmarks out-of-sample due to its better size and power.
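
For readers unfamiliar with family-wise error rate control, the idea can be illustrated with the classical Holm step-down procedure. This is a minimal sketch of a textbook method, not the Panel-PoSI adjustment proposed in the paper, whose details are not given in this abstract. The function name `holm_reject` and the p-values are hypothetical and chosen only for illustration.

```python
from typing import List


def holm_reject(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Holm step-down procedure (a textbook method, assumed here for
    illustration). Returns one reject flag per hypothesis while
    controlling the family-wise error rate at level alpha."""
    m = len(p_values)
    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The (rank + 1)-th smallest p-value is compared with alpha / (m - rank).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # stop at the first hypothesis that is not rejected
    return reject


# Hypothetical p-values for five candidate covariates (illustrative only).
example_p = [0.001, 0.04, 0.012, 0.30, 0.008]
example_flags = holm_reject(example_p, alpha=0.05)
```

With these illustrative inputs, the covariates with p-values 0.001, 0.008 and 0.012 are selected, while 0.04 is not: it would pass an unadjusted 5% test but fails the adjusted threshold of 0.05 / 2.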

Suggested Citation

  • Markus Pelger & Jiacheng Zou, 2022. "Inference for Large Panel Data with Many Covariates," Papers 2301.00292, arXiv.org, revised Mar 2023.
  • Handle: RePEc:arx:papers:2301.00292

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2301.00292
    File Function: Latest version
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Yoshimasa Uematsu & Takashi Yamagata, 2019. "Estimation of Weak Factor Models," DSSR Discussion Papers 96, Graduate School of Economics and Management, Tohoku University.
    2. Alain-Philippe Fortin & Patrick Gagliardini & O. Scaillet, 2022. "Eigenvalue tests for the number of latent factors in short panels," Swiss Finance Institute Research Paper Series 22-81, Swiss Finance Institute.
    3. Gagliardini, Patrick & Ossola, Elisa & Scaillet, Olivier, 2019. "A diagnostic criterion for approximate factor structure," Journal of Econometrics, Elsevier, vol. 212(2), pages 503-521.
    4. Alain-Philippe Fortin & Patrick Gagliardini & Olivier Scaillet, 2023. "Latent Factor Analysis in Short Panels," Papers 2306.14004, arXiv.org.
    5. Gagliardini, Patrick & Ossola, Elisa & Scaillet, Olivier, 2019. "Estimation of large dimensional conditional factor models in finance," Working Papers unige:125031, University of Geneva, Geneva School of Economics and Management.
    6. Choi, In & Lin, Rui & Shin, Yongcheol, 2023. "Canonical correlation-based model selection for the multilevel factors," Journal of Econometrics, Elsevier, vol. 233(1), pages 22-44.
    7. Jianqing Fan & Kunpeng Li & Yuan Liao, 2020. "Recent Developments on Factor Models and its Applications in Econometric Learning," Papers 2009.10103, arXiv.org.
    8. GUO-FITOUSSI, Liang, 2013. "A Comparison of the Finite Sample Properties of Selection Rules of Factor Numbers in Large Datasets," MPRA Paper 50005, University Library of Munich, Germany.
    9. Wei, Jie & Chen, Hui, 2020. "Determining the number of factors in approximate factor models by twice K-fold cross validation," Economics Letters, Elsevier, vol. 191(C).
    10. Yongfu Huang & Muhammad G. Quibria, 2015. "The global partnership for sustainable development," Natural Resources Forum, Blackwell Publishing, vol. 0(3-4), pages 157-174, August.
    11. Guo, Xu & Li, Runze & Liu, Jingyuan & Zeng, Mudong, 2023. "Statistical inference for linear mediation models with high-dimensional mediators and application to studying stock reaction to COVID-19 pandemic," Journal of Econometrics, Elsevier, vol. 235(1), pages 166-179.
    12. Yunus Emre Ergemen & Carlos Vladimir Rodríguez-Caballero, 2016. "A Dynamic Multi-Level Factor Model with Long-Range Dependence," CREATES Research Papers 2016-23, Department of Economics and Business Economics, Aarhus University.
    13. Gu, Shihao & Kelly, Bryan & Xiu, Dacheng, 2021. "Autoencoder asset pricing models," Journal of Econometrics, Elsevier, vol. 222(1), pages 429-450.
    14. Ergemen, Yunus Emre & Rodríguez-Caballero, C. Vladimir, 2023. "Estimation of a dynamic multi-level factor model with possible long-range dependence," International Journal of Forecasting, Elsevier, vol. 39(1), pages 405-430.
    15. Simon Freyaldenhoven, 2017. "A Generalized Factor Model with Local Factors," 2017 Papers pfr361, Job Market Papers.
    16. Adamek, Robert & Smeekes, Stephan & Wilms, Ines, 2023. "Lasso inference for high-dimensional time series," Journal of Econometrics, Elsevier, vol. 235(2), pages 1114-1143.
    17. Alexander Chudik & M. Hashem Pesaran, 2013. "Large panel data models with cross-sectional dependence: a survey," Globalization Institute Working Papers 153, Federal Reserve Bank of Dallas.
    18. Matteo Barigozzi & Marco Lippi & Matteo Luciani, 2014. "Dynamic Factor Models, Cointegration and Error Correction Mechanisms," Working Papers ECARES ECARES 2014-14, ULB -- Universite Libre de Bruxelles.
    19. Francisco Corona & Pilar Poncela & Esther Ruiz, 2017. "Determining the number of factors after stationary univariate transformations," Empirical Economics, Springer, vol. 53(1), pages 351-372, August.
    20. Dovonon, Prosper & Taamouti, Abderrahim & Williams, Julian, 2022. "Testing the eigenvalue structure of spot and integrated covariance," Journal of Econometrics, Elsevier, vol. 229(2), pages 363-395.

