IDEAS home Printed from https://ideas.repec.org/a/taf/jnlasa/v116y2021i536p1746-1763.html

Matrix Completion, Counterfactuals, and Factor Analysis of Missing Data

Author

Listed:
  • Jushan Bai
  • Serena Ng

Abstract

This article proposes an imputation procedure that uses the factors estimated from a tall block along with the re-rotated loadings estimated from a wide block to impute missing values in a panel of data. Assuming that a strong factor structure holds for the full panel of data and its sub-blocks, it is shown that the common component can be consistently estimated at four different rates of convergence without requiring regularization or iteration. An asymptotic analysis of the estimation error is obtained. An application of our analysis is estimation of counterfactuals when potential outcomes have a factor structure. We study the estimation of average and individual treatment effects on the treated and establish a normal distribution theory that can be useful for hypothesis testing.
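The tall/wide idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' code. It assumes the tall block consists of the columns (units) with no missing values and the wide block of the rows (periods) with no missing values, that the number of factors `r` is known, and that the re-rotation is done by least squares of the tall-block loadings on the matching rows of the wide-block loadings. The names `pca_factors` and `tall_wide_impute` are hypothetical, and the SVD scaling used here differs from the usual normalization of factors, which does not affect the estimated common component.

```python
import numpy as np


def pca_factors(block, r):
    """Rank-r principal components split of a complete block: block ≈ F @ L.T."""
    U, s, Vt = np.linalg.svd(block, full_matrices=False)
    F = U[:, :r] * s[:r]  # factors, one row per period
    L = Vt[:r].T          # loadings, one row per unit
    return F, L


def tall_wide_impute(X, r):
    """Impute NaN cells of a T x N panel X from a tall and a wide block.

    Assumption (illustrative): at least r columns and r rows of X are
    fully observed, and r is the true number of factors.
    """
    observed = ~np.isnan(X)
    full_cols = np.where(observed.all(axis=0))[0]  # units seen in every period
    full_rows = np.where(observed.all(axis=1))[0]  # periods with every unit seen

    tall = X[:, full_cols]   # T x N_o block
    wide = X[full_rows, :]   # T_o x N block

    # Factors for all periods come from the tall block.
    F_tall, L_tall = pca_factors(tall, r)
    # Loadings for all units come from the wide block, in a different rotation.
    _, L_wide = pca_factors(wide, r)

    # Re-rotate: regress tall-block loadings on the wide-block loadings
    # of the same (fully observed) units to align the two rotations.
    B, *_ = np.linalg.lstsq(L_wide[full_cols], L_tall, rcond=None)
    L_hat = L_wide @ B

    C_hat = F_tall @ L_hat.T                 # estimated common component
    X_imputed = np.where(observed, X, C_hat)  # keep observed cells as they are
    return X_imputed, C_hat
```

No iteration or regularization is involved: each block is decomposed once and the two pieces are combined by a single regression. For the counterfactual application, one would set the treated cells to `np.nan` before the call and estimate each individual treatment effect as the observed treated outcome minus the corresponding entry of `C_hat`.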

Suggested Citation

  • Jushan Bai & Serena Ng, 2021. "Matrix Completion, Counterfactuals, and Factor Analysis of Missing Data," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 116(536), pages 1746-1763, October.
  • Handle: RePEc:taf:jnlasa:v:116:y:2021:i:536:p:1746-1763
    DOI: 10.1080/01621459.2021.1967163

    References listed on IDEAS

    1. Giannone, Domenico & Reichlin, Lucrezia & Small, David, 2008. "Nowcasting: The real-time informational content of macroeconomic data," Journal of Monetary Economics, Elsevier, vol. 55(4), pages 665-676, May.
    2. Jushan Bai & Serena Ng, 2002. "Determining the Number of Factors in Approximate Factor Models," Econometrica, Econometric Society, vol. 70(1), pages 191-221, January.
    3. Horton, Nicholas J. & Kleinman, Ken P., 2007. "Much Ado About Nothing: A Comparison of Missing Data Methods and Software to Fit Incomplete Data Regression Models," The American Statistician, American Statistical Association, vol. 61, pages 79-90, February.
    4. R. H. Shumway & D. S. Stoffer, 1982. "An Approach To Time Series Smoothing And Forecasting Using The Em Algorithm," Journal of Time Series Analysis, Wiley Blackwell, vol. 3(4), pages 253-264, July.
    5. Xu, Yiqing, 2017. "Generalized Synthetic Control Method: Causal Inference with Interactive Fixed Effects Models," Political Analysis, Cambridge University Press, vol. 25(1), pages 57-76, January.
    6. Laurent Gobillon & Thierry Magnac, 2016. "Regional Policy Evaluation: Interactive Fixed Effects and Synthetic Controls," The Review of Economics and Statistics, MIT Press, vol. 98(3), pages 535-551, July.
    7. Domenico Giannone & Lucrezia Reichlin & David H. Small, 2005. "Nowcasting GDP and inflation: the real-time informational content of macroeconomic data releases," Finance and Economics Discussion Series 2005-42, Board of Governors of the Federal Reserve System (U.S.).
    8. James Honaker & Gary King, 2010. "What to Do about Missing Values in Time‐Series Cross‐Section Data," American Journal of Political Science, John Wiley & Sons, vol. 54(2), pages 561-581, April.
    9. J. B. Taylor & Harald Uhlig (ed.), 2016. "Handbook of Macroeconomics," Handbook of Macroeconomics, Elsevier, edition 1, volume 2, number 2.
    10. Marta Bańbura & Michele Modugno, 2014. "Maximum Likelihood Estimation Of Factor Models On Datasets With Arbitrary Pattern Of Missing Data," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 29(1), pages 133-160, January.
    11. Carl Eckart & Gale Young, 1936. "The approximation of one matrix by another of lower rank," Psychometrika, Springer;The Psychometric Society, vol. 1(3), pages 211-218, September.
    12. Jushan Bai, 2009. "Panel Data Models With Interactive Fixed Effects," Econometrica, Econometric Society, vol. 77(4), pages 1229-1279, July.
    13. Cheng Hsiao & H. Steve Ching & Shui Ki Wan, 2012. "A Panel Data Approach For Program Evaluation: Measuring The Benefits Of Political And Economic Integration Of Hong Kong With Mainland China," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 27(5), pages 705-740, August.
    14. Jin, Sainan & Miao, Ke & Su, Liangjun, 2021. "On factor models with random missing: EM estimation, inference, and cross validation," Journal of Econometrics, Elsevier, vol. 222(1), pages 745-777.
    15. Domenico Giannone & Lucrezia Reichlin & David Small, 2008. "Nowcasting: the real time informational content of macroeconomic data releases," ULB Institutional Repository 2013/6409, ULB -- Universite Libre de Bruxelles.
    16. Jungbacker, B. & Koopman, S.J. & van der Wel, M., 2011. "Maximum likelihood estimation for dynamic factor models with missing data," Journal of Economic Dynamics and Control, Elsevier, vol. 35(8), pages 1358-1368, August.
    17. Jushan Bai, 2003. "Inferential Theory for Factor Models of Large Dimensions," Econometrica, Econometric Society, vol. 71(1), pages 135-171, January.
    18. Bai, Jushan & Ng, Serena, 2019. "Rank regularized estimation of approximate factor models," Journal of Econometrics, Elsevier, vol. 212(1), pages 78-96.
    19. Stock J.H. & Watson M.W., 2002. "Forecasting Using Principal Components From a Large Number of Predictors," Journal of the American Statistical Association, American Statistical Association, vol. 97, pages 1167-1179, December.

    Citations

    Cited by:

    1. Cahan, Ercument & Bai, Jushan & Ng, Serena, 2023. "Factor-based imputation of missing values and covariances in panel data of large dimensions," Journal of Econometrics, Elsevier, vol. 233(1), pages 113-131.
    2. Callaway, Brantly & Karami, Sonia, 2023. "Treatment effects in interactive fixed effects models with a small number of time periods," Journal of Econometrics, Elsevier, vol. 233(1), pages 184-208.
    3. Alexandre Belloni & Mingli Chen & Oscar Hernan Madrid Padilla & Zixuan Wang, 2019. "High Dimensional Latent Panel Quantile Regression with an Application to Asset Pricing," Papers 1912.02151, arXiv.org, revised Aug 2022.
    4. Michael W. McCracken & Serena Ng, 2021. "FRED-QD: A Quarterly Database for Macroeconomic Research," Review, Federal Reserve Bank of St. Louis, vol. 103(1), pages 1-44, January.
    5. Anish Agarwal & Keegan Harris & Justin Whitehouse & Zhiwei Steven Wu, 2023. "Adaptive Principal Component Regression with Applications to Panel Data," Papers 2307.01357, arXiv.org, revised Oct 2023.
    6. Guido Imbens & Nathan Kallus & Xiaojie Mao, 2021. "Controlling for Unmeasured Confounding in Panel Data Using Minimal Bridge Functions: From Two-Way Fixed Effects to Factor Models," Papers 2108.03849, arXiv.org.
    7. Ruoxuan Xiong & Markus Pelger, 2019. "Large Dimensional Latent Factor Modeling with Missing Observations and Applications to Causal Inference," Papers 1910.08273, arXiv.org, revised Jan 2022.
    8. Joaqui-Barandica, Orlando & Manotas-Duque, Diego F. & Uribe, Jorge M., 2022. "Commonality, macroeconomic factors and banking profitability," The North American Journal of Economics and Finance, Elsevier, vol. 62(C).
    9. Magnac, Thierry, 2023. "Capital humain et recherche d'emploi: un mariage heureux - Human Capital and Search Models: A Happy Match," TSE Working Papers 23-1489, Toulouse School of Economics (TSE).
    10. FATUM, Rasmus & YAMAMOTO, Yohei & CHEN, Binwei, 2023. "The Trend Effect of Foreign Exchange Intervention," Discussion paper series HIAS-E-132, Hitotsubashi Institute for Advanced Study, Hitotsubashi University.
    11. Juho Koistinen & Bernd Funovits, 2022. "Estimation of Impulse-Response Functions with Dynamic Factor Models: A New Parametrization," Papers 2202.00310, arXiv.org, revised Feb 2022.
    12. Alexandre Bonnet R. Costa & Pedro Cavalcanti G. Ferreira & Wagner Piazza Gaglianone & Osmani Teixeira C. Guillén & João Victor Issler & Artur Brasil Fialho Rodrigues, 2023. "Predicting Recessions in (almost) Real Time in a Big-data Setting," Working Papers Series 587, Central Bank of Brazil, Research Department.
    13. Yinchu Zhu, 2019. "How well can we learn large factor models without assuming strong factors?," Papers 1910.10382, arXiv.org, revised Nov 2019.
    14. Chan, Joshua C.C. & Poon, Aubrey & Zhu, Dan, 2023. "High-dimensional conditionally Gaussian state space models with missing data," Journal of Econometrics, Elsevier, vol. 236(1).
    15. Jungjun Choi & Hyukjun Kwon & Yuan Liao, 2023. "Inference for Low-rank Completion without Sample Splitting with Application to Treatment Effect Estimation," Papers 2307.16370, arXiv.org.
    16. Serena Ng & Susannah Scanlan, 2023. "Constructing High Frequency Economic Indicators by Imputation," Papers 2303.01863, arXiv.org, revised Oct 2023.
    17. Alberto Abadie & Anish Agarwal & Raaz Dwivedi & Abhin Shah, 2024. "Doubly Robust Inference in Causal Latent Factor Models," Papers 2402.11652, arXiv.org.
    18. Luis Costa & Vivek F. Farias & Patricio Foncea & Jingyuan (Donna) Gan & Ayush Garg & Ivo Rosa Montenegro & Kumarjit Pathak & Tianyi Peng & Dusan Popovic, 2023. "Generalized Synthetic Control for TestOps at ABI: Models, Algorithms, and Infrastructure," Interfaces, INFORMS, vol. 53(5), pages 336-349, September.
    19. Zongwu Cai & Ying Fang & Ming Lin & Zixuan Wu, 2023. "A Quasi Synthetic Control Method for Nonlinear Models With High-Dimensional Covariates," WORKING PAPERS SERIES IN THEORETICAL AND APPLIED ECONOMICS 202305, University of Kansas, Department of Economics, revised Aug 2023.
    20. Albert Chiu & Xingchen Lan & Ziyi Liu & Yiqing Xu, 2023. "What To Do (and Not to Do) with Causal Panel Analysis under Parallel Trends: Lessons from A Large Reanalysis Study," Papers 2309.15983, arXiv.org.
    21. Xingyu Li & Yan Shen & Qiankun Zhou, 2022. "Confidence Intervals of Treatment Effects in Panel Data Models with Interactive Fixed Effects," Papers 2202.12078, arXiv.org.
    22. Jungjun Choi & Ming Yuan, 2023. "Matrix Completion When Missing Is Not at Random and Its Applications in Causal Panel Data Models," Papers 2308.02364, arXiv.org.
    23. Xiong, Ruoxuan & Pelger, Markus, 2023. "Large dimensional latent factor modeling with missing observations and applications to causal inference," Journal of Econometrics, Elsevier, vol. 233(1), pages 271-301.
    24. Brantly Callaway, 2022. "Difference-in-Differences for Policy Evaluation," Papers 2203.15646, arXiv.org.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ruoxuan Xiong & Markus Pelger, 2019. "Large Dimensional Latent Factor Modeling with Missing Observations and Applications to Causal Inference," Papers 1910.08273, arXiv.org, revised Jan 2022.
    2. Cahan, Ercument & Bai, Jushan & Ng, Serena, 2023. "Factor-based imputation of missing values and covariances in panel data of large dimensions," Journal of Econometrics, Elsevier, vol. 233(1), pages 113-131.
    3. Poncela, Pilar & Ruiz, Esther & Miranda, Karen, 2021. "Factor extraction using Kalman filter and smoothing: This is not just another survey," International Journal of Forecasting, Elsevier, vol. 37(4), pages 1399-1425.
    4. Matteo Barigozzi & Matteo Luciani, 2019. "Quasi Maximum Likelihood Estimation and Inference of Large Approximate Dynamic Factor Models via the EM algorithm," Papers 1910.03821, arXiv.org, revised Feb 2022.
    5. Xiong, Ruoxuan & Pelger, Markus, 2023. "Large dimensional latent factor modeling with missing observations and applications to causal inference," Journal of Econometrics, Elsevier, vol. 233(1), pages 271-301.
    6. Stock, J.H. & Watson, M.W., 2016. "Dynamic Factor Models, Factor-Augmented Vector Autoregressions, and Structural Vector Autoregressions in Macroeconomics," Handbook of Macroeconomics, in: J. B. Taylor & Harald Uhlig (ed.), Handbook of Macroeconomics, edition 1, volume 2, chapter 0, pages 415-525, Elsevier.
    7. Catherine Doz & Peter Fuleky, 2019. "Dynamic Factor Models," Working Papers 2019-4, University of Hawaii Economic Research Organization, University of Hawaii at Manoa.
    8. Catherine Doz & Peter Fuleky, 2019. "Dynamic Factor Models," PSE Working Papers halshs-02262202, HAL.
    9. Catherine Doz & Peter Fuleky, 2019. "Dynamic Factor Models," Working Papers halshs-02262202, HAL.
    10. Kaufmann, Daniel & Scheufele, Rolf, 2017. "Business tendency surveys and macroeconomic fluctuations," International Journal of Forecasting, Elsevier, vol. 33(4), pages 878-893.
    11. Jin, Sainan & Miao, Ke & Su, Liangjun, 2021. "On factor models with random missing: EM estimation, inference, and cross validation," Journal of Econometrics, Elsevier, vol. 222(1), pages 745-777.
    12. Pilar Poncela & Esther Ruiz, 2016. "Small- Versus Big-Data Factor Extraction in Dynamic Factor Models: An Empirical Assessment," Advances in Econometrics, in: Dynamic Factor Models, volume 35, pages 401-434, Emerald Group Publishing Limited.
    13. Marcellino, Massimiliano & Sivec, Vasja, 2016. "Monetary, fiscal and oil shocks: Evidence based on mixed frequency structural FAVARs," Journal of Econometrics, Elsevier, vol. 193(2), pages 335-348.
    14. Monica Defend & Aleksey Min & Lorenzo Portelli & Franz Ramsauer & Francesco Sandrini & Rudi Zagst, 2021. "Quantifying Drivers of Forecasted Returns Using Approximate Dynamic Factor Models for Mixed-Frequency Panel Data," Forecasting, MDPI, vol. 3(1), pages 1-35, February.
    15. Matteo Barigozzi & Matteo Luciani, 2017. "Common Factors, Trends, and Cycles in Large Datasets," Finance and Economics Discussion Series 2017-111, Board of Governors of the Federal Reserve System (U.S.).
    16. Ma, Tao & Zhou, Zhou & Antoniou, Constantinos, 2018. "Dynamic factor model for network traffic state forecast," Transportation Research Part B: Methodological, Elsevier, vol. 118(C), pages 281-317.
    17. Jonas Krampe & Luca Margaritella, 2021. "Factor Models with Sparse VAR Idiosyncratic Components," Papers 2112.07149, arXiv.org, revised May 2022.
    18. Hindrayanto, Irma & Koopman, Siem Jan & de Winter, Jasper, 2016. "Forecasting and nowcasting economic growth in the euro area using factor models," International Journal of Forecasting, Elsevier, vol. 32(4), pages 1284-1305.
    19. Alvarez, Rocio & Camacho, Maximo & Perez-Quiros, Gabriel, 2016. "Aggregate versus disaggregate information in dynamic factor models," International Journal of Forecasting, Elsevier, vol. 32(3), pages 680-694.
    20. Modugno, Michele & Soybilgen, Barış & Yazgan, Ege, 2016. "Nowcasting Turkish GDP and news decomposition," International Journal of Forecasting, Elsevier, vol. 32(4), pages 1369-1384.
