
Canonical correlation regression with noisy data

Author

Listed:
  • Isaac Meza
  • Rahul Singh

Abstract

We study instrumental variable regression in data-rich environments. The goal is to estimate a linear model from many noisy covariates and many noisy instruments. Our key assumption is that the true covariates and true instruments are repetitive, though possibly different in nature: each set reflects a few underlying factors, but those underlying factors may be misaligned. We analyze a family of estimators based on two-stage least squares with spectral regularization: canonical correlations between covariates and instruments are learned in the first stage and then used as regressors in the second stage. As a theoretical contribution, we derive upper and lower bounds on estimation error, proving optimality of the method with noisy data. As a practical contribution, we provide guidance on which types of spectral regularization to use in different regimes.
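
The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' estimator: it assumes centered data, uses a hard cutoff (keeping only the leading singular directions and the largest canonical correlations) as one possible member of the family of spectral regularizers, and every name in it (`cca_2sls`, `rank_x`, `rank_z`, `n_canonical`) is hypothetical.

```python
import numpy as np


def top_left_singular_vectors(M, rank):
    """Orthonormal basis for the leading `rank` left singular directions of M."""
    U, _, _ = np.linalg.svd(M, full_matrices=False)
    return U[:, :rank]


def cca_2sls(y, X, Z, rank_x, rank_z, n_canonical):
    """Hard-cutoff sketch of two-stage least squares via canonical correlations.

    y: (n,) outcome; X: (n, p) noisy covariates; Z: (n, q) noisy instruments.
    Inputs are assumed centered. Hypothetical helper, not the paper's code.
    """
    # Denoise each block by keeping only its leading singular directions.
    Ux = top_left_singular_vectors(X, rank_x)
    Uz = top_left_singular_vectors(Z, rank_z)

    # Canonical correlations are the singular values of Uz' Ux.
    A, corr, _ = np.linalg.svd(Uz.T @ Ux, full_matrices=False)

    # First stage: keep the instrument-side canonical variates with the
    # largest canonical correlations; W has orthonormal columns.
    W = Uz @ A[:, :n_canonical]
    C = W.T @ X  # coordinates of the fitted covariates W @ C

    # Second stage: minimum-norm least squares of y on the fitted covariates.
    # Because W is orthonormal, regressing y on W @ C equals solving C b = W'y.
    beta_hat = np.linalg.lstsq(C, W.T @ y, rcond=None)[0]
    return beta_hat, corr


# Simulated usage: two instrument factors, endogenous covariate factors.
rng = np.random.default_rng(0)
n, p, q, r = 500, 40, 30, 2
f_z = rng.normal(size=(n, r))                           # instrument factors
u = rng.normal(size=n)                                  # structural error
f_x = f_z @ rng.normal(size=(r, r)) + 0.5 * u[:, None]  # correlated with u
X_true = f_x @ rng.normal(size=(r, p))
Z_true = f_z @ rng.normal(size=(r, q))
y = X_true @ rng.normal(size=p) + u
X = X_true + 0.1 * rng.normal(size=(n, p))              # noisy covariates
Z = Z_true + 0.1 * rng.normal(size=(n, q))              # noisy instruments

beta_hat, corr = cca_2sls(y, X, Z, rank_x=r, rank_z=r, n_canonical=r)
print(beta_hat.shape, corr)
```

When no truncation binds (full ranks and all canonical variates kept), this sketch reduces to ordinary two-stage least squares. A softer regularizer, such as ridge-style shrinkage of the spectrum, could replace the hard cutoff; which choice suits which regime is the question the abstract says the paper addresses.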

Suggested Citation

  • Isaac Meza & Rahul Singh, 2025. "Canonical correlation regression with noisy data," Papers 2512.22697, arXiv.org.
  • Handle: RePEc:arx:papers:2512.22697

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2512.22697
    File Function: Latest version
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Dennis Lim & Wenjie Wang & Yichong Zhang, 2024. "A Dimension-Agnostic Bootstrap Anderson-Rubin Test For Instrumental Variable Regressions," Papers 2412.01603, arXiv.org, revised Sep 2025.
    2. Marine Carrasco & Guy Tchuente, 2016. "Efficient Estimation with Many Weak Instruments Using Regularization Techniques," Econometric Reviews, Taylor & Francis Journals, vol. 35(8-10), pages 1609-1637, December.
    3. Wang, Wenjie & Zhang, Yichong, 2024. "Wild bootstrap inference for instrumental variables regressions with weak and few clusters," Journal of Econometrics, Elsevier, vol. 241(1).
    4. Dennis Lim & Wenjie Wang & Yichong Zhang, 2022. "A Conditional Linear Combination Test with Many Weak Instruments," Papers 2207.11137, arXiv.org, revised Apr 2023.
    5. Anna Mikusheva & Liyang Sun, 2024. "Weak identification with many instruments," The Econometrics Journal, Royal Economic Society, vol. 27(2), pages -28.
    6. Guy Tchuente, 2019. "Weak Identification and Estimation of Social Interaction Models," Papers 1902.06143, arXiv.org.
    7. Hansen, Christian & Kozbur, Damian, 2014. "Instrumental variables estimation with many weak instruments using regularized JIVE," Journal of Econometrics, Elsevier, vol. 182(2), pages 290-308.
    8. Philipp Gersing & Matteo Barigozzi & Christoph Rust & Manfred Deistler, 2023. "The Canonical Decomposition of Factor Models: Weak Factors are Everywhere," Papers 2307.10067, arXiv.org, revised Feb 2025.
    9. Wenze Li, 2025. "An Empirical Comparison of Weak-IV-Robust Procedures in Just-Identified Models," Papers 2506.18001, arXiv.org.
    10. Jianqing Fan & Kunpeng Li & Yuan Liao, 2020. "Recent Developments on Factor Models and its Applications in Econometric Learning," Papers 2009.10103, arXiv.org.
    11. Lim, Dennis & Wang, Wenjie & Zhang, Yichong, 2024. "A conditional linear combination test with many weak instruments," Journal of Econometrics, Elsevier, vol. 238(2).
    12. Antoine, Bertille & Lavergne, Pascal, 2023. "Identification-robust nonparametric inference in a linear IV model," Journal of Econometrics, Elsevier, vol. 235(1), pages 1-24.
    13. Yuan Liao & Xiye Yang, 2017. "Uniform Inference for Conditional Factor Models with Instrumental and Idiosyncratic Betas," Departmental Working Papers 201711, Rutgers University, Department of Economics.
    14. Hansen, Christian & Liao, Yuan, 2019. "The Factor-Lasso And K-Step Bootstrap Approach For Inference In High-Dimensional Economic Applications," Econometric Theory, Cambridge University Press, vol. 35(3), pages 465-509, June.
    15. Liyu Dou & Pengjin Min & Wenjie Wang & Yichong Zhang, 2025. "An Improved Inference for IV Regressions," Papers 2506.23816, arXiv.org, revised Mar 2026.
    16. Wang, Wenjie, 2021. "Wild Bootstrap for Instrumental Variables Regression with Weak Instruments and Few Clusters," MPRA Paper 106227, University Library of Munich, Germany.
    17. Bai, Jushan & Ng, Serena, 2013. "Principal components estimation and identification of static factors," Journal of Econometrics, Elsevier, vol. 176(1), pages 18-29.
    18. Stock, J.H. & Watson, M.W., 2016. "Dynamic Factor Models, Factor-Augmented Vector Autoregressions, and Structural Vector Autoregressions in Macroeconomics," Handbook of Macroeconomics, in: J. B. Taylor & Harald Uhlig (ed.), Handbook of Macroeconomics, edition 1, volume 2, chapter 0, pages 415-525, Elsevier.
    19. Asongu, Simplice A. & Andrés, Antonio R., 2020. "Trajectories of knowledge economy in SSA and MENA countries," Technology in Society, Elsevier, vol. 63(C).
    20. Nandana Sengupta & Fallaw Sowell, 2020. "On the Asymptotic Distribution of Ridge Regression Estimators Using Training and Test Samples," Econometrics, MDPI, vol. 8(4), pages 1-25, October.
