Printed from https://ideas.repec.org/p/arx/papers/2603.24833.html

Robust Matrix Estimation with Side Information

Author

Listed:
  • Anish Agarwal
  • Jungjun Choi
  • Ming Yuan

Abstract

We introduce a flexible framework for high-dimensional matrix estimation to incorporate side information for both rows and columns. Existing approaches, such as inductive matrix completion, often impose restrictive structure (for example, an exact low-rank covariate interaction term, linear covariate effects, and limited ability to exploit components explained only by one side, row or column, or by neither) and frequently omit an explicit noise component. To address these limitations, we propose to decompose the underlying matrix as the sum of four complementary components: a (possibly nonlinear) interaction between row and column characteristics; a row characteristic-driven component; a column characteristic-driven component; and a residual low-rank structure unexplained by observed characteristics. By combining sieve-based projection with nuclear-norm penalization, each component can be estimated separately, and the estimated components can then be aggregated to yield a final estimate. We derive convergence rates that highlight robustness across a range of model configurations depending on the informativeness of the side information. We further extend the method to partially observed matrices under both missing-at-random and missing-not-at-random mechanisms, including block-missing patterns motivated by causal panel data. Simulations and a real-data application to tobacco sales show that leveraging side information improves imputation accuracy and can enhance treatment-effect estimation relative to standard low-rank and spectral-based alternatives.
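
The four-component decomposition described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: it is one plausible reading, assuming a single scalar covariate per row and per column, a polynomial sieve basis, one common threshold `tau` for every component, and a fully observed matrix. The names `sieve_basis`, `projector`, `svt`, and `four_part_estimate` are hypothetical. The sketch projects the observed matrix onto and off the spans of the row and column bases, which splits it exactly into four pieces, and then applies singular value soft-thresholding (the proximal map of the nuclear norm) to each piece before summing.

```python
import numpy as np


def sieve_basis(x, degree):
    """Polynomial sieve basis [1, x, ..., x^degree] (an illustrative choice)."""
    return np.vander(np.asarray(x, dtype=float), degree + 1, increasing=True)


def projector(basis):
    """Orthogonal projector onto the column space of a full-column-rank basis."""
    q, _ = np.linalg.qr(basis)
    return q @ q.T


def svt(a, tau):
    """Soft-threshold the singular values of `a` by `tau` (nuclear-norm prox)."""
    u, s, vt = np.linalg.svd(a, full_matrices=False)
    return (u * np.maximum(s - tau, 0.0)) @ vt


def four_part_estimate(y, x_row, z_col, degree=3, tau=1.0):
    """Split `y` into four projected pieces, shrink each, and aggregate.

    Assumption: the same `tau` is used for all pieces; a real procedure
    would likely tune the penalty per component.
    """
    p_row = projector(sieve_basis(x_row, degree))
    p_col = projector(sieve_basis(z_col, degree))
    q_row = np.eye(y.shape[0]) - p_row  # complement of the row-covariate span
    q_col = np.eye(y.shape[1]) - p_col  # complement of the column-covariate span
    parts = {
        "interaction": p_row @ y @ p_col,  # explained by both sides
        "row_driven": p_row @ y @ q_col,   # explained by row covariates only
        "col_driven": q_row @ y @ p_col,   # explained by column covariates only
        "residual": q_row @ y @ q_col,     # explained by neither
    }
    estimates = {name: svt(piece, tau) for name, piece in parts.items()}
    return estimates, sum(estimates.values())
```

Calling `four_part_estimate(y, x_row, z_col, tau=0.0)` returns pieces that sum back to `y`, because the four projections form an exact partition of the matrix; a positive `tau` shrinks each piece toward low rank. Missing entries, which the abstract also covers, are outside the scope of this sketch.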

Suggested Citation

  • Anish Agarwal & Jungjun Choi & Ming Yuan, 2026. "Robust Matrix Estimation with Side Information," Papers 2603.24833, arXiv.org.
  • Handle: RePEc:arx:papers:2603.24833

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2603.24833
    File Function: Latest version
    Download Restriction: no

    References listed on IDEAS

    1. Victor Chernozhukov & Christian Hansen & Yuan Liao & Yinchu Zhu, 2021. "Inference for Low-Rank Models," Papers 2107.02602, arXiv.org, revised Jan 2023.
    2. Yunzhang Zhu & Xiaotong Shen & Changqing Ye, 2016. "Personalized Prediction and Sparsity Pursuit in Latent Factor Models," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(513), pages 241-252, March.
    3. Qihui Chen & Nikolai Roussanov & Xiaoliang Wang, 2023. "Semiparametric Conditional Factor Models: Estimation and Inference," NBER Working Papers 31817, National Bureau of Economic Research, Inc.
    4. Abadie, Alberto & Diamond, Alexis & Hainmueller, Jens, 2010. "Synthetic Control Methods for Comparative Case Studies: Estimating the Effect of California’s Tobacco Control Program," Journal of the American Statistical Association, American Statistical Association, vol. 105(490), pages 493-505.
    5. Susan Athey & Mohsen Bayati & Nikolay Doudchenko & Guido Imbens & Khashayar Khosravi, 2021. "Matrix Completion Methods for Causal Panel Data Models," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 116(536), pages 1716-1730, October.
    6. Xiaojun Mao & Song Xi Chen & Raymond K. W. Wong, 2019. "Matrix Completion With Covariate Information," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 114(525), pages 198-210, January.
    7. Jushan Bai & Serena Ng, 2021. "Matrix Completion, Counterfactuals, and Factor Analysis of Missing Data," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 116(536), pages 1746-1763, October.
    8. Shujie Ma & Po-Yao Niu & Yichong Zhang & Yinchu Zhu, 2025. "Statistical Inference For Noisy Matrix Completion Incorporating Auxiliary Information," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 120(549), pages 343-355, January.
    9. Seung C. Ahn & Alex R. Horenstein, 2013. "Eigenvalue Ratio Test for the Number of Factors," Econometrica, Econometric Society, vol. 81(3), pages 1203-1227, May.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jungjun Choi & Ming Yuan, 2023. "Matrix Completion When Missing Is Not at Random and Its Applications in Causal Panel Data Models," Papers 2308.02364, arXiv.org.
    2. Xiong, Ruoxuan & Pelger, Markus, 2023. "Large dimensional latent factor modeling with missing observations and applications to causal inference," Journal of Econometrics, Elsevier, vol. 233(1), pages 271-301.
    3. Zhentao Shi & Jin Xi & Haitian Xie, 2025. "A Synthetic Business Cycle Approach to Counterfactual Analysis with Nonstationary Macroeconomic Data," Papers 2505.22388, arXiv.org.
    4. Alberto Abadie & Anish Agarwal & Raaz Dwivedi & Abhin Shah, 2024. "Doubly Robust Inference in Causal Latent Factor Models," Papers 2402.11652, arXiv.org, revised Oct 2024.
    5. Choi, Jungjun & Kwon, Hyukjun & Liao, Yuan, 2024. "Inference for low-rank completion without sample splitting with application to treatment effect estimation," Journal of Econometrics, Elsevier, vol. 240(1).
    6. Rafaty, Ryan & Dolphin, Geoffroy & Pretis, Felix, 2025. "Carbon pricing and the elasticity of CO2 emissions," Energy Economics, Elsevier, vol. 144(C).
    7. Ryan Rafaty & Geoffroy Dolphin & Felix Pretis, 2020. "Carbon pricing and the elasticity of CO2 emissions," Working Papers EPRG2035, Energy Policy Research Group, Cambridge Judge Business School, University of Cambridge.
    8. Retsef Levi & Elisabeth Paulson & Georgia Perakis & Emily Zhang, 2024. "Heterogeneous Treatment Effects in Panel Data," Papers 2406.05633, arXiv.org.
    9. Callaway, Brantly & Karami, Sonia, 2023. "Treatment effects in interactive fixed effects models with a small number of time periods," Journal of Econometrics, Elsevier, vol. 233(1), pages 184-208.
    10. Fatum, Rasmus & Yamamoto, Yohei & Chen, Binwei, 2025. "The trend effect of foreign exchange intervention," Journal of International Money and Finance, Elsevier, vol. 156(C).
    11. Luis Costa & Vivek F. Farias & Patricio Foncea & Jingyuan (Donna) Gan & Ayush Garg & Ivo Rosa Montenegro & Kumarjit Pathak & Tianyi Peng & Dusan Popovic, 2023. "Generalized Synthetic Control for TestOps at ABI: Models, Algorithms, and Infrastructure," Interfaces, INFORMS, vol. 53(5), pages 336-349, September.
    12. Su, Liangjun & Jin, Sainan & Wang, Xia, 2025. "Sieve estimation of state-varying factor models," Journal of Econometrics, Elsevier, vol. 251(C).
    13. Marc K. Chan & Simon S. Kwok, 2022. "The PCDID Approach: Difference-in-Differences When Trends Are Potentially Unparallel and Stochastic," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 40(3), pages 1216-1233, June.
    14. Taehyeon Koo & Zijian Guo, 2025. "Distributionally Robust Synthetic Control: Ensuring Robustness Against Highly Correlated Controls and Weight Shifts," Papers 2511.02632, arXiv.org, revised Jan 2026.
    15. Li, Xingyu & Shen, Yan & Zhou, Qiankun, 2024. "Confidence intervals of treatment effects in panel data models with interactive fixed effects," Journal of Econometrics, Elsevier, vol. 240(1).
    16. Alberto Abadie & Anish Agarwal & Devavrat Shah, 2025. "A Causal Inference Framework for Data Rich Environments," Papers 2504.01702, arXiv.org.
    17. Su, Liangjun & Wang, Fa, 2025. "Inference for large dimensional factor models under general missing data patterns," Journal of Econometrics, Elsevier, vol. 250(C).
    18. Nicolaj N. Mühlbach, 2020. "Tree-based Synthetic Control Methods: Consequences of moving the US Embassy," CREATES Research Papers 2020-04, Department of Economics and Business Economics, Aarhus University.
    19. Arne Henningsen & Guy Low & David Wuepper & Tobias Dalhaus & Hugo Storm & Dagim Belay & Stefan Hirsch, 2024. "Estimating Causal Effects with Observational Data: Guidelines for Agricultural and Applied Economists," IFRO Working Paper 2024/03, University of Copenhagen, Department of Food and Resource Economics.
    20. Davide Viviano & Jelena Bradic, 2019. "Synthetic learner: model-free inference on treatments over time," Papers 1904.01490, arXiv.org, revised Aug 2022.
