IDEAS home Printed from https://ideas.repec.org/p/arx/papers/2511.08303.html

Semi-Supervised Treatment Effect Estimation with Unlabeled Covariates via Generalized Riesz Regression

Author

Listed:
  • Masahiro Kato

Abstract

This study investigates treatment effect estimation in the semi-supervised setting, where we can use not only the standard triple of covariates, treatment indicator, and outcome, but also unlabeled auxiliary covariates. For this problem, we develop efficiency bounds and efficient estimators whose asymptotic variance aligns with the efficiency bound. In the analysis, we introduce two different data-generating processes: the one-sample setting and the two-sample setting. The one-sample setting considers the case where we can observe treatment indicators and outcomes for a part of the dataset, which is also called the censoring setting. In contrast, the two-sample setting considers two independent datasets with labeled and unlabeled data, which is also called the case-control setting or the stratified setting. In both settings, we find that by incorporating auxiliary covariates, we can lower the efficiency bound and obtain an estimator with an asymptotic variance smaller than that without such auxiliary covariates.

Suggested Citation

  • Masahiro Kato, 2025. "Semi-Supervised Treatment Effect Estimation with Unlabeled Covariates via Generalized Riesz Regression," Papers 2511.08303, arXiv.org.
  • Handle: RePEc:arx:papers:2511.08303
    as

    Download full text from publisher

    File URL: http://arxiv.org/pdf/2511.08303
    File Function: Latest version
    Download Restriction: no
    ---><---

    References listed on IDEAS

    as
    1. David Azriel & Lawrence D. Brown & Michael Sklar & Richard Berk & Andreas Buja & Linda Zhao, 2022. "Semi-Supervised Linear Regression," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 117(540), pages 2238-2251, October.
    2. Heckman, James J, 1974. "Shadow Prices, Market Wages, and Labor Supply," Econometrica, Econometric Society, vol. 42(4), pages 679-694, July.
    3. Imai, Kosuke & Strauss, Aaron, 2011. "Estimation of Heterogeneous Treatment Effects from Randomized Experiments, with Application to the Optimal Planning of the Get-Out-the-Vote Campaign," Political Analysis, Cambridge University Press, vol. 19(1), pages 1-19, January.
    4. Kennedy Edward H., 2020. "Efficient Nonparametric Causal Inference with Missing Exposure Information," The International Journal of Biostatistics, De Gruyter, vol. 16(1), pages 1-11, May.
    5. Stefan Wager & Susan Athey, 2018. "Estimation and Inference of Heterogeneous Treatment Effects using Random Forests," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 113(523), pages 1228-1242, July.
    6. Ruohan Zhan & Zhimei Ren & Susan Athey & Zhengyuan Zhou, 2024. "Policy Learning with Adaptively Collected Data," Management Science, INFORMS, vol. 70(8), pages 5270-5297, August.
    7. Hainmueller, Jens, 2012. "Entropy Balancing for Causal Effects: A Multivariate Reweighting Method to Produce Balanced Samples in Observational Studies," Political Analysis, Cambridge University Press, vol. 20(1), pages 25-46, January.
    8. Zhexiao Lin & Peng Ding & Fang Han, 2023. "Estimation Based on Nearest Neighbor Matching: From Density Ratio to Average Treatment Effect," Econometrica, Econometric Society, vol. 91(6), pages 2187-2217, November.
    9. Masahiro Kato, 2025. "Direct Bias-Correction Term Estimation for Propensity Scores and Average Treatment Effect Estimation," Papers 2509.22122, arXiv.org.
    10. Jinyong Hahn, 1998. "On the Role of the Propensity Score in Efficient Semiparametric Estimation of Average Treatment Effects," Econometrica, Econometric Society, vol. 66(2), pages 315-332, March.
    11. Imbens, Guido W. & Lancaster, Tony, 1996. "Efficient estimation and stratified sampling," Journal of Econometrics, Elsevier, vol. 74(2), pages 289-318, October.
    12. Imbens,Guido W. & Rubin,Donald B., 2015. "Causal Inference for Statistics, Social, and Biomedical Sciences," Cambridge Books, Cambridge University Press, number 9780521885881, January.
    13. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2018. "Double/debiased machine learning for treatment and structural parameters," Econometrics Journal, Royal Economic Society, vol. 21(1), pages 1-68, February.
    14. Kennedy Edward H., 2020. "Efficient Nonparametric Causal Inference with Missing Exposure Information," The International Journal of Biostatistics, De Gruyter, vol. 16(1), pages 1-11, May.
    Full references (including those not matched with items on IDEAS)

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Aditya Ghosh & Dominik Rothenhausler, 2025. "Assumption-robust Causal Inference," Papers 2505.08729, arXiv.org, revised Jun 2025.
    2. Lin, Zhexiao & Han, Fang, 2025. "On regression-adjusted imputation estimators of average treatment effects," Journal of Econometrics, Elsevier, vol. 251(C).
    3. Ganesh Karapakula, 2023. "Stable Probability Weighting: Large-Sample and Finite-Sample Estimation and Inference Methods for Heterogeneous Causal Effects of Multivalued Treatments Under Limited Overlap," Papers 2301.05703, arXiv.org, revised Jan 2023.
    4. Sung Jae Jun & Sokbae Lee, 2024. "Causal Inference Under Outcome-Based Sampling with Monotonicity Assumptions," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 42(3), pages 998-1009, July.
    5. Agboola, Oluwagbenga David & Yu, Han, 2023. "Neighborhood-based cross fitting approach to treatment effects with high-dimensional data," Computational Statistics & Data Analysis, Elsevier, vol. 186(C).
    6. David M. Ritzwoller & Vasilis Syrgkanis, 2024. "Simultaneous Inference for Local Structural Parameters with Random Forests," Papers 2405.07860, arXiv.org, revised Sep 2024.
    7. Sung Jae Jun & Sokbae (Simon) Lee, 2020. "Causal inference in case-control studies," CeMMAP working papers CWP19/20, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    8. Rahul Singh & Liyuan Xu & Arthur Gretton, 2020. "Kernel Methods for Causal Functions: Dose, Heterogeneous, and Incremental Response Curves," Papers 2010.04855, arXiv.org, revised Oct 2022.
    9. Phillip Heiler, 2022. "Efficient Covariate Balancing for the Local Average Treatment Effect," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 40(4), pages 1569-1582, October.
    10. Nathan Kallus, 2022. "What's the Harm? Sharp Bounds on the Fraction Negatively Affected by Treatment," Papers 2205.10327, arXiv.org, revised Nov 2022.
    11. Lundberg, Ian & Brand, Jennie E. & Jeon, Nanum, 2022. "Researcher reasoning meets computational capacity: Machine learning for social science," SocArXiv s5zc8, Center for Open Science.
    12. Zhexiao Lin & Fang Han, 2022. "On regression-adjusted imputation estimators of the average treatment effect," Papers 2212.05424, arXiv.org, revised Jan 2023.
    13. Paul B. Ellickson & Wreetabrata Kar & James C. Reeder, 2023. "Estimating Marketing Component Effects: Double Machine Learning from Targeted Digital Promotions," Marketing Science, INFORMS, vol. 42(4), pages 704-728, July.
    14. Athey, Susan & Imbens, Guido W. & Metzger, Jonas & Munro, Evan, 2024. "Using Wasserstein Generative Adversarial Networks for the design of Monte Carlo simulations," Journal of Econometrics, Elsevier, vol. 240(2).
    15. Nicolaj N. Mühlbach, 2020. "Tree-based Synthetic Control Methods: Consequences of moving the US Embassy," CREATES Research Papers 2020-04, Department of Economics and Business Economics, Aarhus University.
    16. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP72/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    17. Ruoxuan Xiong & Allison Koenecke & Michael Powell & Zhu Shen & Joshua T. Vogelstein & Susan Athey, 2021. "Federated Causal Inference in Heterogeneous Observational Data," Papers 2107.11732, arXiv.org, revised Apr 2023.
    18. Davide Viviano & Jelena Bradic, 2019. "Synthetic learner: model-free inference on treatments over time," Papers 1904.01490, arXiv.org, revised Aug 2022.
    19. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP54/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    20. Yiyi Huo & Yingying Fan & Fang Han, 2023. "On the adaptation of causal forests to manifold data," Papers 2311.16486, arXiv.org, revised Dec 2023.

    More about this item

    NEP fields

    This paper has been announced in the following NEP Reports:

    Statistics

    Access and download statistics

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:arx:papers:2511.08303. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: arXiv administrators (email available below). General contact details of provider: http://arxiv.org/ .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.