Printed from https://ideas.repec.org/a/bla/jorssc/v70y2021i3p750-769.html

Adjusting for population differences using machine learning methods

Author

Listed:
  • Lauren Cappiello
  • Zhiwei Zhang
  • Changyu Shen
  • Neel M. Butala
  • Xinping Cui
  • Robert W. Yeh

Abstract

The use of real‐world data for medical treatment evaluation frequently requires adjusting for population differences. We consider this problem in the context of estimating mean outcomes and treatment differences in a well‐defined target population, using clinical data from a study population that overlaps with but differs from the target population in terms of patient characteristics. The current literature on this subject includes a variety of statistical methods, which generally require correct specification of at least one parametric regression model. In this article, we propose to use machine learning methods to estimate nuisance functions and incorporate the machine learning estimates into existing doubly robust estimators. This leads to nonparametric estimators that are √n‐consistent, asymptotically normal and asymptotically efficient under general conditions. Simulation results demonstrate that the proposed methods perform reasonably well in realistic settings. The methods are illustrated with a cardiology example concerning aortic stenosis.
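
The idea of a doubly robust estimator of a target-population mean can be illustrated with a minimal sketch. This is not the authors' estimator: it assumes a single discrete covariate, uses simple cell means and cell counts as stand-ins for the machine learning estimates of the nuisance functions, and omits any sample splitting. All function and variable names (fit_cell_means, fit_odds, dr_target_mean, study_x, target_x) are hypothetical. The estimate averages the outcome regression over the target sample and adds study residuals weighted by the odds of belonging to the target population.

```python
from collections import defaultdict


def fit_cell_means(xs, ys):
    """Outcome regression m(x) = E[Y | X = x, study], estimated by cell means."""
    total, count = defaultdict(float), defaultdict(int)
    for x, y in zip(xs, ys):
        total[x] += y
        count[x] += 1
    return {x: total[x] / count[x] for x in count}


def fit_odds(study_x, target_x):
    """Odds w(x) = P(target | X = x) / P(study | X = x), estimated by cell counts."""
    n_study, n_target = defaultdict(int), defaultdict(int)
    for x in study_x:
        n_study[x] += 1
    for x in target_x:
        n_target[x] += 1
    # Only covariate values seen in the study sample get a weight (overlap assumption).
    return {x: n_target[x] / n_study[x] for x in n_study}


def dr_target_mean(study_x, study_y, target_x, m, w):
    """Doubly robust estimate of the mean outcome in the target population.

    plug_in:    outcome regression predictions summed over the target sample
    correction: study residuals re-weighted by the target-versus-study odds
    """
    plug_in = sum(m[x] for x in target_x)
    correction = sum(w[x] * (y - m[x]) for x, y in zip(study_x, study_y))
    return (plug_in + correction) / len(target_x)


# Toy data (made up for illustration): outcomes are observed in the study sample only.
study_x = [0, 0, 1, 1, 1]
study_y = [1.0, 3.0, 4.0, 6.0, 8.0]
target_x = [0, 1, 1, 1]

m_hat = fit_cell_means(study_x, study_y)
w_hat = fit_odds(study_x, target_x)

# Both nuisance functions estimated from the data.
estimate = dr_target_mean(study_x, study_y, target_x, m_hat, w_hat)

# Deliberately wrong outcome model (constant zero) with the estimated odds.
wrong_m = {0: 0.0, 1: 0.0}
estimate_wrong_m = dr_target_mean(study_x, study_y, target_x, wrong_m, w_hat)

# Deliberately wrong odds (constant one) with the estimated outcome model.
wrong_w = {0: 1.0, 1: 1.0}
estimate_wrong_w = dr_target_mean(study_x, study_y, target_x, m_hat, wrong_w)

naive_study_mean = sum(study_y) / len(study_y)
print(estimate, estimate_wrong_m, estimate_wrong_w, naive_study_mean)
```

In this toy example all three doubly robust estimates coincide at 5.0, whereas the unadjusted study mean is 4.4. This is the double robustness property in miniature: the estimate survives a wrong outcome model or wrong odds, provided the other is adequate. With flexible learners in place of cell means, further conditions (for example on convergence rates) would typically be needed, and those are not shown here.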

Suggested Citation

  • Lauren Cappiello & Zhiwei Zhang & Changyu Shen & Neel M. Butala & Xinping Cui & Robert W. Yeh, 2021. "Adjusting for population differences using machine learning methods," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 70(3), pages 750-769, June.
  • Handle: RePEc:bla:jorssc:v:70:y:2021:i:3:p:750-769
    DOI: 10.1111/rssc.12486

    Download full text from publisher

    File URL: https://doi.org/10.1111/rssc.12486
    Download Restriction: no


    References listed on IDEAS

    1. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney K. Newey, 2016. "Double machine learning for treatment and causal parameters," CeMMAP working papers 49/16, Institute for Fiscal Studies.
    2. Kara E. Rudolph & Mark J. van der Laan, 2017. "Robust estimation of encouragement design intervention effects transported across sites," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 79(5), pages 1509-1525, November.
    3. Edward H. Kennedy, 2019. "Nonparametric Causal Effects Based on Incremental Propensity Score Interventions," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 114(526), pages 645-656, April.
    4. Zhiwei Zhang & Lei Nie & Guoxing Soon & Zonghui Hu, 2016. "New methods for treatment effect calibration, with applications to non-inferiority trials," Biometrics, The International Biometric Society, vol. 72(1), pages 20-29, March.
    5. James Signorovitch & Eric Wu & Andrew Yu & Charles Gerrits & Evan Kantor & Yanjun Bao & Shiraz Gupta & Parvez Mulani, 2010. "Comparative Effectiveness Without Head-to-Head Trials," PharmacoEconomics, Springer, vol. 28(10), pages 935-945, October.
    6. Jinyong Hahn, 1998. "On the Role of the Propensity Score in Efficient Semiparametric Estimation of Average Treatment Effects," Econometrica, Econometric Society, vol. 66(2), pages 315-332, March.
    7. Miguel A. Hernán & Babette Brumback & James M. Robins, 2001. "Marginal Structural Models to Estimate the Joint Causal Effect of Nonrandomized Treatments," Journal of the American Statistical Association, American Statistical Association, vol. 96, pages 440-448, June.
    8. Elizabeth A. Stuart & Stephen R. Cole & Catherine P. Bradshaw & Philip J. Leaf, 2011. "The use of propensity scores to assess the generalizability of results from randomized trials," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 174(2), pages 369-386, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Xinyu Li & Wang Miao & Fang Lu & Xiao‐Hua Zhou, 2023. "Improving efficiency of inference in clinical trials with external control data," Biometrics, The International Biometric Society, vol. 79(1), pages 394-403, March.
    2. Fan Li & Ashley L. Buchanan & Stephen R. Cole, 2022. "Generalizing trial evidence to target populations in non‐nested designs: Applications to AIDS clinical trials," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 71(3), pages 669-697, June.
    3. Kara E. Rudolph & Jonathan Levy & Mark J. van der Laan, 2021. "Transporting stochastic direct and indirect effects to new populations," Biometrics, The International Biometric Society, vol. 77(1), pages 197-211, March.
    4. Dasom Lee & Shu Yang & Lin Dong & Xiaofei Wang & Donglin Zeng & Jianwen Cai, 2023. "Improving trial generalizability using observational studies," Biometrics, The International Biometric Society, vol. 79(2), pages 1213-1225, June.
    5. Victor Chernozhukov & Vira Semenova, 2018. "Simultaneous inference for Best Linear Predictor of the Conditional Average Treatment Effect and other structural functions," CeMMAP working papers CWP40/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    6. Susan Athey & Stefan Wager, 2021. "Policy Learning With Observational Data," Econometrica, Econometric Society, vol. 89(1), pages 133-161, January.
    7. Susan Athey & Guido W. Imbens, 2017. "The State of Applied Econometrics: Causality and Policy Evaluation," Journal of Economic Perspectives, American Economic Association, vol. 31(2), pages 3-32, Spring.
    8. Jacqueline A. Mauro & Edward H. Kennedy & Daniel Nagin, 2020. "Instrumental variable methods using dynamic interventions," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 183(4), pages 1523-1551, October.
    9. Rui Chen & Guanhua Chen & Menggang Yu, 2023. "Entropy balancing for causal generalization with target sample summary information," Biometrics, The International Biometric Society, vol. 79(4), pages 3179-3190, December.
    10. Ranjbar, Setareh & Salvati, Nicola & Pacini, Barbara, 2023. "Estimating heterogeneous causal effects in observational studies using small area predictors," Computational Statistics & Data Analysis, Elsevier, vol. 184(C).
    11. Isaiah Andrews & Emily Oster, 2017. "A Simple Approximation for Evaluating External Validity Bias," NBER Working Papers 23826, National Bureau of Economic Research, Inc.
    12. Michael Lechner, 2004. "Sequential Matching Estimation of Dynamic Causal Models," University of St. Gallen Department of Economics working paper series 2004 2004-06, Department of Economics, University of St. Gallen.
    13. Victor Chernozhukov & Juan Carlos Escanciano & Hidehiko Ichimura & Whitney K. Newey & James M. Robins, 2022. "Locally Robust Semiparametric Estimation," Econometrica, Econometric Society, vol. 90(4), pages 1501-1535, July.
    14. Sookyo Jeong & Hongseok Namkoong, 2020. "Assessing External Validity Over Worst-case Subpopulations," Papers 2007.02411, arXiv.org, revised Feb 2022.
    15. Dettmann, E. & Becker, C. & Schmeißer, C., 2011. "Distance functions for matching in small samples," Computational Statistics & Data Analysis, Elsevier, vol. 55(5), pages 1942-1960, May.
    16. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP72/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    17. Alexander Hijzen & Sébastien Jean & Thierry Mayer, 2011. "The effects at home of initiating production abroad: evidence from matched French firms," Review of World Economics (Weltwirtschaftliches Archiv), Springer;Institut für Weltwirtschaft (Kiel Institute for the World Economy), vol. 147(3), pages 457-483, September.
    18. Sant’Anna, Pedro H.C. & Zhao, Jun, 2020. "Doubly robust difference-in-differences estimators," Journal of Econometrics, Elsevier, vol. 219(1), pages 101-122.
    19. Kitagawa, Toru & Muris, Chris, 2016. "Model averaging in semiparametric estimation of treatment effects," Journal of Econometrics, Elsevier, vol. 193(1), pages 271-289.
    20. Fabian Kosse & Thomas Deckers & Pia Pinger & Hannah Schildberg-Hörisch & Armin Falk, 2020. "The Formation of Prosociality: Causal Evidence on the Role of Social Environment," Journal of Political Economy, University of Chicago Press, vol. 128(2), pages 434-467.
