
Propensity Score Methods for Causal Inference: On the Relative Importance of Covariate Selection, Reliable Measurement, and Choice of Propensity Score Technique

Author

Listed:
  • Peter M. Steiner

    (University of Wisconsin–Madison)

Abstract

The popularity of propensity score (PS) methods for estimating causal treatment effects from observational studies has increased over the past decades. However, the success of these methods in removing selection bias rests mainly on strong assumptions, such as the strong ignorability assumption, and on the competent implementation of a specific propensity score technique. After a brief introduction to the Rubin Causal Model and different types of propensity score techniques, the paper assesses the relative importance of three factors in removing selection bias in practice: (i) the availability of covariates that are related to both the selection process and the outcome under investigation; (ii) the reliability of the covariates’ measurements; and (iii) the choice of a specific analytic method for estimating the treatment effect, either a specific propensity score technique (PS matching, PS stratification, inverse-propensity weighting, or PS regression adjustment) or a standard regression approach. The importance of these three factors is investigated by reviewing different within-study comparisons and meta-analyses. Within-study comparisons enable an empirical assessment of PS methods’ performance in removing selection bias because they contrast the treatment effect estimated from an observational study with the estimate from a corresponding randomized experiment. The empirical evidence indicates that the selection of covariates counts most in reducing selection bias, their reliable measurement next most, and the mode of data analysis, whether a specific propensity score technique or standard regression, least. Additional evidence suggests that the crucial strong ignorability assumption is most likely met when pretest measures of the outcome, or constructs that directly determine the selection process, are available and reliably measured.
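As a point of reference for the techniques named in the abstract, the sketch below illustrates inverse-propensity weighting, one of the PS techniques the paper compares. It is an illustrative sketch only, not code from the paper: under strong ignorability (potential outcomes independent of treatment assignment given the observed covariates), the propensity score P(T = 1 | X) is estimated, here with a logistic regression, and used to reweight treated and control units before comparing their mean outcomes. All names (ipw_ate, X, t, y) are hypothetical.

    # Illustrative sketch only (not from the paper): inverse-propensity weighting (IPW),
    # one of the propensity score techniques compared in the abstract.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def ipw_ate(X, t, y):
        """Estimate an average treatment effect by inverse-propensity weighting."""
        # Model the selection process: the propensity score is P(T = 1 | X).
        ps = LogisticRegression(max_iter=1000).fit(X, t).predict_proba(X)[:, 1]
        ps = np.clip(ps, 0.01, 0.99)  # trim extreme scores to avoid unstable weights
        # Reweight units by the inverse probability of the treatment they received;
        # a unit's weight is zero for the group it does not belong to.
        treated_mean = np.average(y, weights=t / ps)
        control_mean = np.average(y, weights=(1 - t) / (1 - ps))
        return treated_mean - control_mean

The same estimated scores could instead feed PS matching, stratification, or regression adjustment; the paper's argument is that the choice among these techniques matters less than which covariates enter X and how reliably they are measured.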

Suggested Citation

  • Peter M. Steiner, 2011. "Propensity Score Methods for Causal Inference: On the Relative Importance of Covariate Selection, Reliable Measurement, and Choice of Propensity Score Technique," Working Papers 09, AlmaLaurea Inter-University Consortium.
  • Handle: RePEc:laa:wpaper:09

    Download full text from publisher

    File URL: http://www2.almalaurea.it/universita/pubblicazioni/wp/pdf/wp09.pdf
    File Function: First version, 2011
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Robin Jacob & Marie-Andree Somers & Pei Zhu & Howard Bloom, 2016. "The Validity of the Comparative Interrupted Time Series Design for Evaluating the Effect of School-Level Interventions," Evaluation Review, , vol. 40(3), pages 167-198, June.
    2. Henrik Hansen & Ninja Ritter Klejnstrup & Ole Winckler Andersen, 2011. "A Comparison of Model-based and Design-based Impact Evaluations of Interventions in Developing Countries," IFRO Working Paper 2011/16, University of Copenhagen, Department of Food and Resource Economics.
    3. Vivian C. Wong & Peter M. Steiner & Kylie L. Anglin, 2018. "What Can Be Learned From Empirical Evaluations of Nonexperimental Methods?," Evaluation Review, , vol. 42(2), pages 147-175, April.
    4. Travis St.Clair & Kelly Hallberg & Thomas D. Cook, 2016. "The Validity and Precision of the Comparative Interrupted Time-Series Design," Journal of Educational and Behavioral Statistics, , vol. 41(3), pages 269-299, June.
    5. Katherine Baicker & Theodore Svoronos, 2019. "Testing the Validity of the Single Interrupted Time Series Design," NBER Working Papers 26080, National Bureau of Economic Research, Inc.
    6. Flores, Carlos A. & Mitnik, Oscar A., 2009. "Evaluating Nonexperimental Estimators for Multiple Treatments: Evidence from Experimental Data," IZA Discussion Papers 4451, Institute of Labor Economics (IZA).
    7. Jeffrey Smith & Arthur Sweetman, 2016. "Viewpoint: Estimating the causal effects of policies and programs," Canadian Journal of Economics, Canadian Economics Association, vol. 49(3), pages 871-905, August.
    8. Ferraro, Paul J. & Miranda, Juan José, 2014. "The performance of non-experimental designs in the evaluation of environmental programs: A design-replication study using a large-scale randomized experiment as a benchmark," Journal of Economic Behavior & Organization, Elsevier, vol. 107(PA), pages 344-365.
    9. Ben Weidmann & Luke Miratrix, 2021. "Lurking Inferential Monsters? Quantifying Selection Bias In Evaluations Of School Programs," Journal of Policy Analysis and Management, John Wiley & Sons, Ltd., vol. 40(3), pages 964-986, June.
    10. Cousineau, Martin & Verter, Vedat & Murphy, Susan A. & Pineau, Joelle, 2023. "Estimating causal effects with optimization-based methods: A review and empirical comparison," European Journal of Operational Research, Elsevier, vol. 304(2), pages 367-380.
    11. Jones A.M & Rice N, 2009. "Econometric Evaluation of Health Policies," Health, Econometrics and Data Group (HEDG) Working Papers 09/09, HEDG, c/o Department of Economics, University of York.
    12. John Ammer & Sara B. Holland & David C. Smith & Francis E. Warnock, 2012. "U.S. International Equity Investment," Journal of Accounting Research, Wiley Blackwell, vol. 50(5), pages 1109-1139, December.
    13. Jason J. Sauppe & Sheldon H. Jacobson, 2017. "The role of covariate balance in observational studies," Naval Research Logistics (NRL), John Wiley & Sons, vol. 64(4), pages 323-344, June.
    14. John Ammer & Sara B. Holland & David C. Smith & Francis E. Warnock, 2006. "Look at Me Now: What Attracts U.S. Shareholders?," NBER Working Papers 12500, National Bureau of Economic Research, Inc.
    15. David M. Rindskopf & William R. Shadish & M. H. Clark, 2018. "Using Bayesian Correspondence Criteria to Compare Results From a Randomized Experiment and a Quasi-Experiment Allowing Self-Selection," Evaluation Review, , vol. 42(2), pages 248-280, April.
    16. Fortson, Kenneth & Gleason, Philip & Kopa, Emma & Verbitsky-Savitz, Natalya, 2015. "Horseshoes, hand grenades, and treatment effects? Reassessing whether nonexperimental estimators are biased," Economics of Education Review, Elsevier, vol. 44(C), pages 100-113.
    17. Bernard Black & Woochan Kim & Julia Nasev, 2021. "The Effect of Board Structure on Firm Disclosure and Behavior: A Case Study of Korea and a Comparison of Research Designs," Journal of Empirical Legal Studies, John Wiley & Sons, vol. 18(2), pages 328-376, June.
    18. Katherine Baicker & Theodore Svoronos, 2019. "Testing the Validity of the Single Interrupted Time Series Design," CID Working Papers 364, Center for International Development at Harvard University.
    19. Andrew P. Jaciw, 2016. "Applications of a Within-Study Comparison Approach for Evaluating Bias in Generalized Causal Inferences From Comparison Groups Studies," Evaluation Review, , vol. 40(3), pages 241-276, June.
    20. Martin Cousineau & Vedat Verter & Susan A. Murphy & Joelle Pineau, 2022. "Estimating causal effects with optimization-based methods: A review and empirical comparison," Papers 2203.00097, arXiv.org.


    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:laa:wpaper:09. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows you to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form.

    If you know of missing items citing this one, you can help us create those links by adding the relevant references in the same way as above, for each referring item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact the person in charge (email available below). General contact details of provider: http://www.almalaurea.it.

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.