Propensity Score Methods for Causal Inference: On the Relative Importance of Covariate Selection, Reliable Measurement, and Choice of Propensity Score Technique

My bibliography Save this paper

Propensity Score Methods for Causal Inference: On the Relative Importance of Covariate Selection, Reliable Measurement, and Choice of Propensity Score Technique

Author

Listed:

Peter M. Steiner
(University of Wisconsin–Madison)

Registered:

Abstract

The popularity of propensity score (PS) methods for estimating causal treatment effects from observational studies has increased during the past decades. However, the success of these methods in removing selection bias mainly rests on strong assumptions, like the strong ignorability assumption, and the competent implementation of a specific propensity score technique. After giving a brief introduction to the Rubin Causal Model and different types of propensity score techniques, the paper assess the relative importance of three factors in removing selection bias in practice: (i) The availability of covariates that are related to both the selection process and the outcome under investigation; (ii) The reliability of the covariates’ measurements; And (iii) the choice of a specific analytic method for estimating the treatment effect—either a specific propensity score technique (PS matching, PS stratification, inverse-propensity weighting, and PS regression adjustment) or standard regression approaches. The importance of these three factors is investigated by reviewing different within-study comparisons and meta-analyses. Within-study comparisons enable an empirical assessment of PS methods’ performance in removing selection bias since they contrast the estimated treatment effect from an observational study with an estimate from a corresponding randomized experiment. The empirical evidence indicates that the selection of covariates counts most in reducing selection bias, their reliable measurement next most, and the mode of data analysis—either a specific propensity score technique or standard regression—is of least importance. Additional evidence suggests that the crucial strong ignorability assumption is most likely met if pretest measures of the outcome or constructs that directly determine the selection process are available and reliably measured.

Suggested Citation

Peter M. Steiner, 2011. "Propensity Score Methods for Causal Inference: On the Relative Importance of Covariate Selection, Reliable Measurement, and Choice of Propensity Score Technique," Working Papers 09, AlmaLaurea Inter-University Consortium.

Handle: RePEc:laa:wpaper:09

Download full text from publisher

References listed on IDEAS

LaLonde, Robert J, 1986. "Evaluating the Econometric Evaluations of Training Programs with Experimental Data," American Economic Review, American Economic Association, vol. 76(4), pages 604-620, September.
- Robert J. LaLonde, 1984. "Evaluating the Econometric Evaluations of Training Programs with Experimental Data," Working Papers 563, Princeton University, Department of Economics, Industrial Relations Section..
David S. Lee & Thomas Lemieux, 2009. "Regression Discontinuity Designs In Economics," Working Papers 1118, Princeton University, Department of Economics, Industrial Relations Section..
repec:mpr:mprres:3694 is not listed on IDEAS
David S. Lee & Thomas Lemieux, 2010. "Regression Discontinuity Designs in Economics," Journal of Economic Literature, American Economic Association, vol. 48(2), pages 281-355, June.
- David S. Lee & Thomas Lemieux, 2009. "Regression Discontinuity Designs in Economics," Working Papers 1118, Princeton University, Department of Economics, Industrial Relations Section..
- David S. Lee & Thomas Lemieux, 2009. "Regression Discontinuity Designs in Economics," NBER Working Papers 14723, National Bureau of Economic Research, Inc.
Heller, Ruth & Rosenbaum, Paul R. & Small, Dylan S., 2009. "Split Samples and Design Sensitivity in Observational Studies," Journal of the American Statistical Association, American Statistical Association, vol. 104(487), pages 1090-1101.
Steven Glazerman & Dan M. Levy & David Myers, 2003. "Nonexperimental Versus Experimental Estimates of Earnings Impacts," The ANNALS of the American Academy of Political and Social Science, , vol. 589(1), pages 63-93, September.
- Steven Glazerman & Dan M. Levy & David Myers, "undated". "Nonexperimental Versus Experimental Estimates of Earnings Impacts," Mathematica Policy Research Reports 7c8bd68ac8db47caa57c70ee1, Mathematica Policy Research.
Heckman, James, 2013. "Sample selection bias as a specification error," Applied Econometrics, Russian Presidential Academy of National Economy and Public Administration (RANEPA), vol. 31(3), pages 129-137.
- Heckman, James J, 1979. "Sample Selection Bias as a Specification Error," Econometrica, Econometric Society, vol. 47(1), pages 153-161, January.
Shadish, William R. & Clark, M. H. & Steiner, Peter M., 2008. "Can Nonrandomized Experiments Yield Accurate Answers? A Randomized Experiment Comparing Random and Nonrandom Assignments," Journal of the American Statistical Association, American Statistical Association, vol. 103(484), pages 1334-1344.
Heckman, James J, 1974. "Shadow Prices, Market Wages, and Labor Supply," Econometrica, Econometric Society, vol. 42(4), pages 679-694, July.
Peter M. Steiner & Thomas D. Cook & William R. Shadish, 2011. "On the Importance of Reliable Covariate Measurement in Selection Bias Adjustments Using Propensity Scores," Journal of Educational and Behavioral Statistics, , vol. 36(2), pages 213-236, April.
Thomas D. Cook & William R. Shadish & Vivian C. Wong, 2008. "Three conditions under which experiments and observational studies produce comparable causal estimates: New findings from within-study comparisons," Journal of Policy Analysis and Management, John Wiley & Sons, Ltd., vol. 27(4), pages 724-750.
Guido W. Imbens, 2004. "Nonparametric Estimation of Average Treatment Effects Under Exogeneity: A Review," The Review of Economics and Statistics, MIT Press, vol. 86(1), pages 4-29, February.
- Guido W. Imbens, 2003. "Nonparametric Estimation of Average Treatment Effects under Exogeneity: A Review," NBER Technical Working Papers 0294, National Bureau of Economic Research, Inc.
Ho, Daniel E. & Imai, Kosuke & King, Gary & Stuart, Elizabeth A., 2007. "Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal Inference," Political Analysis, Cambridge University Press, vol. 15(3), pages 199-236, July.
Juan Jose Diaz & Sudhanshu Handa, 2006. "An Assessment of Propensity Score Matching as a Nonexperimental Impact Estimator: Evidence from Mexico’s PROGRESA Program," Journal of Human Resources, University of Wisconsin Press, vol. 41(2).
- Díaz, Juan José & Handa, Sudhanshu, 2005. "An Assessment of Propensity Score Matching as a Non Experimental Impact Estimator: Evidence from Mexico's PROGRESA Program," IDB Publications (Working Papers) 2999, Inter-American Development Bank.

Full references (including those not matched with items on IDEAS)

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Robin Jacob & Marie-Andree Somers & Pei Zhu & Howard Bloom, 2016. "The Validity of the Comparative Interrupted Time Series Design for Evaluating the Effect of School-Level Interventions," Evaluation Review, , vol. 40(3), pages 167-198, June.
Henrik Hansen & Ninja Ritter Klejnstrup & Ole Winckler Andersen, 2011. "A Comparison of Model-based and Design-based Impact Evaluations of Interventions in Developing Countries," IFRO Working Paper 2011/16, University of Copenhagen, Department of Food and Resource Economics.
Vivian C. Wong & Peter M. Steiner & Kylie L. Anglin, 2018. "What Can Be Learned From Empirical Evaluations of Nonexperimental Methods?," Evaluation Review, , vol. 42(2), pages 147-175, April.
Travis St.Clair & Kelly Hallberg & Thomas D. Cook, 2016. "The Validity and Precision of the Comparative Interrupted Time-Series Design," Journal of Educational and Behavioral Statistics, , vol. 41(3), pages 269-299, June.
Katherine Baicker & Theodore Svoronos, 2019. "Testing the Validity of the Single Interrupted Time Series Design," NBER Working Papers 26080, National Bureau of Economic Research, Inc.
Flores, Carlos A. & Mitnik, Oscar A., 2009. "Evaluating Nonexperimental Estimators for Multiple Treatments: Evidence from Experimental Data," IZA Discussion Papers 4451, Institute of Labor Economics (IZA).
- Carlos A. Flores & Oscar A. Mitnik, 2009. "Evaluating Nonexperimental Estimators for Multiple Treatments: Evidence from Experimental Data," Working Papers 2010-9, University of Miami, Department of Economics.
- Carlos A. Flores & Oscar A. Mitnik, 2009. "Evaluating Nonexperimental Estimators for Multiple Treatments: Evidence from Experimental Data," Working Papers 2010-10, University of Miami, Department of Economics.
Jeffrey Smith & Arthur Sweetman, 2016. "Viewpoint: Estimating the causal effects of policies and programs," Canadian Journal of Economics, Canadian Economics Association, vol. 49(3), pages 871-905, August.
- Jeffrey Smith & Arthur Sweetman, 2016. "Viewpoint: Estimating the causal effects of policies and programs," Canadian Journal of Economics/Revue canadienne d'économique, John Wiley & Sons, vol. 49(3), pages 871-905, August.
- Smith, Jeffrey A. & Sweetman, Arthur, 2016. "Viewpoint: Estimating the Causal Effects of Policies and Programs," IZA Discussion Papers 10108, Institute of Labor Economics (IZA).
Ferraro, Paul J. & Miranda, Juan José, 2014. "The performance of non-experimental designs in the evaluation of environmental programs: A design-replication study using a large-scale randomized experiment as a benchmark," Journal of Economic Behavior & Organization, Elsevier, vol. 107(PA), pages 344-365.
Ben Weidmann & Luke Miratrix, 2021. "Lurking Inferential Monsters? Quantifying Selection Bias In Evaluations Of School Programs," Journal of Policy Analysis and Management, John Wiley & Sons, Ltd., vol. 40(3), pages 964-986, June.
Cousineau, Martin & Verter, Vedat & Murphy, Susan A. & Pineau, Joelle, 2023. "Estimating causal effects with optimization-based methods: A review and empirical comparison," European Journal of Operational Research, Elsevier, vol. 304(2), pages 367-380.
Jones A.M & Rice N, 2009. "Econometric Evaluation of Health Policies," Health, Econometrics and Data Group (HEDG) Working Papers 09/09, HEDG, c/o Department of Economics, University of York.
John Ammer & Sara B. Holland & David C. Smith & Francis E. Warnock, 2012. "U.S. International Equity Investment," Journal of Accounting Research, Wiley Blackwell, vol. 50(5), pages 1109-1139, December.
- John Ammer & Sara B. Holland & David C. Smith & Francis E. Warnock, 2012. "U.S. international equity investment," International Finance Discussion Papers 1044, Board of Governors of the Federal Reserve System (U.S.).
- John Ammer & Sara B. Holland & David C. Smith & Francis E. Warnock, 2012. "U.S. International Equity Investment," NBER Working Papers 17839, National Bureau of Economic Research, Inc.
Jason J. Sauppe & Sheldon H. Jacobson, 2017. "The role of covariate balance in observational studies," Naval Research Logistics (NRL), John Wiley & Sons, vol. 64(4), pages 323-344, June.
John Ammer & Sara B. Holland & David C. Smith & Francis E. Warnock, 2006. "Look at Me Now: What Attracts U.S. Shareholders?," NBER Working Papers 12500, National Bureau of Economic Research, Inc.
David M. Rindskopf & William R. Shadish & M. H. Clark, 2018. "Using Bayesian Correspondence Criteria to Compare Results From a Randomized Experiment and a Quasi-Experiment Allowing Self-Selection," Evaluation Review, , vol. 42(2), pages 248-280, April.
Fortson, Kenneth & Gleason, Philip & Kopa, Emma & Verbitsky-Savitz, Natalya, 2015. "Horseshoes, hand grenades, and treatment effects? Reassessing whether nonexperimental estimators are biased," Economics of Education Review, Elsevier, vol. 44(C), pages 100-113.
Bernard Black & Woochan Kim & Julia Nasev, 2021. "The Effect of Board Structure on Firm Disclosure and Behavior: A Case Study of Korea and a Comparison of Research Designs," Journal of Empirical Legal Studies, John Wiley & Sons, vol. 18(2), pages 328-376, June.
Katherine Baicker & Theodore Svoronos, 2019. "Testing the Validity of the Single Interrupted Time Series Design," CID Working Papers 364, Center for International Development at Harvard University.
Andrew P. Jaciw, 2016. "Applications of a Within-Study Comparison Approach for Evaluating Bias in Generalized Causal Inferences From Comparison Groups Studies," Evaluation Review, , vol. 40(3), pages 241-276, June.
Martin Cousineau & Vedat Verter & Susan A. Murphy & Joelle Pineau, 2022. "Estimating causal effects with optimization-based methods: A review and empirical comparison," Papers 2203.00097, arXiv.org.

More about this item

Statistics

Access and download statistics

Corrections

All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:laa:wpaper:09. See general information about how to correct material in RePEc.

If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form .

If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: the person in charge (email available below). General contact details of provider: http://www.almalaurea.it .

Please note that corrections may take a couple of weeks to filter through the various RePEc services.

IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.

Browse Econ Literature

More features

Propensity Score Methods for Causal Inference: On the Relative Importance of Covariate Selection, Reliable Measurement, and Choice of Propensity Score Technique

Author

Abstract

Suggested Citation

Download full text from publisher

References listed on IDEAS

Most related items

More about this item

Statistics

Corrections

More services and features

MyIDEAS

Author registration

Rankings

RePEc Genealogy

RePEc Biblio

MPRA

New papers by email

EconAcademics

Plagiarism

About RePEc

RePEc home

Blog

Help/FAQ

RePEc team

Participating archives

Privacy statement

Help us

Corrections

Volunteers

Get papers listed

Open a RePEc archive

Get RePEc data