How close is close enough? Evaluating propensity score matching using data from a class size reduction experiment

Author

Listed:

Elizabeth Ty Wilde
(Princeton University)
Robinson Hollister
(Swarthmore College)

Abstract

In recent years, propensity score matching (PSM) has gained attention as a potential method for estimating the impact of public policy programs in the absence of experimental evaluations. In this study, we evaluate the usefulness of PSM for estimating the impact of a program change in an educational context (Tennessee's Student Teacher Achievement Ratio Project [Project STAR]). Because Tennessee's Project STAR experiment involved an effective random assignment procedure, the experimental results from this policy intervention can be used as a benchmark, to which we compare the impact estimates produced using propensity score matching methods. We use several different methods to assess these nonexperimental estimates of the impact of the program. We try to determine “how close is close enough,” putting greatest emphasis on the question: Would the nonexperimental estimate have led to the wrong decision when compared to the experimental estimate of the program? We find that propensity score methods perform poorly with respect to measuring the impact of a reduction in class size on achievement test scores. We conclude that further research is needed before policymakers rely on PSM as an evaluation tool. © 2007 by the Association for Public Policy Analysis and Management
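
The comparison described in the abstract, an experimental benchmark set against a propensity-score-matched estimate, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic, the single covariate `x`, the true effect `tau`, the sample sizes, and the helper names (`fit_propensity`, `matched_att`, `simulate`) are hypothetical, and the matching rule (one nearest neighbour, with replacement) is not taken from the paper. The toy setup also has selection on one observed covariate by construction, so matching does well here; the study reports that this did not hold for the Project STAR data.

```python
import bisect
import math
import random
from statistics import fmean


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_propensity(xs, ts, steps=500, lr=0.5):
    """Fit P(t=1 | x) = sigmoid(a + b*x) by gradient ascent on the log-likelihood."""
    a = b = 0.0
    n = len(xs)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x, t in zip(xs, ts):
            resid = t - sigmoid(a + b * x)
            grad_a += resid
            grad_b += resid * x
        a += lr * grad_a / n
        b += lr * grad_b / n
    return a, b


def matched_att(treated, comparison):
    """Average treated-minus-match outcome gap, using 1-nearest-neighbour
    matching with replacement. Inputs are lists of (score, outcome) pairs."""
    pool = sorted(comparison)
    scores = [s for s, _ in pool]
    diffs = []
    for s, y in treated:
        i = bisect.bisect_left(scores, s)
        # The closest score is either just below or just above position i.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(pool)]
        j = min(candidates, key=lambda k: abs(scores[k] - s))
        diffs.append(y - pool[j][1])
    return fmean(diffs)


def simulate(seed=0, tau=5.0):
    """Synthetic within-study comparison (all numbers are assumptions)."""
    rng = random.Random(seed)

    def draw(n, mu, t):
        rows = []
        for _ in range(n):
            x = rng.gauss(mu, 1.0)
            rows.append((x, 2.0 * x + tau * t + rng.gauss(0.0, 1.0)))
        return rows

    treated = draw(500, 1.0, 1)      # experimental treatment group
    exp_control = draw(500, 1.0, 0)  # randomized control group
    outside = draw(2000, 0.0, 0)     # nonexperimental comparison group

    y_treated = [y for _, y in treated]
    experimental = fmean(y_treated) - fmean(y for _, y in exp_control)
    naive = fmean(y_treated) - fmean(y for _, y in outside)

    # Propensity model: treated units versus the outside comparison group.
    xs = [x for x, _ in treated + outside]
    ts = [1] * len(treated) + [0] * len(outside)
    a, b = fit_propensity(xs, ts)

    def score(rows):
        return [(sigmoid(a + b * x), y) for x, y in rows]

    psm = matched_att(score(treated), score(outside))
    return {"experimental": experimental, "naive": naive, "psm": psm}


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.2f}")
```

In this sketch the question "how close is close enough" amounts to comparing the `psm` entry with the `experimental` entry. A real application would also need standard errors and a decision rule, neither of which is attempted here.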

Suggested Citation

Elizabeth Ty Wilde & Robinson Hollister, 2007. "How close is close enough? Evaluating propensity score matching using data from a class size reduction experiment," Journal of Policy Analysis and Management, John Wiley & Sons, Ltd., vol. 26(3), pages 455-477.

Handle: RePEc:wly:jpamgt:v:26:y:2007:i:3:p:455-477
DOI: 10.1002/pam.20262
