
Invariant and Metric Free Proximities for Data Matching: An R Package

Author

Listed:
  • Iacus, Stefano
  • Porro, Giuseppe

Abstract

Data matching is a typical statistical problem in non-experimental and/or observational studies or, more generally, in cross-sectional studies in which one or more data sets are to be compared. Several methods are available in the literature, most of which are based on a particular metric or on statistical models, either parametric or nonparametric. In this paper we present two methods to calculate a proximity that is invariant under monotonic transformations. These methods require at most the notion of ordering. Open-source software in the form of an R package is also presented.
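
The invariance property claimed in the abstract can be illustrated with a small sketch. The code below is not the method of the paper or of its R package; it is a hypothetical rank-based proximity, written in Python purely for illustration, that uses only the ordering of each covariate. The function names `ranks` and `rank_proximity` and the toy data are assumptions introduced here. Because strictly increasing transformations leave ranks unchanged, the proximity matrix is the same before and after transforming the covariates.

```python
import math


def ranks(values):
    """Return 1-based ranks of `values`, using average ranks for ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Find the run of tied values starting at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def rank_proximity(columns):
    """Hypothetical proximity: 1 minus the mean normalized rank gap.

    `columns` is a list of covariate columns, each holding n >= 2 values.
    Only the ordering within each column is used, never the raw distances.
    """
    n = len(columns[0])
    ranked = [ranks(col) for col in columns]
    prox = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            gap = sum(abs(r[a] - r[b]) for r in ranked) / (len(ranked) * (n - 1))
            prox[a][b] = 1.0 - gap
    return prox


# Toy data (made up for illustration): four units, two covariates.
income = [12.0, 55.0, 30.0, 410.0]
age = [23, 45, 31, 60]

p_raw = rank_proximity([income, age])
# Apply strictly increasing transformations to each covariate.
p_transformed = rank_proximity(
    [[math.log(x) for x in income], [a ** 2 for a in age]]
)

print(p_raw == p_transformed)
```

A metric-based proximity such as one built on Euclidean distance would generally change under the log transformation of income, which is the contrast the abstract draws.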

Suggested Citation

  • Iacus, Stefano & Porro, Giuseppe, 2008. "Invariant and Metric Free Proximities for Data Matching: An R Package," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 25(i11).
  • Handle: RePEc:jss:jstsof:v:025:i11
    DOI: http://hdl.handle.net/10.18637/jss.v025.i11

    Download full text from publisher

    File URL: https://www.jstatsoft.org/index.php/jss/article/view/v025i11/v25i11.pdf
    Download Restriction: no

    File URL: https://www.jstatsoft.org/index.php/jss/article/downloadSuppFile/v025i11/rrp_2.7.tar.gz
    Download Restriction: no

    File URL: https://www.jstatsoft.org/index.php/jss/article/downloadSuppFile/v025i11/v25i11.R
    Download Restriction: no

    References listed on IDEAS

    1. LaLonde, Robert J, 1986. "Evaluating the Econometric Evaluations of Training Programs with Experimental Data," American Economic Review, American Economic Association, vol. 76(4), pages 604-620, September.
    2. Smith, Jeffrey A. & Todd, Petra E., 2005. "Does matching overcome LaLonde's critique of nonexperimental estimators?," Journal of Econometrics, Elsevier, vol. 125(1-2), pages 305-353.
    3. James J. Heckman & Hidehiko Ichimura & Petra E. Todd, 1997. "Matching As An Econometric Evaluation Estimator: Evidence from Evaluating a Job Training Programme," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 64(4), pages 605-654.
    4. Giuseppe Porro & Stefano Maria Iacus, 2009. "Random Recursive Partitioning: a matching method for the estimation of the average treatment effect," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 24(1), pages 163-185.
    5. Smith, Jeffrey & Todd, Petra, 2005. "Rejoinder," Journal of Econometrics, Elsevier, vol. 125(1-2), pages 365-375.
    6. Dehejia, Rajeev, 2005. "Practical propensity score matching: a reply to Smith and Todd," Journal of Econometrics, Elsevier, vol. 125(1-2), pages 355-364.
    7. Iacus, Stefano M. & Porro, Giuseppe, 2007. "Missing data imputation, matching and other applications of random recursive partitioning," Computational Statistics & Data Analysis, Elsevier, vol. 52(2), pages 773-789, October.
    8. James Heckman & Hidehiko Ichimura & Jeffrey Smith & Petra Todd, 1998. "Characterizing Selection Bias Using Experimental Data," Econometrica, Econometric Society, vol. 66(5), pages 1017-1098, September.
    9. Ben B. Hansen, 2004. "Full Matching in an Observational Study of Coaching for the SAT," Journal of the American Statistical Association, American Statistical Association, vol. 99, pages 609-618, January.
    10. Ho, Daniel E. & Imai, Kosuke & King, Gary & Stuart, Elizabeth A., 2007. "Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal Inference," Political Analysis, Cambridge University Press, vol. 15(3), pages 199-236, July.

    Citations

    Cited by:

    1. Doove, L.L. & Van Buuren, S. & Dusseldorp, E., 2014. "Recursive partitioning for missing data imputation in the presence of interaction effects," Computational Statistics & Data Analysis, Elsevier, vol. 72(C), pages 92-104.
    2. Humera Razzak & Christian Heumann, 2019. "Hybrid Multiple Imputation In A Large Scale Complex Survey," Statistics in Transition New Series, Polish Statistical Association, vol. 20(4), pages 33-58, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Iacus, Stefano M. & Porro, Giuseppe, 2007. "Missing data imputation, matching and other applications of random recursive partitioning," Computational Statistics & Data Analysis, Elsevier, vol. 52(2), pages 773-789, October.
    2. Dettmann, E. & Becker, C. & Schmeißer, C., 2011. "Distance functions for matching in small samples," Computational Statistics & Data Analysis, Elsevier, vol. 55(5), pages 1942-1960, May.
    3. Giuseppe Porro & Stefano Maria Iacus, 2009. "Random Recursive Partitioning: a matching method for the estimation of the average treatment effect," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 24(1), pages 163-185.
    4. Giuseppe PORRO & Stefano Maria IACUS, 2004. "Average treatment effect estimation via random recursive partitioning," Departmental Working Papers 2004-28, Department of Economics, Management and Quantitative Methods at Università degli Studi di Milano.
    5. Steven Lehrer & Gregory Kordas, 2013. "Matching using semiparametric propensity scores," Empirical Economics, Springer, vol. 44(1), pages 13-45, February.
    6. Jose C. Galdo & Jeffrey Smith & Dan Black, 2008. "Bandwidth Selection and the Estimation of Treatment Effects with Unbalanced Data," Annals of Economics and Statistics, GENES, issue 91-92, pages 189-216.
    7. Huber, Martin & Lechner, Michael & Wunsch, Conny, 2013. "The performance of estimators based on the propensity score," Journal of Econometrics, Elsevier, vol. 175(1), pages 1-21.
    8. Ravallion, Martin, 2008. "Evaluating Anti-Poverty Programs," Handbook of Development Economics, in: T. Paul Schultz & John A. Strauss (ed.), Handbook of Development Economics, edition 1, volume 4, chapter 59, pages 3787-3846, Elsevier.
    9. Kevin Arceneaux & Alan S. Gerber & Donald P. Green, 2010. "A Cautionary Note on the Use of Matching to Estimate Causal Effects: An Empirical Example Comparing Matching Estimates to an Experimental Benchmark," Sociological Methods & Research, vol. 39(2), pages 256-282, November.
    10. Flores, Carlos A. & Mitnik, Oscar A., 2009. "Evaluating Nonexperimental Estimators for Multiple Treatments: Evidence from Experimental Data," IZA Discussion Papers 4451, Institute of Labor Economics (IZA).
    11. Guido W. Imbens & Jeffrey M. Wooldridge, 2009. "Recent Developments in the Econometrics of Program Evaluation," Journal of Economic Literature, American Economic Association, vol. 47(1), pages 5-86, March.
    12. Jochen Kluve & Boris Augurzky, 2007. "Assessing the performance of matching algorithms when selection into treatment is strong," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 22(3), pages 533-557.
    13. Timothy Tyler Brown & Juan Pablo Atal, 2019. "How robust are reference pricing studies on outpatient medical procedures? Three different preprocessing techniques applied to difference‐in differences," Health Economics, John Wiley & Sons, Ltd., vol. 28(2), pages 280-298, February.
    14. Peter R. Mueser & Kenneth R. Troske & Alexey Gorislavsky, 2007. "Using State Administrative Data to Measure Program Performance," The Review of Economics and Statistics, MIT Press, vol. 89(4), pages 761-783, November.
    15. Ferraro, Paul J. & Miranda, Juan José, 2014. "The performance of non-experimental designs in the evaluation of environmental programs: A design-replication study using a large-scale randomized experiment as a benchmark," Journal of Economic Behavior & Organization, Elsevier, vol. 107(PA), pages 344-365.
    16. Kluve, Jochen & Lehmann, Hartmut & Schmidt, Christoph M., 2008. "Disentangling Treatment Effects of Active Labor Market Policies: The Role of Labor Force Status Sequences," Labour Economics, Elsevier, vol. 15(6), pages 1270-1295, December.
    17. Zhao, Zhong, 2008. "Sensitivity of propensity score methods to the specifications," Economics Letters, Elsevier, vol. 98(3), pages 309-319, March.
    18. Yonatan Eyal, 2020. "Self-Assessment Variables as a Source of Information in the Evaluation of Intervention Programs: A Theoretical and Methodological Framework," SAGE Open, vol. 10(1), pages 21582440198, January.
    19. Wichman, Casey J. & Ferraro, Paul J., 2017. "A cautionary tale on using panel data estimators to measure program impacts," Economics Letters, Elsevier, vol. 151(C), pages 82-90.
    20. Huber, Martin & Lechner, Michael & Wunsch, Conny, 2010. "How to Control for Many Covariates? Reliable Estimators Based on the Propensity Score," IZA Discussion Papers 5268, Institute of Labor Economics (IZA).
