Implementing Rubin's alternative multiple-imputation method for statistical matching in Stata

Author

Listed:
  • Anil Alpman

    (Paris School of Economics)

Abstract

This article introduces two new commands, smpc and smmatch, that implement the statistical matching procedure proposed by Rubin (1986, Journal of Business and Economic Statistics 4: 87–94). The purpose of statistical matching in Rubin’s procedure is to generate a single dataset from various datasets, where each dataset contains a specific variable of interest and all contain some variables in common. For two variables of interest that are not observed jointly for any unit, smpc generates the predicted values of each as a function of the other variable of interest and a set of control variables by assuming a partial correlation value (defined by the user) between the two variables of interest (other statistical matching procedures assume that they are conditionally independent given the control variables). The smmatch command, on the other hand, matches observations of different datasets according to their predicted values (using a minimum distance criterion) conditional on a set of control variables, and it imputes the observed value of the match for the missing. Copyright 2016 by StataCorp LP.
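
The two steps described in the abstract can be illustrated with a minimal Python sketch. It is not the code of smpc or smmatch, and every function name in it is hypothetical. It assumes a single control variable x, a file A that observes (x, y), a file B that observes (x, z), and a user-supplied partial correlation rho between y and z given x. The prediction step relies on the standard result that, in a regression of one variable on x and the other variable, the coefficient on the other variable equals rho times the ratio of the two residual standard deviations. Scoring donors by plugging in their predicted y is an assumption of this sketch, not a detail taken from the article.

```python
from statistics import mean


def ols_simple(x, y):
    """One-regressor OLS: return (intercept, slope, residual_sd). Needs len(x) > 2."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    ss_res = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, (ss_res / (len(x) - 2)) ** 0.5


def predict(x, other, fit_target, fit_other, rho):
    """Predict the target variable from x and the other variable of interest.

    rho is the ASSUMED partial correlation between the two variables given x.
    The residual standard deviation of the other variable must be nonzero.
    """
    a_t, b_t, s_t = fit_target
    a_o, b_o, s_o = fit_other
    gamma = rho * s_t / s_o  # implied coefficient on the other variable
    return a_t + b_t * x + gamma * (other - (a_o + b_o * x))


def match_and_impute(recipient_scores, donor_scores, donor_values):
    """For each recipient, copy the observed value of the closest donor."""
    imputed = []
    for score in recipient_scores:
        j = min(range(len(donor_scores)), key=lambda k: abs(donor_scores[k] - score))
        imputed.append(donor_values[j])
    return imputed


def impute_z(file_a, file_b, rho):
    """Impute z for file A units (x, y) using file B units (x, z) as donors."""
    xa, ya = zip(*file_a)
    xb, zb = zip(*file_b)
    fit_y = ols_simple(xa, ya)  # y on x, estimated in file A
    fit_z = ols_simple(xb, zb)  # z on x, estimated in file B
    # Recipients: predicted z from x and the observed y.
    recipient_scores = [predict(x, y, fit_z, fit_y, rho) for x, y in file_a]
    # Donors (sketch assumption): predicted z from x and the PREDICTED y.
    donor_scores = []
    for x, z in file_b:
        y_hat = predict(x, z, fit_y, fit_z, rho)
        donor_scores.append(predict(x, y_hat, fit_z, fit_y, rho))
    return match_and_impute(recipient_scores, donor_scores, zb)
```

Calling impute_z(file_a, file_b, rho) with rho = 0 reproduces matching under conditional independence, because the predictions then depend on x only. Rerunning it over a range of rho values shows how sensitive the fused dataset is to the assumed partial correlation. The sketch does not restrict matches to units that share control-variable values, which the abstract describes as part of smmatch.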

Suggested Citation

  • Anil Alpman, 2016. "Implementing Rubin's alternative multiple-imputation method for statistical matching in Stata," Stata Journal, StataCorp LP, vol. 16(3), pages 717-739, September.
  • Handle: RePEc:tsj:stataj:v:16:y:2016:i:3:p:717-739
    Note: to access software from within Stata, net describe http://www.stata-journal.com/software/sj16-3/st0452/

    Download full text from publisher

    File URL: http://www.stata-journal.com/article.html?article=st0452
    File Function: link to article purchase
    Download Restriction: no

    References listed on IDEAS

    1. Rubin, Donald B, 1986. "Statistical Matching Using File Concatenation with Adjusted Weights and Multiple Imputations," Journal of Business & Economic Statistics, American Statistical Association, vol. 4(1), pages 87-94, January.

    Citations

    Cited by:

    1. François Gardes, 2021. "On the value of time and human life," Documents de travail du Centre d'Economie de la Sorbonne 21023, Université Panthéon-Sorbonne (Paris 1), Centre d'Economie de la Sorbonne.
    2. François Gardes, 2021. "Endogenous Prices in a Riemannian Geometry Framework," Université Paris1 Panthéon-Sorbonne (Post-Print and Working Papers) halshs-03325414, HAL.
    3. Armagan Tuna Aktuna-Gunes & Okay Gunes, 2017. "Measuring the Relative Domestic Production Scarcity of Time Spent in Domestic Activities for Turkey," Université Paris1 Panthéon-Sorbonne (Post-Print and Working Papers) halshs-01491982, HAL.
    4. François Gardes, 2021. "An Austrian Trade Cycle model with an Endogenous Value of Time," Documents de travail du Centre d'Economie de la Sorbonne 21025, Université Panthéon-Sorbonne (Paris 1), Centre d'Economie de la Sorbonne.
    5. Okay Gunes, 2017. "Analysis of Households' Decision Using Full Demand Elasticity Estimates: an Estimation on Turkish Data," Post-Print halshs-01491970, HAL.
    6. François Gardes, 2021. "Endogenous Prices in a Riemannian Geometry Framework," Post-Print halshs-03325414, HAL.
    7. François Gardes, 2018. "On the value of time and human life," Documents de travail du Centre d'Economie de la Sorbonne 18028, Université Panthéon-Sorbonne (Paris 1), Centre d'Economie de la Sorbonne.
    8. François Gardes, 2021. "A Solution to the Estimation of an Enlarged GDP Including Domestic Production: An Estimation on Micro Data," Université Paris1 Panthéon-Sorbonne (Post-Print and Working Papers) halshs-03325362, HAL.
    9. Anil Alpman & François Gardes, 2016. "Welfare Analysis of the Allocation of Time During the Great Recession," Université Paris1 Panthéon-Sorbonne (Post-Print and Working Papers) halshs-01159507, HAL.
    10. François Gardes, 2021. "A Solution to the Estimation of an Enlarged GDP Including Domestic Production: An Estimation on Micro Data," Post-Print halshs-03325362, HAL.
    11. François Gardes, 2018. "On the value of time and human life," Post-Print halshs-01903596, HAL.
    12. François Gardes, 2021. "A Solution to the estimation of an Enlarged GDP Including Domestic Production: An Estimation on Micro Data," Documents de travail du Centre d'Economie de la Sorbonne 21024, Université Panthéon-Sorbonne (Paris 1), Centre d'Economie de la Sorbonne.
    13. François Gardes, 2021. "An Austrian Trade Cycle model with an Endogenous Value of Time," Post-Print halshs-03325379, HAL.
    14. François Gardes, 2021. "On the value of time and human life," Post-Print halshs-03325332, HAL.
    15. Okay Gunes, 2017. "Analysis of Households' Decision Using Full Demand Elasticity Estimates: an Estimation on Turkish Data," Documents de travail du Centre d'Economie de la Sorbonne 17017, Université Panthéon-Sorbonne (Paris 1), Centre d'Economie de la Sorbonne.
    16. Armagan Tuna Aktuna-Gunes & Okay Gunes, 2017. "Measuring the Relative Domestic Production Scarcity of Time Spent in Domestic Activities for Turkey," Documents de travail du Centre d'Economie de la Sorbonne 17018, Université Panthéon-Sorbonne (Paris 1), Centre d'Economie de la Sorbonne.
    17. François Gardes, 2021. "An Austrian Trade Cycle model with an Endogenous Value of Time," Université Paris1 Panthéon-Sorbonne (Post-Print and Working Papers) halshs-03325379, HAL.
    18. Anil Alpman & François Gardes, 2016. "Welfare Analysis of the Allocation of Time During the Great Recession," Post-Print halshs-01159507, HAL.
    19. François Gardes, 2021. "Endogenous Prices in a Riemannian Geometry Framework," Documents de travail du Centre d'Economie de la Sorbonne 21026, Université Panthéon-Sorbonne (Paris 1), Centre d'Economie de la Sorbonne.
    20. François Gardes, 2021. "On the value of time and human life," Université Paris1 Panthéon-Sorbonne (Post-Print and Working Papers) halshs-03325332, HAL.
    21. François Gardes, 2018. "On the value of time and human life," Université Paris1 Panthéon-Sorbonne (Post-Print and Working Papers) halshs-01903596, HAL.
    22. Okay Gunes, 2017. "Analysis of Households' Decision Using Full Demand Elasticity Estimates: an Estimation on Turkish Data," Université Paris1 Panthéon-Sorbonne (Post-Print and Working Papers) halshs-01491970, HAL.
    23. Anil Alpman & François Gardes, 2015. "Welfare Analysis of the Allocation of Time During the Great Recession," Documents de travail du Centre d'Economie de la Sorbonne 15012, Université Panthéon-Sorbonne (Paris 1), Centre d'Economie de la Sorbonne, revised Mar 2016.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. François Gardes, 2021. "A Solution to the Estimation of an Enlarged GDP Including Domestic Production: An Estimation on Micro Data," Post-Print halshs-03325362, HAL.
    2. Joost Ginkel & Pieter Kroonenberg, 2014. "Using Generalized Procrustes Analysis for Multiple Imputation in Principal Component Analysis," Journal of Classification, Springer;The Classification Society, vol. 31(2), pages 242-269, July.
    3. Peter van de Ven & Anne Harrison & Barbara Fraumeni & Dennis Fixler & David Johnson & Andrew Craig & Kevin Furlong, 2017. "A Consistent Data Series to Evaluate Growth and Inequality in the National Accounts," Review of Income and Wealth, International Association for Research in Income and Wealth, vol. 63, pages 437-459, December.
    4. Chenyang Gu & Roee Gutman, 2017. "Combining item response theory with multiple imputation to equate health assessment questionnaires," Biometrics, The International Biometric Society, vol. 73(3), pages 990-998, September.
    5. Michael S. Rendall & Bonnie Ghosh-Dastidar & Margaret M. Weden & Zafar Nazarov, 2011. "Multiple Imputation for Combined-Survey Estimation With Incomplete Regressors In One But Not Both Surveys," Working Papers WR-887-1, RAND Corporation.
    6. Joost R. Ginkel, 2020. "Standardized Regression Coefficients and Newly Proposed Estimators for $R^2$ in Multiply Imputed Data," Psychometrika, Springer;The Psychometric Society, vol. 85(1), pages 185-205, March.
    7. Anil Alpman, 2015. "Implementing Rubin's Alternative Multiple Imputation Method for Statistical Matching in Stata," Université Paris1 Panthéon-Sorbonne (Post-Print and Working Papers) hal-01159191, HAL.
    8. Rässler, Susanne & Schnell, Rainer, 2004. "Multiple imputation for unit-nonresponse versus weighting including a comparison with a nonresponse follow-up study," Discussion Papers 65/2004, Friedrich-Alexander University Erlangen-Nuremberg, Chair of Statistics and Econometrics.
    9. Arif Mamun & Ankita Patnaik & Michael Levere & Gina Livermore & Todd Honeycutt & Jacqueline Kauff & Karen Katz & AnnaMaria McCutcheon & Joseph Mastrianni & Brittney Gionfriddo, "undated". "Promoting Readiness of Minors in Supplemental Security Income (PROMISE): Technical Appendix to the Interim Services and Impact Report," Mathematica Policy Research Reports 24c37444a21d4046abb21395a, Mathematica Policy Research.
    10. Wu, Meng-Wen & Shen, Chung-Hua, 2013. "Corporate social responsibility in the banking industry: Motives and financial performance," Journal of Banking & Finance, Elsevier, vol. 37(9), pages 3529-3547.
    11. Hao Dong & Daniel L. Millimet, 2020. "Propensity Score Weighting with Mismeasured Covariates: An Application to Two Financial Literacy Interventions," JRFM, MDPI, vol. 13(11), pages 1-24, November.
    12. Gessendorfer Jonathan & Beste Jonas & Drechsler Jörg & Sakshaug Joseph W., 2018. "Statistical Matching as a Supplement to Record Linkage: A Valuable Method to Tackle Nonconsent Bias?," Journal of Official Statistics, Sciendo, vol. 34(4), pages 909-933, December.
    13. Chia-Ning Wang & Roderick Little & Bin Nan & Siobán D. Harlow, 2011. "A Hot-Deck Multiple Imputation Procedure for Gaps in Longitudinal Recurrent Event Histories," Biometrics, The International Biometric Society, vol. 67(4), pages 1573-1582, December.
    14. Pier Luigi Conti & Daniela Marella & Andrea Neri, 2017. "Statistical matching and uncertainty analysis in combining household income and expenditure data," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 26(3), pages 485-505, August.
    15. Andrea Cutillo & Mauro Scanu, 2020. "A Mixed Approach for Data Fusion of HBS and SILC," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 150(2), pages 411-437, July.
    16. Brownstone, David, 1997. "Multiple Imputation Methodology for Missing Data, Non-Random Response, and Panel Attrition," University of California Transportation Center, Working Papers qt2zd6w6hh, University of California Transportation Center.
    17. Lamarche, Pierre, 2017. "Estimating consumption in the HFCS: Experimental results on the first wave of the HFCS," Statistics Paper Series 22, European Central Bank.
    18. Keane, Michael & Stavrunova, Olena, 2016. "Adverse selection, moral hazard and the demand for Medigap insurance," Journal of Econometrics, Elsevier, vol. 190(1), pages 62-78.
    19. Schenker, Nathaniel & Taylor, Jeremy M. G., 1996. "Partially parametric techniques for multiple imputation," Computational Statistics & Data Analysis, Elsevier, vol. 22(4), pages 425-446, August.
    20. Westermeier, Christian & Grabka, Markus M., 2016. "Longitudinal Wealth Data and Multiple Imputation: An Evaluation Study," EconStor Open Access Articles and Book Chapters, ZBW - Leibniz Information Centre for Economics, pages 237-252.

    More about this item

    Keywords

    smmatch; smpc; data combination; missing data; multiple imputation; statistical matching;

    JEL classification:

    • C10 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - General
    • C39 - Mathematical and Quantitative Methods - - Multiple or Simultaneous Equation Models; Multiple Variables - - - Other
    • C53 - Mathematical and Quantitative Methods - - Econometric Modeling - - - Forecasting and Prediction Models; Simulation Methods
