
Combining non‐probability and probability survey samples through mass imputation

Authors

  • Jae Kwang Kim
  • Seho Park
  • Yilin Chen
  • Changbao Wu

Abstract

Analysis of non‐probability survey samples requires auxiliary information at the population level. Such information may also be obtained from an existing probability survey sample from the same finite population. Mass imputation has been used in practice for combining non‐probability and probability survey samples and making inferences on the parameters of interest using the information collected only in the non‐probability sample for the study variables. Under the assumption that the conditional mean function from the non‐probability sample can be transported to the probability sample, we establish the consistency of the mass imputation estimator and derive its asymptotic variance formula. Variance estimators are developed using either linearization or bootstrap. Finite sample performances of the mass imputation estimator are investigated through simulation studies. We also address important practical issues of the method through the analysis of a real‐world non‐probability survey sample collected by the Pew Research Centre.
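
The mass imputation idea described in the abstract can be illustrated with a minimal sketch. It makes several assumptions that are not taken from the paper: a linear working model for the conditional mean, a weighted (Hájek-type) mean as the final estimator, and a simulated population in which selection into the non-probability sample depends only on the covariate. The function name `mass_imputation_mean` and all variable names are hypothetical. The sketch does not reproduce the paper's linearization or bootstrap variance estimators.

```python
import numpy as np

def mass_imputation_mean(x_np, y_np, x_p, w_p):
    """Mass imputation estimate of a population mean (illustrative sketch).

    x_np, y_np : covariates and study variable from the non-probability sample
    x_p        : covariates from the probability sample (y is not observed there)
    w_p        : design weights of the probability sample
    """
    # Add an intercept column to both design matrices.
    X_np = np.column_stack([np.ones(len(x_np)), x_np])
    X_p = np.column_stack([np.ones(len(x_p)), x_p])
    # Step 1: fit the working model E(y | x) = x'beta on the non-probability sample.
    beta, *_ = np.linalg.lstsq(X_np, y_np, rcond=None)
    # Step 2: impute y for every unit in the probability sample.
    y_imputed = X_p @ beta
    # Step 3: combine the imputed values with the design weights.
    return np.sum(w_p * y_imputed) / np.sum(w_p)

# Simulated finite population (assumed for illustration only).
rng = np.random.default_rng(0)
N = 20_000
x = rng.normal(size=N)
y = 1.0 + 2.0 * x + rng.normal(size=N)
true_mean = y.mean()

# Non-probability sample: inclusion depends on x, so its raw mean is biased.
in_np = rng.random(N) < 1.0 / (1.0 + np.exp(2.0 - x))

# Probability sample: simple random sample of n_A units, each with weight N / n_A.
n_A = 500
idx_p = rng.choice(N, size=n_A, replace=False)
w_p = np.full(n_A, N / n_A)

naive = y[in_np].mean()
estimate = mass_imputation_mean(x[in_np], y[in_np], x[idx_p], w_p)
print(f"true {true_mean:.3f}  naive {naive:.3f}  mass imputation {estimate:.3f}")
```

Because selection in this simulation depends only on `x`, the conditional mean fitted on the non-probability sample carries over to the probability sample, which is the transportability assumption the abstract refers to. If selection also depended on `y` given `x`, the sketch would not be expected to remove the bias.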

Suggested Citation

  • Jae Kwang Kim & Seho Park & Yilin Chen & Changbao Wu, 2021. "Combining non‐probability and probability survey samples through mass imputation," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 184(3), pages 941-963, July.
  • Handle: RePEc:bla:jorssa:v:184:y:2021:i:3:p:941-963
    DOI: 10.1111/rssa.12696

    Download full text from publisher

    File URL: https://doi.org/10.1111/rssa.12696
    Download Restriction: no


    References listed on IDEAS

    1. Richard K. Crump & V. Joseph Hotz & Guido W. Imbens & Oscar A. Mitnik, 2009. "Dealing with limited overlap in estimation of average treatment effects," Biometrika, Biometrika Trust, vol. 96(1), pages 187-199.
    2. Jae Kwang Kim & J. N. K. Rao, 2012. "Combining data from two independent surveys: a model-assisted approach," Biometrika, Biometrika Trust, vol. 99(1), pages 85-100.
    3. Shu Yang & Jae Kwang Kim, 2020. "Asymptotic theory and inference of predictive mean matching imputation using a superpopulation model framework," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 47(3), pages 839-861, September.
    4. Jae Kwang Kim, 2011. "Parametric fractional imputation for missing data analysis," Biometrika, Biometrika Trust, vol. 98(1), pages 119-132.

    Citations


    Cited by:

    1. Ieva Burakauskaitė & Andrius Čiginas, 2023. "An Approach to Integrating a Non-Probability Sample in the Population Census," Mathematics, MDPI, vol. 11(8), pages 1-14, April.
    2. Sixia Chen & Alexandra May Woodruff & Janis Campbell & Sara Vesely & Zheng Xu & Cuyler Snider, 2023. "Combining Probability and Nonprobability Samples by Using Multivariate Mass Imputation Approaches with Application to Biomedical Research," Stats, MDPI, vol. 6(2), pages 1-9, May.
    3. Chien-Min Huang & F. Jay Breidt, 2023. "A dual-frame approach for estimation with respondent-driven samples," METRON, Springer;Sapienza Università di Roma, vol. 81(1), pages 65-81, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Seho Park & Jae Kwang Kim & Diana Stukel, 2017. "A measurement error model approach to survey data integration: combining information from two surveys," METRON, Springer;Sapienza Università di Roma, vol. 75(3), pages 345-357, December.
    2. Gonzalez, Felipe & Prem, Mounu & von Dessauer, Cristine, 2023. "Empowerment or Indoctrination? Women Centers Under Dictatorship," SocArXiv 64mf9, Center for Open Science.
    3. Shu Yang & Jae Kwang Kim, 2016. "Likelihood-based Inference with Missing Data Under Missing-at-Random," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 43(2), pages 436-454, June.
    4. Dettmann, E. & Becker, C. & Schmeißer, C., 2011. "Distance functions for matching in small samples," Computational Statistics & Data Analysis, Elsevier, vol. 55(5), pages 1942-1960, May.
    5. Chen, Sixia & Haziza, David, 2023. "A unified framework of multiply robust estimation approaches for handling incomplete data," Computational Statistics & Data Analysis, Elsevier, vol. 179(C).
    6. Kitagawa, Toru & Muris, Chris, 2016. "Model averaging in semiparametric estimation of treatment effects," Journal of Econometrics, Elsevier, vol. 193(1), pages 271-289.
    7. Futoshi Yamauchi & Yanyan Liu, 2013. "Impacts of an Early Stage Education Intervention on Students' Learning Achievement: Evidence from the Philippines," Journal of Development Studies, Taylor & Francis Journals, vol. 49(2), pages 208-222, February.
    8. Susan Athey & Guido W. Imbens & Stefan Wager, 2018. "Approximate residual balancing: debiased inference of average treatment effects in high dimensions," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 80(4), pages 597-623, September.
    9. Noémi Kreif & Richard Grieve & M. Zia Sadique, 2013. "Statistical Methods For Cost‐Effectiveness Analyses That Use Observational Data: A Critical Appraisal Tool And Review Of Current Practice," Health Economics, John Wiley & Sons, Ltd., vol. 22(4), pages 486-500, April.
    10. Kristiina Huttunen & Jarle Møen & Kjell G. Salvanes, 2018. "Job Loss and Regional Mobility," Journal of Labor Economics, University of Chicago Press, vol. 36(2), pages 479-509.
    11. Wagener, Andreas & Zenker, Juliane, 2018. "Decoupled but not neutral: The effects of stochastic transfers on investment and incomes in rural Thailand," TVSEP Working Papers wp-008, Leibniz Universitaet Hannover, Institute of Development and Agricultural Economics, Project TVSEP.
    12. Sallin, Aurelién, 2021. "Estimating returns to special education: combining machine learning and text analysis to address confounding," Economics Working Paper Series 2109, University of St. Gallen, School of Economics and Political Science.
    13. Takahashi, Ryo, 2021. "How to stimulate environmentally friendly consumption: Evidence from a nationwide social experiment in Japan to promote eco-friendly coffee," Ecological Economics, Elsevier, vol. 186(C).
    14. Marco Caliendo & Stefan Tübbicke, 2020. "New evidence on long-term effects of start-up subsidies: matching estimates and their robustness," Empirical Economics, Springer, vol. 59(4), pages 1605-1631, October.
    15. Victor Chernozhukov & Iván Fernández‐Val & Blaise Melly, 2013. "Inference on Counterfactual Distributions," Econometrica, Econometric Society, vol. 81(6), pages 2205-2268, November.
    16. Bodory, Hugo & Huber, Martin, 2018. "The causalweight package for causal inference in R," FSES Working Papers 493, Faculty of Economics and Social Sciences, University of Freiburg/Fribourg Switzerland.
    17. Caloffi, Annalisa & Freo, Marzia & Ghinoi, Stefano & Mariani, Marco & Rossi, Federica, 2022. "Assessing the effects of a deliberate policy mix: The case of technology and innovation advisory services and innovation vouchers," Research Policy, Elsevier, vol. 51(6).
    18. Tymon Słoczyński, 2015. "The Oaxaca–Blinder Unexplained Component as a Treatment Effects Estimator," Oxford Bulletin of Economics and Statistics, Department of Economics, University of Oxford, vol. 77(4), pages 588-604, August.
    19. Mellace, Giovanni & Ventura, Marco, 2019. "Intended and unintended effects of public incentives for innovation. Quasi-experimental evidence from Italy," Discussion Papers on Economics 9/2019, University of Southern Denmark, Department of Economics.
    20. Janet Currie & Reed Walker, 2011. "Traffic Congestion and Infant Health: Evidence from E-ZPass," American Economic Journal: Applied Economics, American Economic Association, vol. 3(1), pages 65-90, January.
