Printed from https://ideas.repec.org/a/bla/jorssa/v185y2022i4p1903-1930.html

Nearest neighbour ratio imputation with incomplete multinomial outcome in survey sampling

Author

Listed:
  • Chenyin Gao
  • Katherine Jenny Thompson
  • Jae Kwang Kim
  • Shu Yang

Abstract

Nonresponse is a common problem in survey sampling. Appropriate treatment can be challenging, especially when dealing with detailed breakdowns of totals. Often, the nearest neighbour imputation method is used to handle such incomplete multinomial data. In this article, we investigate the nearest neighbour ratio imputation (NNRI) estimator, in which auxiliary variables are used to identify the closest donor and the vector of proportions from the donor is applied to the total of the recipient to implement ratio imputation. To estimate the asymptotic variance, we first treat the NNRI as a special case of predictive matching imputation and build on earlier work to linearize the imputed estimate. To account for the non‐negligible sampling fractions, parametric and generalized additive models are employed to incorporate the smoothness of the imputation estimator, which results in a valid variance estimator. We apply the proposed method to estimate expenditures detail items based on empirical data from the 2018 collection of the Service Annual Survey, conducted by the United States Census Bureau. Our simulation results demonstrate the validity of our proposed estimators and also confirm that the derived variance estimators have good performance even when the sampling fraction is non‐negligible.
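
The imputation step described in the abstract can be illustrated with a minimal sketch. It assumes Euclidean distance on the auxiliary variables, a single nearest donor, and strictly positive donor totals. The function name nnri_impute and the toy data are hypothetical and not taken from the paper; survey weights and the variance estimator are not covered.

```python
import numpy as np


def nnri_impute(x, totals, details, responded):
    """Nearest neighbour ratio imputation (illustrative sketch only).

    x         : (n, p) auxiliary variables, observed for every unit
    totals    : (n,)   totals, observed for every unit
    details   : (n, k) detail items; rows of nonrespondents are ignored
    responded : (n,)   boolean, True where the detail items were reported
    """
    x = np.asarray(x, dtype=float)
    totals = np.asarray(totals, dtype=float)
    responded = np.asarray(responded, dtype=bool)
    imputed = np.array(details, dtype=float)  # work on a copy

    donors = np.flatnonzero(responded)
    recipients = np.flatnonzero(~responded)

    # Each donor's vector of proportions (detail items divided by its total).
    # Assumption: donor totals are strictly positive.
    donor_props = imputed[donors] / totals[donors, None]

    # Euclidean distance between every recipient and every donor.
    diff = x[recipients][:, None, :] - x[donors][None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    nearest = dist.argmin(axis=1)  # index into `donors` for each recipient

    # Apply the nearest donor's proportions to the recipient's own total.
    imputed[recipients] = donor_props[nearest] * totals[recipients, None]
    return imputed


# Toy data (hypothetical): three respondents, two nonrespondents.
x = [[1.0], [2.0], [10.0], [1.2], [9.0]]
totals = [100.0, 200.0, 50.0, 80.0, 40.0]
details = [
    [60.0, 40.0],
    [50.0, 150.0],
    [10.0, 40.0],
    [np.nan, np.nan],
    [np.nan, np.nan],
]
responded = [True, True, True, False, False]

result = nnri_impute(x, totals, details, responded)
print(result)
```

Because the donor's proportions sum to one, each imputed row adds up to the recipient's reported total, which is the property that makes ratio imputation suitable for detailed breakdowns of totals.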

Suggested Citation

  • Chenyin Gao & Katherine Jenny Thompson & Jae Kwang Kim & Shu Yang, 2022. "Nearest neighbour ratio imputation with incomplete multinomial outcome in survey sampling," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 185(4), pages 1903-1930, October.
  • Handle: RePEc:bla:jorssa:v:185:y:2022:i:4:p:1903-1930
    DOI: 10.1111/rssa.12841

    Download full text from publisher

    File URL: https://doi.org/10.1111/rssa.12841
    Download Restriction: no


    References listed on IDEAS

    1. Simon N. Wood, 2003. "Thin plate regression splines," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 65(1), pages 95-114, February.
    2. Shu Yang & Jae Kwang Kim, 2020. "Asymptotic theory and inference of predictive mean matching imputation using a superpopulation model framework," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics; Finnish Statistical Society; Norwegian Statistical Association; Swedish Statistical Association, vol. 47(3), pages 839-861, September.
    3. repec:mpr:mprres:4937 is not listed on IDEAS
    4. repec:mpr:mprres:4780 is not listed on IDEAS
    5. Ben B. Hansen, 2008. "The prognostic analogue of the propensity score," Biometrika, Biometrika Trust, vol. 95(2), pages 481-488.
    6. Simon N. Wood & Natalya Pya & Benjamin Säfken, 2016. "Smoothing Parameter and Model Selection for General Smooth Models," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(516), pages 1548-1563, October.
    7. Alberto Abadie & Guido W. Imbens, 2006. "Large Sample Properties of Matching Estimators for Average Treatment Effects," Econometrica, Econometric Society, vol. 74(1), pages 235-267, January.
    8. Shu Yang & Jae Kwang Kim, 2019. "Nearest Neighbor Imputation for General Parameter Estimation in Survey Sampling," Advances in Econometrics, in: The Econometrics of Complex Survey Data, volume 39, pages 209-234, Emerald Group Publishing Limited.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Shu Yang & Yunshu Zhang, 2023. "Multiply robust matching estimators of average and quantile treatment effects," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics; Finnish Statistical Society; Norwegian Statistical Association; Swedish Statistical Association, vol. 50(1), pages 235-265, March.
    2. Georgios Gioldasis & Antonio Musolesi & Michel Simioni, 2020. "Model uncertainty, nonlinearities and out-of-sample comparison: evidence from international technology diffusion," Working Papers hal-02790523, HAL.
    3. E. Zanini & E. Eastoe & M. J. Jones & D. Randell & P. Jonathan, 2020. "Flexible covariate representations for extremes," Environmetrics, John Wiley & Sons, Ltd., vol. 31(5), August.
    4. Massimiliano Mazzanti & Antonio Musolesi, 2020. "Modeling Green Knowledge Production and Environmental Policies with Semiparametric Panel Data Regression models," SEEDS Working Papers 1420, SEEDS, Sustainability Environmental Economics and Dynamics Studies, revised Sep 2020.
    5. Gioldasis, Georgios & Musolesi, Antonio & Simioni, Michel, 2023. "Interactive R&D spillovers: An estimation strategy based on forecasting-driven model selection," International Journal of Forecasting, Elsevier, vol. 39(1), pages 144-169.
    6. Sun, Tianyu & Chand, Satish & Sharpe, Keiran, 2018. "Effect of aging on housing prices: evidence from a panel data," MPRA Paper 94418, University Library of Munich, Germany, revised 01 Mar 2019.
    7. Aloyce R. Kaliba & Anne G. Gongwe & Kizito Mazvimavi & Ashagre Yigletu, 2021. "Impact of Adopting Improved Seeds on Access to Broader Food Groups Among Small-Scale Sorghum Producers in Tanzania," SAGE Open, vol. 11(1), pages 21582440209, January.
    8. Kneib, Thomas & Silbersdorff, Alexander & Säfken, Benjamin, 2023. "Rage Against the Mean – A Review of Distributional Regression Approaches," Econometrics and Statistics, Elsevier, vol. 26(C), pages 99-123.
    9. Roland R. Ramsahai, 2018. "Defining and estimating stochastic rate change in a dynamic general insurance portfolio," Papers 1810.10970, arXiv.org.
    10. Øystein Sørensen & Anders M. Fjell & Kristine B. Walhovd, 2023. "Longitudinal Modeling of Age-Dependent Latent Traits with Generalized Additive Latent and Mixed Models," Psychometrika, Springer; The Psychometric Society, vol. 88(2), pages 456-486, June.
    11. Cornelius Fritz & Göran Kauermann, 2022. "On the interplay of regional mobility, social connectedness and the spread of COVID‐19 in Germany," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 185(1), pages 400-424, January.
    12. Samuel D. Pimentel & Lauren Vollmer Forrow & Jonathan Gellar & Jiaqi Li, 2020. "Optimal matching approaches in health policy evaluations under rolling enrolment," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 183(4), pages 1411-1435, October.
    13. Joseph Antonelli & Matthew Cefalu & Nathan Palmer & Denis Agniel, 2018. "Doubly robust matching estimators for high dimensional confounding adjustment," Biometrics, The International Biometric Society, vol. 74(4), pages 1171-1179, December.
    14. de Luna, Xavier & Johansson, Per & Sjöstedt-de Luna, Sara, 2010. "Bootstrap Inference for K-Nearest Neighbour Matching Estimators," IZA Discussion Papers 5361, Institute of Labor Economics (IZA).
    15. Adam C. Sales & Ben B. Hansen & Brian Rowan, 2018. "Rebar: Reinforcing a Matching Estimator With Predictions From High-Dimensional Covariates," Journal of Educational and Behavioral Statistics, vol. 43(1), pages 3-31, February.
    16. Georgios Gioldasis & Antonio Musolesi & Michel Simioni, 2020. "Model uncertainty, nonlinearities and out-of-sample comparison: evidence from international technology diffusion," SEEDS Working Papers 0120, SEEDS, Sustainability Environmental Economics and Dynamics Studies, revised Jan 2020.
    17. Harsh Parikh & Cynthia Rudin & Alexander Volfovsky, 2018. "MALTS: Matching After Learning to Stretch," Papers 1811.07415, arXiv.org, revised Jun 2023.
    18. Susan Athey & Raj Chetty & Guido Imbens & Hyunseung Kang, 2016. "Estimating Treatment Effects using Multiple Surrogates: The Role of the Surrogate Score and the Surrogate Index," Papers 1603.09326, arXiv.org, revised Apr 2024.
    19. Georgios Gioldasis & Antonio Musolesi & Michel Simioni, 2021. "Interactive R&D Spillovers: An estimation strategy based on forecasting-driven model selection," SEEDS Working Papers 0621, SEEDS, Sustainability Environmental Economics and Dynamics Studies, revised Jun 2021.
    20. Amara-Ouali, Yvenn & Fasiolo, Matteo & Goude, Yannig & Yan, Hui, 2023. "Daily peak electrical load forecasting with a multi-resolution approach," International Journal of Forecasting, Elsevier, vol. 39(3), pages 1272-1286.
