Printed from https://ideas.repec.org/a/spr/jagbes/v23y2018i2d10.1007_s13253-018-0320-2.html

Combining Survey and Non-survey Data for Improved Sub-area Prediction Using a Multi-level Model

Author

Listed:
  • Jae Kwang Kim

    (Iowa State University)

  • Zhonglei Wang

    (Iowa State University)

  • Zhengyuan Zhu

    (Iowa State University)

  • Nathan B. Cruze

    (United States Department of Agriculture)

Abstract

Combining information from different sources is an important practical problem in survey sampling. Using a hierarchical area-level model, we establish a framework to integrate auxiliary information to improve state-level area estimates. The best predictors are obtained by the conditional expectations of latent variables given observations, and an estimate of the mean squared prediction error is discussed. Sponsored by the National Agricultural Statistics Service of the US Department of Agriculture, the proposed model is applied to the planted crop acreage estimation problem by combining information from three sources, including the June Area Survey obtained by a probability-based sampling of lands, administrative data about the planted acreage and the cropland data layer, which is a commodity-specific classification product derived from remote sensing data. The proposed model combines the available information at a sub-state level called the agricultural statistics district and aggregates to improve state-level estimates of planted acreages for different crops. Supplementary materials accompanying this paper appear on-line.

Suggested Citation

  • Jae Kwang Kim & Zhonglei Wang & Zhengyuan Zhu & Nathan B. Cruze, 2018. "Combining Survey and Non-survey Data for Improved Sub-area Prediction Using a Multi-level Model," Journal of Agricultural, Biological and Environmental Statistics, Springer;The International Biometric Society;American Statistical Association, vol. 23(2), pages 175-189, June.
  • Handle: RePEc:spr:jagbes:v:23:y:2018:i:2:d:10.1007_s13253-018-0320-2
    DOI: 10.1007/s13253-018-0320-2

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s13253-018-0320-2
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    References listed on IDEAS

    1. Giancarlo Manzi & David J. Spiegelhalter & Rebecca M. Turner & Julian Flowers & Simon G. Thompson, 2011. "Modelling bias in combining small area prevalence estimates from multiple surveys," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 174(1), pages 31-50, January.
    2. Jae Kwang Kim & J. N. K. Rao, 2012. "Combining data from two independent surveys: a model-assisted approach," Biometrika, Biometrika Trust, vol. 99(1), pages 85-100.
    3. Takis Merkouris, 2010. "Combining information from multiple surveys by using regression for efficient small domain estimation," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 72(1), pages 27-48, January.
    4. Siu-Ming Tam & Frederic Clarke, 2015. "Big Data, Official Statistics and Some Initiatives by the Australian Bureau of Statistics," International Statistical Review, International Statistical Institute, vol. 83(3), pages 436-448, December.
    5. Changbao Wu & Wilson W. Lu, 2016. "Calibration Weighting Methods for Complex Surveys," International Statistical Review, International Statistical Institute, vol. 84(1), pages 79-98, April.
    6. Takis Merkouris, 2004. "Combining Independent Regression Estimators From Multiple Surveys," Journal of the American Statistical Association, American Statistical Association, vol. 99, pages 1131-1139, December.
    7. Michael R. Elliott & William W. Davis, 2005. "Corrigendum: Obtaining cancer risk factor prevalence estimates in small areas: combining data from two surveys," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 54(5), page 958, November.
    8. Raghunathan, Trivellore E. & Xie, Dawei & Schenker, Nathaniel & Parsons, Van L. & Davis, William W. & Dodd, Kevin W. & Feuer, Eric J., 2007. "Combining Information From Two Surveys to Estimate County-Level Prevalence Rates of Cancer Risk Factors and Screening," Journal of the American Statistical Association, American Statistical Association, vol. 102, pages 474-486, June.
    9. Michael R. Elliott & William W. Davis, 2005. "Obtaining cancer risk factor prevalence estimates in small areas: combining data from two surveys," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 54(3), pages 595-609, June.
    10. Jae Kwang Kim & Mingue Park, 2010. "Calibration Estimation in Survey Sampling," International Statistical Review, International Statistical Institute, vol. 78(1), pages 21-39, April.
    11. Torabi, Mahmoud & Rao, J.N.K., 2014. "On small area estimation under a sub-area level model," Journal of Multivariate Analysis, Elsevier, vol. 127(C), pages 36-55.

    Citations

    Cited by:

    1. Andreea L. Erciulescu & Nathan B. Cruze & Balgobin Nandram, 2020. "Statistical Challenges in Combining Survey and Auxiliary Data to Produce Official Statistics," Journal of Official Statistics, Sciendo, vol. 36(1), pages 63-88, March.
    2. Camilla Salvatore, 2023. "Inference with non-probability samples and survey data integration: a science mapping study," METRON, Springer;Sapienza Università di Roma, vol. 81(1), pages 83-107, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Takis Merkouris, 2010. "Combining information from multiple surveys by using regression for efficient small domain estimation," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 72(1), pages 27-48, January.
    2. Yves G. Berger & Ewa Kabzińska, 2020. "Empirical Likelihood Approach for Aligning Information from Multiple Surveys," International Statistical Review, International Statistical Institute, vol. 88(1), pages 54-74, April.
    3. Seho Park & Jae Kwang Kim & Diana Stukel, 2017. "A measurement error model approach to survey data integration: combining information from two surveys," METRON, Springer;Sapienza Università di Roma, vol. 75(3), pages 345-357, December.
    4. Giancarlo Manzi & David J. Spiegelhalter & Rebecca M. Turner & Julian Flowers & Simon G. Thompson, 2011. "Modelling bias in combining small area prevalence estimates from multiple surveys," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 174(1), pages 31-50, January.
    5. Paolo Righi, 2016. "Estimation procedure and inference for component totals of the economic aggregates in the “Frame SBS”," Rivista di statistica ufficiale, ISTAT - Italian National Institute of Statistics - (Rome, ITALY), vol. 18(1), pages 83-97.
    6. Shixiao Zhang & Peisong Han & Changbao Wu, 2023. "Calibration Techniques Encompassing Survey Sampling, Missing Data Analysis and Causal Inference," International Statistical Review, International Statistical Institute, vol. 91(2), pages 165-192, August.
    7. Rasner, Anika & Frick, Joachim R. & Grabka, Markus M., 2013. "Statistical Matching of Administrative and Survey Data: An Application to Wealth Inequality Analysis," EconStor Open Access Articles and Book Chapters, ZBW - Leibniz Information Centre for Economics, vol. 42(2), pages 192-224.
    8. Alessio Guandalini & Yves Tillé, 2017. "Design-based Estimators Calibrated on Estimated Totals from Multiple Surveys," International Statistical Review, International Statistical Institute, vol. 85(2), pages 250-269, August.
    9. Jae‐Kwang Kim & Siu‐Ming Tam, 2021. "Data Integration by Combining Big Data and Survey Sample Data for Finite Population Inference," International Statistical Review, International Statistical Institute, vol. 89(2), pages 382-401, August.
    10. Camilla Salvatore, 2023. "Inference with non-probability samples and survey data integration: a science mapping study," METRON, Springer;Sapienza Università di Roma, vol. 81(1), pages 83-107, April.
    11. Anika Rasner & Joachim R. Frick & Markus M. Grabka, 2013. "Statistical Matching of Administrative and Survey Data," Sociological Methods & Research, vol. 42(2), pages 192-224, May.
    12. K. Shuvo Bakar & Nicholas Biddle & Philip Kokic & Huidong Jin, 2020. "A Bayesian spatial categorical model for prediction to overlapping geographical areas in sample surveys," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 183(2), pages 535-563, February.
    13. Corral Rodas, Paul Andres & Kastelic, Kristen Himelein & Mcgee, Kevin Robert & Molina, Isabel, 2021. "A Map of the Poor or a Poor Map?," Policy Research Working Paper Series 9620, The World Bank.
    14. Song Cai & J. N. K. Rao & Laura Dumitrescu & Golshid Chatrchi, 2020. "Effective transformation-based variable selection under two-fold subarea models in small area estimation," Statistics in Transition New Series, Polish Statistical Association, vol. 21(4), pages 68-83, August.
    15. Stefano M. Iacus & Silvia Salini & Elena Siletti & Giuseppe Porro, 2020. "Controlling for Selection Bias in Social Media Indicators through Official Statistics: a Proposal," Journal of Official Statistics, Sciendo, vol. 36(2), pages 315-338, June.
    16. Denis Devaud & Yves Tillé, 2019. "Deville and Särndal’s calibration: revisiting a 25-years-old successful optimization problem," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 28(4), pages 1033-1065, December.
    17. Tiziana Laureti & Federico Polidoro, 2022. "Using Scanner Data for Computing Consumer Spatial Price Indexes at Regional Level: An Empirical Application for Grocery Products in Italy," Journal of Official Statistics, Sciendo, vol. 38(1), pages 23-56, March.
    18. Frauke Kreuter, 2013. "Facing the Nonresponse Challenge," The ANNALS of the American Academy of Political and Social Science, vol. 645(1), pages 23-35, January.
    19. Gelein, Brigitte & Haziza, David & Causeur, David, 2014. "Preserving relationships between variables with MIVQUE based imputation for missing survey data," Journal of Multivariate Analysis, Elsevier, vol. 131(C), pages 197-208.
    20. Batana, Yele Maweki & Masaki, Takaaki & Nakamura, Shohei & Viboudoulou Vilpoux, Mervy Ever, 2021. "Estimating Poverty in Kinshasa by Dealing with Sampling and Comparability Issues," Policy Research Working Paper Series 9858, The World Bank.
