Printed from https://ideas.repec.org/a/vrs/offsta/v35y2019i3p653-681n7.html

Supplementing Small Probability Samples with Nonprobability Samples: A Bayesian Approach

Author

Listed:
  • Sakshaug Joseph W.

    (University of Mannheim and Institute for Employment Research, Nuremberg, 90478 Germany.)

  • Wiśniowski Arkadiusz
  • Ruiz Diego Andres Perez

    (University of Manchester, Manchester, M13 9PL United Kingdom.)

  • Blom Annelies G.

    (School of Social Sciences, University of Mannheim, Mannheim, 68131 Germany.)

Abstract

Carefully designed probability-based sample surveys can be prohibitively expensive to conduct. As such, many survey organizations have shifted away from using expensive probability samples in favor of less expensive, but possibly less accurate, nonprobability web samples. However, the lower costs and abundant availability of nonprobability samples make them a potentially useful supplement to traditional probability-based samples. We examine this notion by proposing a method of supplementing small probability samples with nonprobability samples using Bayesian inference. We consider two semi-conjugate informative prior distributions for linear regression coefficients based on nonprobability samples: the first accounts for the distance between maximum likelihood coefficients derived from parallel probability and nonprobability samples, and the second depends on the variability and size of the nonprobability sample. The method is evaluated in comparison with a reference prior through simulations and a real-data application involving multiple probability and nonprobability surveys fielded simultaneously using the same questionnaire. We show that the method reduces the variance and mean-squared error (MSE) of coefficient estimates and model-based predictions relative to probability-only samples. Using actual and assumed cost data, we also show that the method can yield substantial cost savings (up to 55%) for a fixed MSE.
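
The idea of using a nonprobability sample to build an informative prior for regression coefficients can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes simulated data, a normal prior centered on the nonprobability least-squares estimate, and a plug-in value for the error variance instead of a full semi-conjugate sampler. The two prior variances below (squared distance between the two estimates, and the sampling variance of the nonprobability estimate) are assumed forms chosen only to mirror the abstract's verbal description. All function and variable names are hypothetical.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients, residual variance, and coefficient covariance."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta, sigma2, cov

def posterior_given_sigma2(X, y, sigma2, prior_mean, prior_var):
    """Posterior of beta for a normal prior with diagonal variance,
    conditional on a fixed error variance (a simplifying assumption)."""
    prior_prec = np.diag(1.0 / prior_var)
    post_cov = np.linalg.inv(prior_prec + X.T @ X / sigma2)
    post_mean = post_cov @ (prior_prec @ prior_mean + X.T @ y / sigma2)
    return post_mean, post_cov

# Simulated data (assumption): the nonprobability sample is large but biased.
rng = np.random.default_rng(0)
beta_true = np.array([1.0, 2.0])

def simulate(n, bias):
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    y = X @ (beta_true + bias) + rng.normal(size=n)
    return X, y

X_p, y_p = simulate(50, bias=np.zeros(2))                  # small probability sample
X_np, y_np = simulate(2000, bias=np.array([0.3, -0.2]))    # nonprobability sample

b_p, s2_p, cov_p = ols(X_p, y_p)
b_np, s2_np, cov_np = ols(X_np, y_np)

# Prior 1 (assumed form): variance grows with the squared distance between
# the probability and nonprobability estimates; a floor avoids division by zero.
var_distance = np.maximum((b_p - b_np) ** 2, 1e-12)
mean_1, cov_1 = posterior_given_sigma2(X_p, y_p, s2_p, b_np, var_distance)

# Prior 2 (assumed form): variance reflects the variability and size of the
# nonprobability sample through the sampling variance of its estimate.
var_np = np.diag(cov_np)
mean_2, cov_2 = posterior_given_sigma2(X_p, y_p, s2_p, b_np, var_np)

print("probability-only variances:", np.diag(cov_p))
print("posterior variances, prior 1:", np.diag(cov_1))
print("posterior variances, prior 2:", np.diag(cov_2))
```

In this sketch, any proper prior adds precision, so the posterior variances fall below the probability-only variances. Whether MSE also falls depends on how biased the nonprobability sample is, which is the trade-off the distance-based prior is meant to address.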

Suggested Citation

  • Sakshaug Joseph W. & Wiśniowski Arkadiusz & Ruiz Diego Andres Perez & Blom Annelies G., 2019. "Supplementing Small Probability Samples with Nonprobability Samples: A Bayesian Approach," Journal of Official Statistics, Sciendo, vol. 35(3), pages 653-681, September.
  • Handle: RePEc:vrs:offsta:v:35:y:2019:i:3:p:653-681:n:7
    DOI: 10.2478/jos-2019-0027

    Download full text from publisher

    File URL: https://doi.org/10.2478/jos-2019-0027
    Download Restriction: no


    References listed on IDEAS

    1. Malhotra, Neil & Krosnick, Jon A., 2007. "The Effect of Survey Mode and Sampling on Inferences about Political Attitudes and Behavior: Comparing the 2000 and 2004 ANES to Internet Surveys with Nonprobability Samples," Political Analysis, Cambridge University Press, vol. 15(3), pages 286-323, July.
    2. Ansolabehere, Stephen & Schaffner, Brian F., 2014. "Does Survey Mode Still Matter? Findings from a 2010 Multi-Mode Comparison," Political Analysis, Cambridge University Press, vol. 22(3), pages 285-303, July.
    3. Carl P. Schmertmann & Suzana M. Cavenaghi & Renato M. Assunção & Joseph E. Potter, 2013. "Bayes plus Brass: Estimating total fertility for many small areas from sparse census data," Population Studies, Taylor & Francis Journals, vol. 67(3), pages 255-273, November.
    4. Sturtz, Sibylle & Ligges, Uwe & Gelman, Andrew, 2005. "R2WinBUGS: A Package for Running WinBUGS from R," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 12(i03).
    5. Stefano Marchetti & Caterina Giusti & Monica Pratesi, 2016. "The use of Twitter data to improve small area estimates of households’ share of food consumption expenditure in Italy [Die Nutzung von Twitter Daten um die Small Area Schätzungen vom Ausgabenanteil," AStA Wirtschafts- und Sozialstatistisches Archiv, Springer;Deutsche Statistische Gesellschaft - German Statistical Society, vol. 10(2), pages 79-93, October.
    6. David Briggs & Daniela Fecht & Kees De Hoogh, 2007. "Census data issues for epidemiology and health risk assessment: experiences from the Small Area Health Statistics Unit," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 170(2), pages 355-378, March.
    7. Wang, Wei & Rothschild, David & Goel, Sharad & Gelman, Andrew, 2015. "Forecasting elections with non-representative polls," International Journal of Forecasting, Elsevier, vol. 31(3), pages 980-991.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Kolcava, Dennis, 2020. "Do citizens hold business accountable for greenwashing by demanding more government intervention?," OSF Preprints sj4dk, Center for Open Science.
    2. Lachaud, Michée A. & Bravo-Ureta, Boris E., 2022. "A Bayesian statistical analysis of return to agricultural R&D investment in Latin America: Implications for food security," Technology in Society, Elsevier, vol. 70(C).
    3. Yu, Jun, 2012. "A semiparametric stochastic volatility model," Journal of Econometrics, Elsevier, vol. 167(2), pages 473-482.
    4. Kevin J. Boyle & Mark Morrison & Darla Hatton MacDonald & Roderick Duncan & John Rose, 2016. "Investigating Internet and Mail Implementation of Stated-Preference Surveys While Controlling for Differences in Sample Frames," Environmental & Resource Economics, Springer;European Association of Environmental and Resource Economists, vol. 64(3), pages 401-419, July.
    5. Mark Richard & Jan Vecer, 2021. "Efficiency Testing of Prediction Markets: Martingale Approach, Likelihood Ratio and Bayes Factor Analysis," Risks, MDPI, vol. 9(2), pages 1-20, February.
    6. Liang, Zhongyao & Qian, Song S. & Wu, Sifeng & Chen, Huili & Liu, Yong & Yu, Yanhong & Yi, Xuan, 2019. "Using Bayesian change point model to enhance understanding of the shifting nutrients-phytoplankton relationship," Ecological Modelling, Elsevier, vol. 393(C), pages 120-126.
    7. Lasse J. Jessen & Sebastian Koehne & Patrick Nüß & Jens Ruhose, 2024. "Socioeconomic Inequality in Life Expectancy: Perception and Policy Demand," CESifo Working Paper Series 10940, CESifo.
    8. Karytsas, Spyridon & Theodoropoulou, Helen, 2014. "Socioeconomic and demographic factors that influence publics' awareness on the different forms of renewable energy sources," Renewable Energy, Elsevier, vol. 71(C), pages 480-485.
    9. Tom Wilson & Irina Grossman & Monica Alexander & Phil Rees & Jeromey Temple, 2022. "Methods for Small Area Population Forecasts: State-of-the-Art and Research Needs," Population Research and Policy Review, Springer;Southern Demographic Association (SDA), vol. 41(3), pages 865-898, June.
    10. Qian Wu & Monique Vanerum & Anouk Agten & Andrés Christiansen & Frank Vandenabeele & Jean-Michel Rigo & Rianne Janssen, 2021. "Certainty-Based Marking on Multiple-Choice Items: Psychometrics Meets Decision Theory," Psychometrika, Springer;The Psychometric Society, vol. 86(2), pages 518-543, June.
    11. Horrocks, Julie & Rueffer, Matthew, 2014. "A Bayesian approach to estimating animal density from binary acoustic transects," Computational Statistics & Data Analysis, Elsevier, vol. 80(C), pages 17-25.
    12. Helbling, Marc & Jungkunz, Sebastian, 2020. "Social divides in the age of globalization," EconStor Open Access Articles and Book Chapters, ZBW - Leibniz Information Centre for Economics, vol. 43(6), pages 1187-1210.
    13. Aaron C. Sparks & Heather Hodges & Sarah Oliver & Eric R. A. N. Smith, 2020. "Confidence in Local, National, and International Scientists on Climate Change," Sustainability, MDPI, vol. 13(1), pages 1-13, December.
    14. Lindhjem, Henrik & Navrud, Ståle, 2011. "Using Internet in Stated Preference Surveys: A Review and Comparison of Survey Modes," International Review of Environmental and Resource Economics, now publishers, vol. 5(4), pages 309-351, September.
    15. Eugenia Buta & Stephanie S. O’Malley & Ralitza Gueorguieva, 2018. "Bayesian joint modelling of longitudinal data on abstinence, frequency and intensity of drinking in alcoholism trials," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 181(3), pages 869-888, June.
    16. Millington, James D.A. & Walters, Michael B. & Matonis, Megan S. & Liu, Jianguo, 2013. "Filling the gap: A compositional gap regeneration model for managed northern hardwood forests," Ecological Modelling, Elsevier, vol. 253(C), pages 17-27.
    17. Frijters, Paul & Barón, Juan D., 2009. "Do the Obese Really Die Younger or Do Health Expenditures Buy Them Extra Years?," IZA Discussion Papers 4149, Institute of Labor Economics (IZA).
    18. Piatak Jaclyn, 2023. "Do Sociocultural Factors Drive Civic Engagement? An Examination of Political Interest and Religious Attendance," Nonprofit Policy Forum, De Gruyter, vol. 14(2), pages 185-204, April.
    19. Temporão, Mickael & Dufresne, Yannick & Savoie, Justin & Linden, Clifton van der, 2019. "Crowdsourcing the vote: New horizons in citizen forecasting," International Journal of Forecasting, Elsevier, vol. 35(1), pages 1-10.
    20. Andrew Gelman & Christian Hennig, 2017. "Beyond subjective and objective in statistics," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 180(4), pages 967-1033, October.
