
Using Inverse Probability Bootstrap Sampling to Eliminate Sample Induced Bias in Model Based Analysis of Unequal Probability Samples

Author

Listed:
  • Matthew Nahorniak
  • David P Larsen
  • Carol Volk
  • Chris E Jordan

Abstract

In ecology, as in other research fields, efficient sampling for population estimation often drives sample designs toward unequal probability sampling, such as stratified sampling. Design based statistical analysis tools are appropriate for seamless integration of sample design into the statistical analysis. However, after a sampling design has been implemented, it is also common and necessary to use the resulting datasets to address questions that, in many cases, were not considered during the sampling design phase. Such questions may require model based statistical tools such as multiple regression, quantile regression, or regression tree analysis. To ensure unbiased estimation, these model based tools may require data from simple random samples, which is problematic when analyzing data from unequal probability designs. Despite numerous method specific tools available to properly account for sampling design, too often in the analysis of ecological data the sample design is ignored and the consequences are not properly considered. We demonstrate here that violating the simple random sampling assumption can lead to biased parameter estimates in ecological research. To the set of tools available to researchers for properly accounting for sampling design in model based analysis, we add inverse probability bootstrapping (IPB). Inverse probability bootstrapping is an easily implemented method for obtaining equal probability re-samples from a probability sample, from which unbiased model based estimates can be made. We demonstrate the potential for bias in model based analyses that ignore sample inclusion probabilities, and the effectiveness of IPB sampling in eliminating this bias, using both simulated and actual ecological data. For illustration, we considered three model based analysis tools: linear regression, quantile regression, and boosted regression tree analysis. In all models, using both simulated and actual ecological data, we found inferences to be biased, sometimes severely, when sample inclusion probabilities were ignored, while IPB sampling effectively produced unbiased parameter estimates.
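
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: it assumes that an IPB re-sample is drawn with replacement from the original sample with selection probability proportional to the inverse of each unit's inclusion probability, that each re-sample has the same size as the original sample, and that the final estimate is the mean over re-samples. The function names (ipb_resample, ols_slope, ipb_estimate) and the simulated population are hypothetical and serve only as an illustration.

```python
import numpy as np


def ipb_resample(inclusion_prob, size, rng):
    """Draw row indices with replacement, P(select i) proportional to 1 / pi_i."""
    w = 1.0 / np.asarray(inclusion_prob, dtype=float)
    return rng.choice(len(w), size=size, replace=True, p=w / w.sum())


def ols_slope(x, y):
    """Slope of the least-squares line y = a + b * x."""
    return np.polyfit(x, y, 1)[0]


def ipb_estimate(x, y, inclusion_prob, n_boot=200, seed=0):
    """Average the fitted slope over n_boot inverse probability re-samples.

    Assumption: each re-sample has the same size as the original sample.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    slopes = np.empty(n_boot)
    for b in range(n_boot):
        idx = ipb_resample(inclusion_prob, n, rng)
        slopes[b] = ols_slope(x[idx], y[idx])
    return slopes.mean()


# Hypothetical simulated population with a true slope of 3.
rng = np.random.default_rng(42)
N = 100_000
x_pop = rng.uniform(0.0, 1.0, N)
y_pop = 2.0 + 3.0 * x_pop + rng.normal(0.0, 1.0, N)

# Unequal probability design: units with a large response are five times
# more likely to be sampled, so the design is informative for the model.
pi_pop = np.where(y_pop > np.median(y_pop), 0.10, 0.02)
sampled = rng.random(N) < pi_pop  # independent inclusion of each unit
x, y, pi = x_pop[sampled], y_pop[sampled], pi_pop[sampled]

naive_slope = ols_slope(x, y)       # ignores inclusion probabilities
ipb_slope = ipb_estimate(x, y, pi)  # re-samples with weights 1 / pi

print(f"true slope 3.0, naive {naive_slope:.2f}, IPB {ipb_slope:.2f}")
```

In this setup the naive fit treats the unequal probability sample as if it were a simple random sample, while the IPB fit works on re-samples in which every population unit is, under the stated assumptions, approximately equally likely to appear. Any model based tool (for example quantile regression or a regression tree) could replace ols_slope in the loop.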

Suggested Citation

  • Matthew Nahorniak & David P Larsen & Carol Volk & Chris E Jordan, 2015. "Using Inverse Probability Bootstrap Sampling to Eliminate Sample Induced Bias in Model Based Analysis of Unequal Probability Samples," PLOS ONE, Public Library of Science, vol. 10(6), pages 1-19, June.
  • Handle: RePEc:plo:pone00:0131765
    DOI: 10.1371/journal.pone.0131765

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0131765
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0131765&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Stevens, Don L. & Olsen, Anthony R., 2004. "Spatially Balanced Sampling of Natural Resources," Journal of the American Statistical Association, American Statistical Association, vol. 99, pages 262-278, January.
    2. Stephen Howes & Jean Olson Lanjouw, 1998. "Does Sample Design Matter For Poverty Rate Comparisons?," Review of Income and Wealth, International Association for Research in Income and Wealth, vol. 44(1), pages 99-109, March.

    Citations


    Cited by:

    1. Steven L Van Wilgenburg & C Lisa Mahon & Greg Campbell & Logan McLeod & Margaret Campbell & Dean Evans & Wendy Easton & Charles M Francis & Samuel Haché & Craig S Machtans & Caitlin Mader & Rhiannon F, 2020. "A cost efficient spatially balanced hierarchical sampling design for monitoring boreal birds incorporating access costs and habitat stratification," PLOS ONE, Public Library of Science, vol. 15(6), pages 1-28, June.
    2. Buil-Gil, David & Solymosi, Reka & Moretti, Angelo, 2019. "Non-parametric bootstrap and small area estimation to mitigate bias in crowdsourced data. Simulation study and application to perceived safety," SocArXiv 8hgjt, Center for Open Science.
    3. McHugh, Peter A. & Saunders, W. Carl & Bouwes, Nicolaas & Wall, C. Eric & Bangen, Sara & Wheaton, Joseph M. & Nahorniak, Matthew & Ruzycki, James R. & Tattam, Ian A. & Jordan, Chris E., 2017. "Linking models across scales to assess the viability and restoration potential of a threatened population of steelhead (Oncorhynchus mykiss) in the Middle Fork John Day River, Oregon, USA," Ecological Modelling, Elsevier, vol. 355(C), pages 24-38.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Tomasz Bąk, 2021. "Spatial sampling methods modified by model use," Statistics in Transition New Series, Polish Statistical Association, vol. 22(2), pages 143-154, June.
    2. Lorenzo Fattorini & Timothy G. Gregoire & Sara Trentini, 2018. "The Use of Calibration Weighting for Variance Estimation Under Systematic Sampling: Applications to Forest Cover Assessment," Journal of Agricultural, Biological and Environmental Statistics, Springer;The International Biometric Society;American Statistical Association, vol. 23(3), pages 358-373, September.
    3. Jha, R., 2000. "Reducing Poverty and Inequality in India: Has Liberalization Helped?," Research Paper 204, World Institute for Development Economics Research.
    4. Tiehen, Laura & Jolliffe, Dean & Gundersen, Craig, 2012. "Alleviating Poverty in the United States: The Critical Role of SNAP Benefits," Economic Research Report 262233, United States Department of Agriculture, Economic Research Service.
    5. Pommerening, Arne & Szmyt, Janusz & Zhang, Gongqiao, 2020. "A new nearest-neighbour index for monitoring spatial size diversity: The hyperbolic tangent index," Ecological Modelling, Elsevier, vol. 435(C).
    6. Tim Goedemé & Karel Van den Bosch & Lina Salanauskaite & Gerlinde Verbist, 2013. "Testing the Statistical Significance of Microsimulation Results: Often Easier than You Think. A Technical Note," ImPRovE Working Papers 13/10, Herman Deleeck Centre for Social Policy, University of Antwerp.
    7. Stephen P. Jenkins & John Micklewright, 2007. "New Directions in the Analysis of Inequality and Poverty," Discussion Papers of DIW Berlin 700, DIW Berlin, German Institute for Economic Research.
    8. Wen-Hao Chen & Jean-Yves Duclos, 2011. "Testing for poverty dominance: an application to Canada," Canadian Journal of Economics, Canadian Economics Association, vol. 44(3), pages 781-803, August.
    9. Jolliffe, Dean Mitchell & Serajuddin, Umar, 2015. "Estimating poverty with panel data, comparably: an example from Jordan," Policy Research Working Paper Series 7373, The World Bank.
    10. Jenkins, Stephen P. & Biewen, Martin, 2002. "Accounting for poverty differences between the United States, Great Britain and Germany," ISER Working Paper Series 2002-14, Institute for Social and Economic Research.
    11. Anton Grafström & Niklas L. P. Lundström & Lina Schelin, 2012. "Spatially Balanced Sampling through the Pivotal Method," Biometrics, The International Biometric Society, vol. 68(2), pages 514-520, June.
    12. Raphaël Jauslin & Bardia Panahbehagh & Yves Tillé, 2022. "Sequential spatially balanced sampling," Environmetrics, John Wiley & Sons, Ltd., vol. 33(8), December.
    13. Dean Jolliffe, 2001. "Estimating Sampling Variance from the Current Population Survey: A Synthetic Design Approach to Correcting Standard Errors," Econometrics 0110006, University Library of Munich, Germany, revised 20 Oct 2001.
    14. Xin Zhao & Anton Grafström, 2020. "A sample coordination method to monitor totals of environmental variables," Environmetrics, John Wiley & Sons, Ltd., vol. 31(6), September.
    15. Miguel Szekely & Nora Lustig & Martin Cumpa & Jose Antonio Mejia, 2004. "Do we know how much poverty there is?," Oxford Development Studies, Taylor & Francis Journals, vol. 32(4), pages 523-558.
    16. Tim Goedemé, 2013. "How much Confidence can we have in EU-SILC? Complex Sample Designs and the Standard Error of the Europe 2020 Poverty Indicators," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 110(1), pages 89-110, January.
    17. Christophe Muller, 2008. "The Measurement Of Poverty With Geographical And Intertemporal Price Dispersion: Evidence From Rwanda," Review of Income and Wealth, International Association for Research in Income and Wealth, vol. 54(1), pages 27-49, March.
    18. Thomas Demuynck & Dirk Van de gaer, 2012. "Inequality Adjusted Income Growth," Economica, London School of Economics and Political Science, vol. 79(316), pages 747-765, October.
    19. Huan Xie & Fang Wang & Yali Gong & Xiaohua Tong & Yanmin Jin & Ang Zhao & Chao Wei & Xinyi Zhang & Shicheng Liao, 2022. "Spatially Balanced Sampling for Validation of GlobeLand30 Using Landscape Pattern-Based Inclusion Probability," Sustainability, MDPI, vol. 14(5), pages 1-19, February.
    20. Linda Altieri & Daniela Cocchi, 2021. "Spatial Sampling for Non‐compact Patterns," International Statistical Review, International Statistical Institute, vol. 89(3), pages 532-549, December.
