Estimating species distributions from spatially biased citizen science data


  • Johnston, Alison
  • Moran, Nick
  • Musgrove, Andy
  • Fink, Daniel
  • Baillie, Stephen R.


Ecological citizen science data are rapidly growing in availability and use in ecology and conservation. Many citizen science projects have the flexibility for participants to select where they survey, resulting in more participants, but also spatially biased data. It is important to assess the extent to which these spatially biased data can provide reliable estimates of species distributions. Here we quantify the extent of site selection bias in a citizen science project and the implications of this spatial bias in species distribution models. Using data from the BirdTrack citizen science project in Great Britain from 2007 to 2011, we modelled the spatial bias of data submissions. We next produced species occupancy models for 138 bird species, and assessed the impact of accounting for spatial bias. We compared the distributions to those produced using unbiased data from an Atlas survey from the same region and time period. Averaging across 138 species, models with spatially biased data produced accurate and precise estimates of species occupancy for most locations in Great Britain. However, these distributions were both less accurate and less precise in the Scottish Highlands, showing on average a positive bias. Accounting for the spatially biased sampling with weights led to on average greater accuracy in the Scottish Highlands, but did not increase precision. This region is both distinct in environmental characteristics and has a low density of observations, making it difficult to characterise environmental relationships with species occupancy. Accounting for the spatially biased sampling did not affect average accuracy or precision throughout most of the country. Spatially biased citizen science data can be used to estimate species occupancy in regions with stationary environmental relationships and good sampling across environmental space. 
The reliability of estimated species distributions from spatially biased data should be further validated and tested under a range of different scenarios.
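
The weighting idea in the abstract can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the authors' occupancy models: it estimates a simple mean occupancy rather than fitting a species distribution model, it treats each site's sampling probability as known (the study modelled the spatial bias from the submissions), and the region names, occupancy rates, and sampling probabilities are hypothetical values chosen only for illustration.

```python
import random

# Hypothetical regions: one heavily surveyed, one sparsely surveyed.
# All numbers here are illustrative assumptions, not values from the study.
REGIONS = {
    "lowland": {"occupancy": 0.2, "p_sample": 0.8},
    "upland": {"occupancy": 0.6, "p_sample": 0.05},
}


def simulate_sites(n_sites=20000, seed=0):
    """Simulate sites whose chance of being surveyed depends on their region."""
    rng = random.Random(seed)
    sites = []
    for i in range(n_sites):
        name = "lowland" if i % 2 == 0 else "upland"
        region = REGIONS[name]
        sites.append({
            "region": name,
            "occupied": 1 if rng.random() < region["occupancy"] else 0,
            "p_sample": region["p_sample"],
            "sampled": rng.random() < region["p_sample"],
        })
    return sites


def weighted_mean(values, weights):
    """Weighted average of values; weights need not sum to one."""
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total


def occupancy_estimates(sites):
    """Return (true mean occupancy, naive estimate, inverse-probability-weighted estimate)."""
    truth = sum(s["occupied"] for s in sites) / len(sites)
    visited = [s for s in sites if s["sampled"]]
    observed = [s["occupied"] for s in visited]
    # Naive: every surveyed site counts equally, so the well-surveyed region dominates.
    naive = weighted_mean(observed, [1.0] * len(observed))
    # Weighted: each surveyed site is up-weighted by 1 / its sampling probability.
    weighted = weighted_mean(observed, [1.0 / s["p_sample"] for s in visited])
    return truth, naive, weighted


truth, naive, weighted = occupancy_estimates(simulate_sites())
print(f"truth={truth:.3f} naive={naive:.3f} weighted={weighted:.3f}")
```

In this toy setting the unweighted estimate is pulled toward the heavily surveyed region, while the weighted estimate is closer to the true mean but depends on only a few upland sites. That is loosely analogous to the abstract's finding that weighting improved accuracy in the Scottish Highlands without improving precision.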

Suggested Citation

  • Johnston, Alison & Moran, Nick & Musgrove, Andy & Fink, Daniel & Baillie, Stephen R., 2020. "Estimating species distributions from spatially biased citizen science data," Ecological Modelling, Elsevier, vol. 422(C).
  • Handle: RePEc:eee:ecomod:v:422:y:2020:i:c:s0304380019304351
    DOI: 10.1016/j.ecolmodel.2019.108927



