Assessing the usefulness of citizen science data for habitat suitability modelling: Opportunistic reporting versus sampling based on a systematic protocol
Peer reviewed, Journal article
Aim: To evaluate the potential of models based on opportunistic reporting (OR) compared to models based on data from a systematic protocol (SP) for modelling species distributions. We compared model performance for eight forest bird species with contrasting spatial distributions, habitat requirements and rarity. Differences in the reporting of species were also assessed. Finally, we tested potential improvement of models when inferring high-quality absences from OR based on questionnaires sent to observers.

Location: Both datasets cover the same large area (Sweden) and time period (2000–2013).

Methods: Species distributions were modelled using logistic regression. Predictive performance of OR models to predict SP data was assessed based on AUC. We quantified the congruence in spatial predictions using Spearman's rank correlation coefficient. We related these results to species characteristics and reporting behaviour of observers. We also assessed the gain in predictive performance of OR models by adding inferred absences. Finally, we investigated the potential impact of sampling bias in OR.

Results: For all species, and despite the sampling biases, results from OR overall agreed well with those of SP, for the nationwide spatial congruence of habitat suitability maps and the selection and directions of species–environment relationships. The OR models also performed well in predicting the SP data. The predictive performance of the OR models increased with species rarity and even outperformed the SP model for the rarest species. No significant impact of observer behaviour was found.

Main conclusions: Relatively simple analyses with inferred absences could produce reliable spatial predictions of habitat suitability. This was especially true for rare species. OR data should be seen as a complement to SP, as the weakness of one is the strength of the other, and OR may be especially useful at large spatial scales or where no systematic data collection protocols exist.
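The two evaluation measures named in the Methods, AUC and Spearman's rank correlation, can be illustrated with a minimal sketch. The abstract does not say which software was used, so this is plain Python under stated assumptions: the function names (`average_ranks`, `auc`, `spearman_rho`) and the six toy sites with their values are hypothetical and are not taken from the study.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) formulation."""
    ranks = average_ranks(scores)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def spearman_rho(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy data for six sites (illustration only, not study data).
sp_presence = [0, 0, 1, 0, 1, 1]                         # SP observations (1 = present)
or_suitability = [0.10, 0.45, 0.40, 0.20, 0.80, 0.65]    # predictions of an OR model
sp_suitability = [0.15, 0.30, 0.55, 0.25, 0.70, 0.75]    # predictions of an SP model

# How well the OR model's predictions discriminate SP presences from absences.
or_auc = auc(sp_presence, or_suitability)
# How closely the two suitability maps agree in their ranking of sites.
map_congruence = spearman_rho(or_suitability, sp_suitability)
print(or_auc, map_congruence)
```

In this reading, AUC measures how well predictions from the OR model separate presences from absences in the SP data, while Spearman's rho compares how the two models rank the same sites by suitability.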