Estimating absence locations of marine species from data of scientific surveys in OBIS
Estimating absence locations of a species is important in conservation biology and conservation planning. For instance, species distribution models that use reliable absence information alongside presence information can perform better and predict the distribution of a species more accurately. Unfortunately, estimating reliable absence locations is difficult and often requires deep knowledge of the species’ distribution and of its abiotic and biotic environmental preferences and tolerances. In this paper, we propose a methodology to reconstruct reliable absence information from presence-only information, and we identify the conditions that those presence-only data have to meet to make this possible.
Large species occurrence data collections (otherwise called occurrence datasets) contain high-quality, expert-reviewed species observation records from scientific surveys. These surveys can be used to retrieve species presence locations, but they also record places where the species in their target list were not observed. Although these absences could simply be due to sampling variation, it is possible to intersect many of these reports to estimate true absence locations, i.e. those due to habitat unsuitability or geographical hindrances. In this paper, we present a method to generate reliable absence locations of this type for marine species, using scientific survey reports contained in the Ocean Biogeographic Information System (OBIS), an authoritative species occurrence dataset. Our method spatially aggregates information from surveys focussing on the same target species. It detects absence locations for a given species as those locations in which repeated surveys (that included the species of interest in their target list) reported information only on other species. We qualitatively demonstrate the reliability of our method using distribution records of the Atlantic cod as a case study. Additionally, we quantitatively estimate its performance using another authoritative large species occurrence dataset, the Global Biodiversity Information Facility (GBIF). We also demonstrate that our approach is more accurate than, and complementary to, another method based on environmental envelopes. Our process can support species distribution models (as well as other types of models, e.g. climate change models) by providing reliable data to presence/absence approaches. It can manage regional as well as global scale scenarios and runs within a collaborative e-Infrastructure (D4Science) that publishes it as-a-Service, allowing biologists to reproduce, repeat and share experimental results.
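The detection rule described above can be illustrated with a minimal Python sketch. It is not the implementation published on D4Science. The record layout, the `survey_targets` mapping, the grid resolution `res` and the threshold `min_surveys` are all hypothetical choices made for illustration: the sketch bins survey records into grid cells and flags a cell as an absence location when at least `min_surveys` surveys that target the species reported only other species there, and no survey reported the species in that cell.

```python
from collections import defaultdict
from math import floor


def cell_of(lon, lat, res):
    """Map a coordinate to a grid-cell index at `res` degrees (hypothetical grid)."""
    return (floor(lon / res), floor(lat / res))


def estimate_absence_cells(records, survey_targets, species, res=0.5, min_surveys=2):
    """Return the grid cells flagged as absence locations for `species`.

    records        : iterable of (survey_id, species_name, lon, lat) tuples
    survey_targets : dict survey_id -> set of species in the survey's target list
    All parameter names and defaults are assumptions made for this sketch.
    """
    records = list(records)
    # Surveys that include the species of interest in their target list.
    targeting = {sid for sid, targets in survey_targets.items() if species in targets}

    presence_cells = set()           # cells where the species was reported
    other_reports = defaultdict(set)  # cell -> targeting surveys reporting other species

    for sid, sp, lon, lat in records:
        cell = cell_of(lon, lat, res)
        if sp == species:
            presence_cells.add(cell)
        elif sid in targeting:
            other_reports[cell].add(sid)

    # Absence: enough targeting surveys visited the cell, none reported the species.
    return {
        cell
        for cell, sids in other_reports.items()
        if len(sids) >= min_surveys and cell not in presence_cells
    }


# Toy data, invented purely for illustration.
targets = {
    "s1": {"Gadus morhua", "Clupea harengus"},
    "s2": {"Gadus morhua", "Clupea harengus"},
    "s3": {"Clupea harengus"},
}
toy_records = [
    ("s1", "Gadus morhua", 5.2, 60.1),
    ("s2", "Gadus morhua", 5.3, 60.2),
    ("s1", "Clupea harengus", 5.1, 60.3),
    ("s1", "Clupea harengus", 20.1, 35.1),
    ("s2", "Clupea harengus", 20.2, 35.2),
    ("s3", "Clupea harengus", 20.3, 35.3),
    ("s1", "Clupea harengus", -10.2, 50.1),
]
absence = estimate_absence_cells(toy_records, targets, "Gadus morhua")
print(sorted(absence))
```

In this toy example only the cell visited by two targeting surveys without a cod record is flagged. The cell visited by a single targeting survey is not flagged, because one missed observation could be due to sampling variation. Raising `min_surveys` makes the rule stricter at the cost of fewer absence locations.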