Carrascal Et Al-2015-Diversity and Distributions
(2015) 1–11
1Department of Biogeography and Global Change, Museo Nacional de Ciencias Naturales, CSIC. C/ José Gutiérrez Abascal 2, 28006 Madrid, Spain, 2Wildlife Consultor, C/ Candanchú 18, 28440 Guadarrama, Spain
ABSTRACT
Aim Quantifying species abundances is costly, especially when many species are involved. To overcome this problem, several studies have predicted local abundances (at the sample unit level) from species occurrence distribution models (SODMs), with differences in predictive performance among studies. Surprisingly, the ability of SODM to predict regional abundances of an entire area of interest has never been tested, despite the fact that it is an essential parameter for species conservation and management. We tested whether local and regional abundances of 21 terrestrial bird species could be predicted from SODMs in an exhaustively surveyed island, and examined the variation explained by species-specific traits.
DOI: 10.1111/ddi.12368
© 2015 John Wiley & Sons Ltd http://wileyonlinelibrary.com/journal/ddi
L. M. Carrascal et al.
other biodiversity components (Chapin et al., 2000). However, quantifying species abundances is challenging because it is costly in terms of time, and human and economic resources. In contrast, the number of studies analysing variables derived from species presences is becoming disproportionately higher than those making use of measures of abundance (Guisan & Thuiller, 2005; Rodríguez et al., 2007). To overcome this problem, several studies aimed to predict species abundance from species occurrence distribution models (SODMs; e.g. Conlisk et al., 2009 and references therein). Thus, successfully linking distributional occurrence data with abundance through relevant factors should provide a useful tool because species presence data are easier to obtain, which opens the possibility of coordinating volunteer programs in field survey designs.
However, the extent to which SODM outputs are able to precisely predict local abundances or densities remains controversial (Pearce & Ferrier, 2001; Nielsen et al., 2005; Jiménez-Valverde et al., 2009; Estrada & Arroyo, 2012; Van Couwenberghe et al., 2013; Bean et al., 2014; Thuiller et al., 2014; Yañez-Arenas et al., 2014; Russell et al., 2015). Despite potential limitations of SODMs to account for local abundances (Pearce & Ferrier, 2001; Nielsen et al., 2005), their ability to predict the total count of individuals in the whole study area (hereafter regional abundance) is unknown. When the central limit theorem holds (Grinstead & Snell, 1997), local overpredictions and underpredictions can be counteracted because they are randomly and equally distributed. In this case, regional abundances could be accurately predicted even in cases of moderate ability of SODMs to predict local abundances.
Among the SODM types, there are also differences regarding the difficulty of obtaining distributional data, mainly depending on whether they make use of presence/absence or only true presences. Recording presence/absence data requires greater survey effort than presence-only data because uncertainties associated with absences are greater (Jiménez-Valverde et al., 2008). Moreover, part of the variability regarding the ability of SODM to predict abundances might be influenced by whether or not true absences are assumed (Nielsen et al., 2005; VanDerWal et al., 2009). Therefore, elucidating the extent to which obtaining absence data merits additional survey efforts requires resolving the trade-off between feasibility and effectiveness when predicting abundances from SODMs. Comparisons between SODM outputs, considering whether or not they include reliable absence data, may help to examine the variability in the relationships between probability/suitability values and abundance estimations.
In the same way, species-specific traits linked with natural history are also sources of variability in model accuracy to predict species' distributions and abundances. This interspecific variability limits the predictive power of modelling exercises, a limitation that cannot always be overcome by mere statistical refinements (Seoane et al., 2005). Several studies have shown that ecological and natural history traits of species may predict the errors in SODMs (Boone & Krohn, 1999; Kadmon et al., 2003; Carrascal et al., 2006). For example, modelling success is inversely related to spatial variability (mobility and nomadism) and niche breadth, although the observed patterns are not consistent across all biological groups (Pearce & Ferrier, 2000; Pearce et al., 2001). Similar species-specific variations in modelling success have been found considering the positive effects of commonness, abundance and detectability (Boone & Krohn, 1999; Kadmon et al., 2003). Therefore, the analysis of the association between species' biological traits and model accuracy is useful because if we know the effect of specific traits on modelling results, we can improve the sampling design for multispecies studies (Seoane et al., 2005).
In this study, we examined whether local and regional abundances of a group of terrestrial bird species can be predicted from SODMs in La Palma, a Macaronesian island in the Canary archipelago. An exhaustive field survey was carried out to record presence/absence data and abundances of twenty-one bird species throughout a representative sample of transects encompassing the spatial and environmental range of the island. First, for each species, we built distribution models for La Palma Island using presence/absence or presence/background data as a function of relevant environmental predictors. Second, we compared the ability of these two types of models to predict the observed local abundances of the studied bird species. Third, we used the type of SODM that derived better local predictions to obtain estimations of regional abundances. For this purpose, SODM outputs were converted to abundances by means of a well-founded procedure proposed in the early seventies, the binomial sampling to estimate average densities (Gerrard & Chiang, 1970). This conversion has been rarely applied for organisms other than arthropods but merits further evaluation, because it does not require complex parameterizations. Our predictions of regional abundances were then evaluated using the total number of birds recorded in the field. Fourth, we performed an analysis including all species to elucidate species-specific traits that can potentially explain the interspecific variation in the regional abundance estimations. To our knowledge, this is the first time that the ability of SODM to predict species regional abundances has been examined.
METHODS
Study area
The study area is located in La Palma (28°42′ N, 17°50′ W; 706 km²), a young (1–2 Myr) oceanic island of the Canary archipelago located 417 km from the African coast. It is a high island (2426 m a.s.l.), with extensive areas with annual precipitation higher than 600 mm, and with a widespread representation of native shrublands and pine and evergreen 'laurisilva' forests (although natural cover has been much reduced since humans occupied the islands: de Nascimento et al., 2009). A considerable proportion of island area below 1100 m a.s.l. has been highly transformed by agricultural
2 Diversity and Distributions, 1–11, ª 2015 John Wiley & Sons Ltd
Regional abundances from occurrence data
Abundance estimations
topography. All these data were also obtained for the centre of all UTM 500 m × 500 m squares of La Palma Island (n = 3263).
Species occurrence distribution models
Boosted classification trees (BCT) were employed to assess the probability of occurrence (presence-1/absence-0) of each species in the sample of 437 transects of 0.5-km using the 16 formerly mentioned predictor variables. The BCT algorithm builds a number of regression trees (typically hundreds) in a stagewise fashion on randomly selected subsets of data and combines them to improve predictive performance (see for details: De'ath, 2007; Elith et al., 2008). We used a fivefold cross-validation approach to test the accuracy of predictions of BCT models. As outputs from boosting are not well calibrated, posterior probability predictions of BCT models were calibrated applying a logit function to transform boosting predictions with a sigmoid function (Niculescu-Mizil & Caruana, 2005).
To compare BCT predictions with those provided when accurate absence data do not exist, occurrences for each species were also modelled using the MaxEnt algorithm (Phillips et al., 2006; Phillips & Dudík, 2008). We selected this modelling technique because it is a widely used procedure when only presences are available, and is also a machine learning method. As in the classic resource selection functions of use–availability designs (Manly et al., 2002), MaxEnt generates suitability outputs from presence data and a pool of background absences selected at random from the study area using a maximum entropy approach (Pearce & Boyce, 2006; Phillips et al., 2006). In our case, these background absences were selected out of the UTM squares in which transects occur and equal in numbers to those used in BCT models for each species (range: 44–420; average = 335). This approach has been chosen to (1) avoid the use of true absences as background absences and (2) ease the comparison of model outputs using an identical number of true absences (in BCT) and background absences (in MaxEnt). Moreover, for MaxEnt, the fivefold data split into training and testing subsets was the same as for the BCT models within each species. Thus, our data arrangement will enable more direct inferences regarding the use of reliable absences in models while keeping other sources of intermodel variability as fixed as possible.
The discrimination ability of BCT and MaxEnt models to predict each species' distribution was compared through the area under the curve (AUC) of the receiver operating characteristic (ROC) plot of sensitivity against 1-specificity (Fielding & Bell, 1997). AUC values should not be interpreted uncritically, and one of the major misuses is relying on absolute values to compare among species with different prevalences (Lobo et al., 2008). In spite of this, its use in a relative way may be useful to compare among modelling techniques within species with identical prevalences (Aragón & Sánchez-Fernández, 2013).
Predicting local and regional abundances from occurrence distribution models
Firstly, we aimed to assess the general ability of presence/absence models (BCT) and presence/background absence models (MaxEnt) to predict local abundances at the transect level. For this purpose, we estimated separately for each species the Pearson correlations of the relationships between observed abundances in transects and SODM outputs (the predicted habitat suitabilities using MaxEnt or probabilities of occurrence using BCT). Sequential Bonferroni adjustment was applied to these analyses to control for type I errors (Benjamini & Hochberg, 1995). Then, we used a paired t-test to compare the Pearson correlation coefficients obtained for each species separately with BCT outputs and those obtained with MaxEnt outputs. In addition, we assessed the degree of triangularity in the relationships between observed local abundances and model outputs separately for each species (see Appendix S1).
As use–availability models, such as MaxEnt, are unable to predict the probability of occurrence (Hastie & Fithian, 2013), BCT probabilities of occurrence were subsequently used to obtain regional abundance estimations. For this purpose, we firstly converted the probabilities of occurrence to bird numbers applying a procedure that has been shown to be appropriate for the case of outputs from presence/absence models. The predicted probabilities of occurrence for each transect (Pi) derived from BCT models were converted to predicted bird numbers for each species (n′i) using the following expression under the assumption of random distributions with Poisson distributed populations (Gerrard & Chiang, 1970; Gerrard & Cook, 1972):
$$n'_i = -\ln(1 - P_i)$$
The summation of the predicted n′i figures for each species (∑n′i) was used to estimate its resemblance to the true number of birds counted in the whole sample (∑ni) of 437 transects that equal 218.5 km. These numbers were transformed into regional densities (DENREG; birds km⁻²) considering the above-mentioned formula by Järvinen & Väisänen (1975). Finally, we performed a Pearson correlation to estimate the relationship between predicted and observed regional densities for the 21 bird species recorded. Additionally, we used t-tests to assess whether this predicted regression line deviated significantly from the equality between the observed and predicted densities.
Interspecific variation in prediction accuracy of regional density
Interspecific variation in the prediction accuracy of densities using BCT models was characterized by calculating the percentage difference between predicted and observed regional densities in relation to observed regional density (hereafter % change). The thus obtained % change was then related to several autoecological traits of the species: species prevalence
in the whole sample of transects (range: 0.04–0.90), coefficient of variation in bird numbers when each species was present (30–167%), a surrogate of detectability (as measured by the ratio p of main belt to total belt observations of each bird species – see above; range: 0.22–0.89), body mass (5.8–480 g; obtained from Perrins, 1998 as the mean weight of males and females, or as the average value of body weight range in spring and summer), and habitat breadth (0.14–0.74) and ecological density (3.5–248.1 birds km⁻²) estimated for the most preferred habitat (these two last variables obtained from Appendix B of Seoane et al., 2011).
All possible subsets of the predictors using general linear models were estimated (64 models) and were compared with second-order AIC corrected for small sample sizes (AICc; Burnham & Anderson, 2002) to assess their weights of evidence. The strength of evidence of models was obtained using weights (Wi) derived from AICc figures, using all possible models (R package glmulti). Parameter estimates (standardized regression coefficients, b; R² of models) were averaged using model weights (Wi; Arnold, 2010).
RESULTS
Accuracy of species distribution models
As AUC values were obtained by fivefold cross-validation, predictions of bird distributions from both BCT and MaxEnt models can be considered excellent or good according to usual performance criteria (Swets 1988) (n = 21; mean AUCs ± SD: BCT = 0.835 ± 0.118; MaxEnt = 0.792 ± 0.128; Table 1). AUCs for BCT and MaxEnt models were significantly and positively correlated (Pearson's correlation: r = 0.693; P = 0.0005) although BCT figures were slightly higher than those obtained with MaxEnt (paired t-test = 2.073, P = 0.051, Table 1).
Predicting bird local and regional abundances from distribution models
Probabilities of occurrence (from BCT) and habitat suitability values (from MaxEnt) were positively and significantly associated with their corresponding observed abundances for nearly all species using transects as sample units (see Pearson's correlation coefficients in Table 1). The exceptions were Phylloscopus canariensis and Streptopelia turtur for MaxEnt outputs, where relationships with abundance were not significant after sequential Bonferroni corrections. The strength of association between model predictions and observed abundances was considerably higher for BCT than for MaxEnt models (paired t-test = 10.792; P < 0.001; n = 21 species). On the other hand, the triangular relationship assessed with quantile regressions was always present and was not different between BCT and MaxEnt results (see Appendix S1).
At the regional level (i.e. using the whole sample of transects in the island), the observed average densities per species were highly correlated with those predicted by BCT occurrence probabilities (Pi) when converted to regional densities (i.e. −∑ln[1 − Pi] for transects i = 1 to i = 437; r = 0.967; P < 0.001; Table 1; Fig. 2). Coefficients a and b in the equation OBSERVED = a + b·PREDICTED did not significantly differ from zero and one, respectively (a = 3.5, SE = 3.25; b = 1.014, SE = 0.061; P > 0.2 in both t-tests; Fig. 2). Therefore, the observed and predicted regional densities are operatively interchangeable. Moreover, the % difference between predicted and observed regional density was close to zero (mean % difference = 0.076, SE = 7.97). Thus, there was no overall bias towards either overestimation or underestimation of bird abundance at the regional level.
Interspecific variation in prediction accuracy of regional density
Interspecific variation in the accuracy of predicted average density at regional scale (i.e. the average density in the whole sample of transects) was explained to a large extent (73% of variance) by a weighted average model. The variability in bird counts when the species was present, habitat breadth, prevalence in the sample of transects and regional maximum density were the most influential variables (ΣWi ≥ 0.4; Table 2). The variable most affecting the accuracy of predicted regional density was the variability in bird counts measured by the coefficient of variation (CV%; ΣWi = 1, with the largest absolute value of the standardized regression coefficient; Fig. 3a). Habitat breadth also had a similarly high importance, although the magnitude of its effect was lower (b coefficients in Table 2; see Fig. 3b). Summarizing, predicted regional abundance tended to be underestimated in those species which occupy a narrow range of habitats and show a large variability in numbers when present. High prevalence in the sample and high density in the most preferred habitat also tended to underestimate regional estimates.
DISCUSSION
In this study, we examined the extent to which the continuous predictions obtained from species' presence/absence (probabilities of occurrence from boosted classification trees) or presence/background (suitabilities from MaxEnt) models can predict species abundances, either at local (sampling units) or at a regional level (La Palma Island). To allow comparisons between presence/absence and presence/background models, prevalences and fivefold partitions were kept identical in both modelling procedures for each species, while the only difference was the use of observed absences vs. background random data. Although the accuracy of presence/absence models was only slightly higher than that of presence/background models in predicting the occurrence of species, the ability to predict observed local abundances was clearly superior for presence/absence models using BCT. As we designed an experimental protocol to rule out the influence of differences in the prevalence of training data, our
Table 1 Summary of model results for 21 bird species in La Palma Island (Canary Islands, Spain).
Species MaxEnt AUC BCT AUC r MaxEnt r BCT DENREG pred DENREG est % change
MaxEnt AUC, AUC values from MaxEnt models; BCT AUC, AUC values from boosted classification tree models; r MaxEnt, correlation coefficients from Pearson's correlations between MaxEnt outputs and estimated species' local abundances; r BCT, coefficients from Pearson's correlations between BCT outputs and estimated species' local abundances (significant correlations at P < 0.05 after sequential Bonferroni correction are shown in bold type); DENREG pred, average regional density (birds km⁻²) predicted from transformed BCT probabilities in all transects; DENREG est, estimated regional density (birds km⁻²) derived from all transects; % change, % difference between predicted and estimated regional densities in relation to estimated regional density. Predictions were obtained from fivefold cross-validations. Data on species presences/absences were obtained from 437 transects covering all habitats of the island.
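As a minimal sketch of how the DENREG pred and % change columns relate to the occurrence probabilities, the Python snippet below applies the Poisson-based conversion n′i = −ln(1 − Pi) to each transect, sums the predicted counts and expresses the difference from the observed total as a percentage. The function names, the four probabilities and the four counts are hypothetical and chosen only for illustration; they are not data from the study, and the conversion from counts to densities (birds km⁻²) is omitted.

```python
import math


def predicted_count(p_occurrence: float) -> float:
    """Expected count in one sample unit from its occurrence probability.

    Assumes Poisson-distributed counts, so P = 1 - exp(-n) and n = -ln(1 - P).
    """
    if not 0.0 <= p_occurrence < 1.0:
        raise ValueError("probability must lie in [0, 1)")
    # log1p(-p) computes ln(1 - p) accurately for small p
    return -math.log1p(-p_occurrence)


def percent_change(predicted: float, observed: float) -> float:
    """Percentage difference of predicted relative to observed."""
    return 100.0 * (predicted - observed) / observed


# Hypothetical occurrence probabilities and observed counts for four transects
probabilities = [0.10, 0.35, 0.60, 0.85]
observed_counts = [0, 1, 1, 2]

predicted_total = sum(predicted_count(p) for p in probabilities)
observed_total = sum(observed_counts)

print(f"predicted total: {predicted_total:.3f}")
print(f"observed total:  {observed_total}")
print(f"% change:        {percent_change(predicted_total, observed_total):.1f}")
```

Because the slope of the conversion is 1/(1 − Pi), predicted counts become very sensitive to small changes in Pi as it approaches 1.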
Table 2 Alternative models for interspecific variation in large-scale prediction accuracy of bird density in 21 species inhabiting La Palma Island (Canary Islands, Spain). Accuracy is measured as the percentage of variation of predicted average densities with respect to estimated average densities of birds in the whole sample of the 437 0.5-km line transects (see % change in Table 1). Only models with ΔAICc < 2 are shown for brevity. Multimodel inference (lower part of the table) has been obtained considering all the possible combinations of predictors (64 models), averaging the results according to model weights (Wi). Figures for each variable are standardized regression coefficients (b) obtained in general linear models. For each variable, ΣWi is the sum of weights of the models in which the variable appears, weighted average b is the weighted average of standardized regression coefficients and SE b the unconditional standard errors.
Large-scale accuracy
Model 1 0.718 0.280 0.351 75.0 0.166 194.6
Model 2 0.450 0.726 0.463 74.9 0.155 194.7
Model 3 0.607 0.674 0.568 0.224 78.6 0.111 195.3
Model 4 0.765 0.242 68.5 0.081 196.0
Model 5 0.792 62.7 0.066 196.4
Multimodel inference
ΣWi 0.438 1.000 0.185 0.632 0.483 0.268
Weighted average b 0.150 0.737 0.019 0.246 0.165 0.046 72.7
SE b 0.265 0.139 0.054 0.242 0.212 0.086
AICc, AIC corrected for small sample sizes; R², variance explained by each model (in %); CV%, coefficient of variation in bird numbers in transects where each species occurred; HB, habitat breadth considering 11 different habitats; PREV, prevalence of each species in the sample of 437 0.5-km line transects; DETECT, ratio of main belt (25 m) to total belt observations of each bird species (larger figures correspond to less detectable species); MASS, body mass of species (in log); DMAX, maximum density recorded in 11 different habitats. See Appendix S2 for more details on species characteristics. Models 1–5 are highly significant (P < 0.001) using the classical frequentist approach.
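The multimodel quantities reported in Table 2 (model weights Wi and the sum of weights ΣWi per variable) can be illustrated with a short Python sketch. It assumes the standard Akaike-weight formula described by Burnham & Anderson (2002), wi = exp(−Δi/2) / Σj exp(−Δj/2), where Δi is the AICc difference from the best model. The three candidate models and their AICc values are hypothetical and do not reproduce the glmulti analysis of the study.

```python
import math


def akaike_weights(aicc_values):
    """Return Akaike weights for a list of AICc values."""
    best = min(aicc_values)
    # Relative likelihood of each model given its AICc difference from the best
    relative = [math.exp(-0.5 * (value - best)) for value in aicc_values]
    total = sum(relative)
    return [r / total for r in relative]


def summed_weight(model_terms, weights, variable):
    """Sum the weights of all models whose predictor set contains `variable`."""
    return sum(w for terms, w in zip(model_terms, weights) if variable in terms)


# Hypothetical candidate models (predictor sets) and their AICc values
model_terms = [{"CV", "HB"}, {"CV", "PREV"}, {"CV"}]
aicc_values = [100.0, 101.0, 102.0]

weights = akaike_weights(aicc_values)
for terms, w in zip(model_terms, weights):
    print(sorted(terms), round(w, 3))

print("sum of weights for CV:", round(summed_weight(model_terms, weights, "CV"), 3))
print("sum of weights for HB:", round(summed_weight(model_terms, weights, "HB"), 3))
```

A variable that appears in every candidate model, like CV in this toy set, necessarily reaches a summed weight of 1.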
Although several studies have focused on the ability of SODMs to predict local abundances (Nielsen et al., 2005; Seoane et al., 2005; Jiménez-Valverde et al., 2009; Estrada & Arroyo, 2012; Bean et al., 2014; Thuiller et al., 2014; Yañez-Arenas et al., 2014; Russell et al., 2015), showing generally that they only allow for the demarcation of the upper limit of the observed abundances (e.g. VanDerWal et al., 2009; Tôrres et al., 2012), little is known about the usefulness of occurrence data to predict regional abundances (i.e. number of individuals or densities). The advantage of the approach applied here at the regional level is that the same transformation is applied for all species, and hence, it can be used as an alternative to specific parameterizations proposed in other studies for each species separately (e.g. VanDerWal et al., 2009). We propose that this procedure is especially appropriate and cost-effective when the aim is to infer regional abundances of large sets of species under sampling restrictions, as often occurs in biodiversity studies. Thus, our procedure to predict average regional densities can be a powerful tool in cases of biodiversity assessment in poorly known regions or remote areas. Furthermore, we may be interested in examining the potential effect of an ecological perturbation by comparing species abundances in the target area before and after the perturbation occurred, or between the disturbed and other neighbouring areas. In the same vein, this procedure can provide insights in the context of reserve design; comparing predicted regional densities among contiguous areas with different protection status would help to make decisions when reviewing their protection capacity. It is remarkable that, although studies on reserve design selection are often based on species representation (Araújo et al., 2007), analogous procedures based on probabilities of occurrence are scarce (Cabeza et al., 2004), and there is a general lack of approaches dealing with abundances in many organisms (apart from birds, considering their attractiveness for citizen science projects). The high accuracy of the procedure used here to predict regional densities from SODM outputs with true presence/absences suggests its potential value when working with organisms for which census programs dealing with abundances are not the norm or are not feasible.
At the regional scale, we found that the interspecific variation in prediction accuracy of regional abundance can be explained by species-specific traits related to distribution patterns and habitat preferences. This is in line with previous studies showing that autoecological traits may affect model performance in predicting species distributions from observed presences/absences (Hernandez et al., 2006), abundances from observed abundances (Seoane et al., 2005; Carrascal et al., 2006) and abundances from occurrence probabilities (Nielsen et al., 2005; Jiménez-Valverde et al., 2009; Estrada & Arroyo, 2012; Russell et al., 2015). Habitat breadth and the coefficient of variation in bird numbers were the specific traits with the highest relative importance in explaining the interspecific variation in predicting regional densities. Bird species with a greater habitat breadth, such as Falco tinnunculus and S. turtur, tended to be overestimated (Fig. 3b, see Appendix S2). Species inhabiting a greater number of habitat types can be associated with a greater
range of environmental variation, and hence, predictions might be closer to the upper part of their potential. It is also plausible that species with broad niches are at lower numbers than the expected potential simply because other biologically relevant factors not included in the models might also be shaping subtle variations in their abundances. Thus, species with larger habitat breadths may be more sensitive to the exclusion of unknown relevant factors in models, which results in a greater mismatch between observed and predicted abundances. Whatever the processes involved, it appears that environmental tolerance governs both species occurrence distributions and abundances, because it has been shown to affect the accuracy of SODM and abundance models (Seoane et al., 2005; Carrascal et al., 2006; Hernandez et al., 2006).
Figure 3 Partial residual plots illustrating the influence of the coefficient of variation in bird numbers where they occurred (a) and habitat breadth (b) on the accuracy of predicted average regional densities measured as the percentage difference between predicted and estimated regional density with respect to estimated regional density (% change in Table 1). N = 21 bird species from La Palma Island (Canary Islands, Spain). Residual plots show the relationship between a given independent variable and the response given that the other independent variables in Table 2 are also in the model, therefore partialling out their effects. [Axes: y, partial residuals; x in (a), coefficient of variation in bird numbers (%); x in (b), habitat breadth.]
Species with higher coefficients of variation of local abundance when present, such as Pyrrhocorax pyrrhocorax, Carduelis cannabina and Columba livia (e.g. from 1 to 30 individuals as opposed to ranges of 1–3 individuals), tended to be underestimated. The coefficient of variation may be
structure, food availability, substrata for nesting). Estrada & Arroyo (2012) found that differences between two harrier species regarding the degree of association between SODM outputs and abundances could be explained by the degree of gregariousness and by the interspecific variation in the use of social information for site selection. Thus, it is possible that the within- and among-species variation in grouping behaviour affects abundance predictions intra- and interspecifically. Finally, our results show that among the species traits considered, detectability had the lowest relative importance in explaining deviations from the observed regional density. In fact, it has been argued that presence/absence models are less affected by this trait than models built with presence-only data (Pearce & Ferrier, 2001).
To conclude, our results show that when predicting species abundances from occurrence data, presence/absence models outperformed presence/background models. If abundance or density information is essential to advise conservation decisions, such information should not be derived when reliable absences are lacking. The use of presence-only models with background data does not allow good predictions of local abundances. Moreover, the impossibility of estimating the probability of occurrence from these presence-only designs (Hastie & Fithian, 2013) hinders the estimation of abundances by the conversion of probabilities to animal numbers. Our study shows that despite limitations of occurrence binary data (presence/absence) to predict precise local abundances, these local predictions may be combined to predict unbiased average regional abundance. This is because, although accuracies are not similar across species, overestimations and underestimations compensate each other within each species.
It is highly surprising that the procedure revisited here, designed by Gerrard & Chiang (1970) to convert local probabilities of occurrence into numbers of individuals, has rarely been used with vertebrates (but see Tellería & Sáez-Royuela, 1986), considering that the accuracy of the predictions is very high, as has been demonstrated in this study and previously with arthropods (e.g. Gerrard & Chiang, 1970; Badenhausser et al., 2007; Hall et al., 2007). The only concern is to avoid the 'dangerous zone' where the probability of occurrence (Pi) is higher than ca. 0.9. Above this probability, the observed and predicted abundances increase very steeply, so very small changes in Pi generate very large variations in abundance. Therefore, the obvious advice is to define sampling protocols where the size of the sampling unit (i.e. 0.5-km length transects in our study) produces probabilities or frequencies of occurrence below the 'saturation point' of 0.9 (see also Gerrard & Chiang, 1970). Further studies with heterogeneous taxa, scales and situations will likely reinforce the generality of this procedure.
Although obtaining good species' absences in a random sampling protocol is economically costly and time-consuming, the costs associated with measuring species' abundances
are considerably higher and not always feasible. This study highlights the usefulness of surrogate measures of species abundances derived from distribution models built with presence/absence data. This approach can be a useful tool in applied ecology, especially when working in remote areas, under budget restrictions or with limited qualified personnel. As the accuracies of predicted regional densities are similar across species, the approach is highly valuable in studies of biodiversity that deal with a large number of species. Moreover, analyses testing the potential influence of species-specific traits on prediction accuracy should be viewed as a valuable complement to gather further insights on the processes involved in the interaction between the sampling method and focus species.
ACKNOWLEDGEMENTS
This study is a contribution to the projects CGL2011-28177/BOS and CGL2014-56416-P of the Spanish Ministry of Education and Science and Spanish Ministry of Economy and Competitiveness, respectively. P.A. was supported by a ‘Ramón y Cajal’ contract (RYC-2011-07670) from the Spanish Ministry of Economy and Competitiveness. We thank A. Jimenez-Valverde for helpful comments on the subject and C. Jasinski for improving the English of the manuscript.
REFERENCES
Aragon, P. & Sanchez-Fernandez, D. (2013) Can we disentangle predator-prey interactions from species distributions at a macro-scale? A case study with a raptor species. Oikos, 122, 64–72.
Araujo, M.B., Lobo, J.M. & Moreno, J.C. (2007) The effectiveness of Iberian protected areas in conserving terrestrial biodiversity. Conservation Biology, 21, 1423–1432.
Arnold, T.W. (2010) Uninformative parameters and model selection using Akaike’s Information Criterion. Journal of Wildlife Management, 74, 1175–1178.
Badenhausser, I., Amouroux, P. & Bretagnolle, V. (2007) Estimating acridid densities in grassland habitats: a comparison between presence–absence and abundance sampling designs. Environmental Entomology, 36, 1494–1503.
Bean, W.T., Prugh, L.R., Stafford, R., Butterfield, H.S., Westphal, M. & Brashares, J.S. (2014) Species distribution models of an endangered rodent offer conflicting measures of habitat quality at multiple scales. Journal of Applied Ecology, 51, 1116–1125.
Benjamini, Y. & Hochberg, Y. (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society Series B, 57, 289–300.
Bibby, C.J., Burgess, N.D., Hill, D.A. & Mustoe, S.H. (2000) Bird census techniques, 2nd edn. Academic Press, London.
Boone, R.B. & Krohn, W.B. (1999) Modeling the occurrence of bird species: are the errors predictable? Ecological Applications, 9, 835–848.
Burnham, K.P. & Anderson, D.R. (2002) Model selection and multimodel inference: a practical information–theoretic approach, 2nd edn. Springer-Verlag, New York.
Cabeza, M., Araújo, M.B., Wilson, R.J., Thomas, C.D., Cowley, M.J.R. & Moilanen, A. (2004) Combining probabilities of occurrence with spatial reserve design. Journal of Applied Ecology, 41, 252–262.
Carrascal, L.M., Seoane, J., Palomino, D., Alonso, C.L. & Lobo, J.M. (2006) Species-specific features affect the ability of census derived models to map winter avian distribution. Ecological Research, 21, 681–691.
Chapin, F.S. III, Zavaleta, E.S., Eviner, V.T., Naylor, R.L., Vitousek, P.M., Reynolds, H.L., Hooper, D.U., Lavorel, S., Sala, O.E., Hobbie, S.E., Mack, M.C. & Díaz, S. (2000) Consequences of changing biodiversity. Nature, 405, 234–242.
Conlisk, E., Conlisk, J., Enquist, B., Thompson, J. & Harte, J. (2009) Improved abundance prediction from presence–absence data. Global Ecology and Biogeography, 18, 1–10.
De’ath, G. (2007) Boosted trees for ecological modeling and prediction. Ecology, 88, 243–251.
Del Arco, M., Wildpret, W., de Perez Paz, P.L., Rodríguez, O., Acebes, J.R., García, A., Martín, V.E., Reyes, J.A., Salas, M., Díaz, M.A., Bermejo, J.A., Gonzalez, R., Cabrera, M.V. & García, S. (2003) Cartografía 1:25.000 de la vegetación canaria. GRAFCAN S.A., Santa Cruz de Tenerife.
Elith, J., Leathwick, J. & Hastie, T. (2008) A working guide to boosted regression trees. Journal of Animal Ecology, 77, 802–813.
Estes, J.A., Tinker, M.T., Williams, T.M. & Doak, D.F. (1998) Killer whale predation on sea otters linking oceanic and nearshore ecosystems. Science, 282, 473–476.
Estrada, A. & Arroyo, B. (2012) Occurrence vs abundance models: differences between species with varying aggregation patterns. Biological Conservation, 152, 37–45.
Fernandez-Palacios, J.M. & Martín-Esquivel, J.L. (2001) Naturaleza de las Islas Canarias: ecología y conservación. Turquesa, Santa Cruz de Tenerife.
Fielding, A.H. & Bell, J.F. (1997) A review of methods for the assessment of prediction errors in conservation presence/absence models. Environmental Conservation, 24, 38–49.
Gerrard, D.J. & Chiang, H.C. (1970) Density estimation of corn rootworm egg populations based upon frequency of occurrence. Ecology, 51, 235–243.
Gerrard, D.J. & Cook, R.D. (1972) Inverse binomial sampling as a basis for estimating negative binomial population densities. Biometrics, 28, 971–980.
Grinstead, C.M. & Snell, J.L. (1997) Central limit theorem. Introduction to probability, 2nd edn (ed. by C. M. Grinstead and J. L. Snell), pp. 325–360. American Mathematical Society, Providence, Rhode Island.
Guisan, A. & Thuiller, W. (2005) Predicting species distribution: offering more than simple habitat models. Ecology Letters, 8, 993–1009.
Hall, D.G., Childers, C.C. & Eger, J.E. (2007) Binomial sampling to estimate rust mite (Acari: Eriophyidae) densities
on orange fruit. Journal of Economic Entomology, 100, 233–240.
Hastie, T. & Fithian, W. (2013) Inference from presence-only data; the ongoing controversy. Ecography, 36, 864–867.
Hernandez, P.A., Graham, C.H., Master, L.L. & Albert, D.L. (2006) The effect of sample size and species characteristics on performance of different species distribution modeling methods. Ecography, 29, 773–785.
Järvinen, O. (1978) Species-specific census efficiency in line transects. Ornis Scandinavica, 9, 164–167.
Järvinen, O. & Väisänen, R.A. (1975) Estimating relative densities of breeding birds by line transect method. Oikos, 26, 316–322.
Jimenez-Valverde, A., Lobo, J.M. & Hortal, J. (2008) Not as good as they seem: the importance of concepts in species distribution modelling. Diversity and Distributions, 14, 885–890.
Jimenez-Valverde, A., Diniz, F., de Azevedo, E.B. & Borges, P.A.V. (2009) Species distribution models do not account for abundance: the case of arthropods on Terceira Island. Annales Zoologici Fennici, 46, 451–464.
Juan, C., Emerson, B.C., Oromí, P. & Hewitt, G.M. (2000) Colonization and diversification: towards a phylogeographic synthesis for the Canary Islands. Trends in Ecology and Evolution, 15, 104–109.
Kadmon, R., Farber, O. & Danin, A. (2003) A systematic analysis of factors affecting the performance of climatic envelope models. Ecological Applications, 13, 853–867.
Lobo, J.M., Jimenez-Valverde, A. & Real, R. (2008) AUC: a misleading measure of the performance of predictive distribution models. Global Ecology and Biogeography, 17, 145–151.
Manly, B.F.J., McDonald, L.L., Thomas, D.L., McDonald, T.L. & Erickson, W.P. (2002) Resource selection by animals: statistical design and analysis for field studies. Kluwer Academic Publishers, Dordrecht.
Mcfarland, T.M., Van Riper, C., III & Johnson, G.E. (2012) Evaluation of NDVI to assess avian abundance and richness along the upper San Pedro River. Journal of Arid Environments, 77, 45–53.
de Nascimento, L., Willis, K.J., Fernandez-Palacios, J.M., Criado, C. & Whittaker, R.J. (2009) The long-term ecology of the forest of La Laguna, Tenerife (Canary Islands). Journal of Biogeography, 36, 499–514.
Niculescu-Mizil, A. & Caruana, R. (2005) Obtaining Calibrated Probabilities from Boosting. Proc. 21st Conference on Uncertainty in Artificial Intelligence, AUAI Press. http://arxiv.org/ftp/arxiv/papers/1207/1207.1403.pdf (accessed 09 August 2015).
Nielsen, S.E., Johnson, C.J., Heard, D.C. & Boyce, M.S. (2005) Can models of presence–absence be used to scale abundance? Two case studies considering extremes in life history. Ecography, 28, 197–208.
Pearce, J.L. & Boyce, M.S. (2006) Modelling distribution and abundance with presence-only data. Journal of Applied Ecology, 43, 405–412.
Pearce, J. & Ferrier, S. (2000) An evaluation of alternative algorithms for fitting species distribution models using logistic regression. Ecological Modelling, 128, 127–147.
Pearce, J. & Ferrier, S. (2001) The practical value of modelling relative abundance of species for regional conservation planning: a case study. Biological Conservation, 98, 33–43.
Pearce, J., Ferrier, S. & Scotts, D. (2001) An evaluation of the predictive performance of distributional models for flora and fauna in northeast New South Wales. Journal of Environmental Management, 62, 171–184.
Perrins, C. (1998) The complete birds of the Western Palaearctic on CD-ROM. Oxford University Press, Oxford.
Phillips, S.J. & Dudik, M. (2008) Modeling of species distributions with Maxent: new extensions and a comprehensive evaluation. Ecography, 31, 161–175.
Phillips, S.J., Anderson, R.P. & Schapire, R.E. (2006) Maximum entropy modelling of species geographic distributions. Ecological Modelling, 190, 231–259.
Rodríguez, J.P., Brotons, L., Bustamante, J. & Seoane, J. (2007) The application of predictive modelling of species distribution to biodiversity conservation. Diversity and Distributions, 13, 243–251.
Russell, D.J.F., Wanless, S., Collingham, Y.C., Anderson, B.J., Beale, C., Reid, J.B., Huntley, B. & Hamer, K.C. (2015) Beyond climate envelopes: bio-climate modelling accords with observed 25-year changes in seabird populations of the British Isles. Diversity and Distributions, 21, 211–222.
Seoane, J., Carrascal, L.M., Alonso, C.L. & Palomino, D. (2005) Species-specific traits associated to prediction errors in bird habitat suitability modelling. Ecological Modelling, 185, 299–308.
Seoane, J., Carrascal, L.M. & Palomino, D. (2011) Assessing the ecological basis of conservation priority lists for bird species in an island scenario. Journal for Nature Conservation, 19, 103–115.
Swets, J.A. (1988) Measuring the accuracy of diagnostic systems. Science, 240, 1285–1293.
Tellería, J.L. & Saez-Royuela, C. (1986) The use of the frequency in the study of large mammals abundance. Acta Oecologica, 7, 69–75.
Thuiller, W., Münkemüller, T., Schiffers, K.H., Georges, D., Dullinger, S., Eckhart, V.C., Edwards, T.C., Gravel, J.D., Kunstler, G., Merow, C., Moore, K., Piedallu, C., Vissault, S., Zimmermann, N.E., Zurell, D. & Schurr, F.M. (2014) Does probability of occurrence relate to population dynamics? Ecography, 37, 1155–1166.
Tôrres, N.M., De Marco, P., Santos, T., Silveira, L., de Almeida Jacomo, A.T. & Diniz-Filho, J.A.F. (2012) Can species distribution modelling provide estimates of population densities? A case study with jaguars in the Neotropics. Diversity and Distributions, 18, 615–627.
Van Couwenberghe, R., Collet, C., Pierrat, J.C., Verheyen, K. & Gegout, J.C. (2013) Can species distribution models be used to describe plant abundance patterns? Ecography, 36, 665–674.
VanDerWal, J., Shoo, L.P., Johnson, C.N. & Williams, S.E. (2009) Abundance and the environmental niche: environmental suitability estimated from niche models predicts the upper limit of local abundance. The American Naturalist, 174, 282–291.
Yamamoto, N., Yokoyama, J. & Kawata, M. (2007) Relative resource abundance explains butterfly biodiversity in island communities. Proceedings of the National Academy of Sciences USA, 104, 10524–10529.
Yañez-Arenas, C., Guevara, S., Martínez-Meyer, E., Mandujano, S. & Lobo, J.M. (2014) Predicting species’ abundances from occurrence data: effects of sample size and bias. Ecological Modelling, 294, 36–41.
SUPPORTING INFORMATION
Additional Supporting Information may be found in the online version of this article:
Appendix S1 Degree of triangularity in the relationships between local observed and predicted abundances.
Appendix S2 Species-specific characteristics describing the distribution-abundance patterns of bird species.
BIOSKETCH
Luis M. Carrascal is a research professor at the Museo Nacional de Ciencias Naturales (CSIC, Spain). His current research interests are focused on macroecology, the biogeographical ecology of the avifauna of the south-western Palaearctic and on the study of habitat selection in birds for modelling patterns of species abundance/occurrence.
Author contributions: L.M.C., P.A., D.P. and J.M.L. conceived the ideas; L.M.C. and D.P. collected the field data; J.M.L. processed GIS data; L.M.C. and P.A. analysed the data; and L.M.C. and P.A. led the writing.
Editor: Mark Robertson