Overview of SDM methods : Biosecurity Commons Support Portal

The Biosecurity Commons currently provides 19 different algorithms across 4 different categories to run your species distribution model:

Profile models

These models only use occurrence data, and are based on the characterisation of the environmental conditions of locations associated with species presence. Profile models produce results that are most frequently used in biosecurity risk mapping to represent abiotic suitability. Range Bagging is the most robust profile model, while a Surface Range Envelope with the Quantile model argument set to zero is the most conservative.

Bioclim
Climatch - lite
Range Bagging
Surface Range Envelope	Defines a multi-dimensional environmental space bounded by the minimum and maximum values of environmental variables for all occurrences as the potential range where a species can occur.

For example, consider a Surface Range Envelope with the Quantile model argument set to zero, built on the maximum temperature recorded in areas where a pest has been observed. It treats every location within that range of maximum temperatures as equally suitable for the pest. If maximum temperatures at the observed locations range between 20 and 40 degrees Celsius, all locations with temperatures between those values will be mapped as suitable, with a score of one, while locations with temperatures below 20 degrees or above 40 degrees will be given a suitability score of zero. Now suppose we had 100 locations where the pest was recorded, but only one of them had a maximum temperature as high as 40 degrees, with the rest no higher than 35 degrees. Increasing the Quantile argument to 2.5 (a 95% confidence interval) would then map areas that reach temperatures over 35 degrees as unsuitable. When using multiple temperature- and rainfall-related variables (e.g. the 19 bioclim variables), a multivariate cloud is created around the values of all variables at the user-specified confidence interval.

Other profile methods present suitability as a range between zero and one: the most common values (or best match) of, for example, maximum temperature recorded at presence locations are mapped as most suitable (close to one), and less frequently recorded temperatures are mapped as less suitable (closer to zero). Which model to choose will depend on your understanding of a pest's physiological tolerances.
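The Surface Range Envelope temperature example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code that Biosecurity Commons runs: the function names, the linear-interpolation quantile, the made-up temperature records, and the choice to express a Quantile setting of 2.5 as a tail fraction of 0.025 are all hypothetical.

```python
def quantile(values, q):
    """Linear-interpolation quantile of a list, with q between 0 and 1."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


def fit_envelope(presence_rows, tail=0.0):
    """Return one (low, high) pair per environmental variable.

    presence_rows: one tuple of variable values per occurrence record.
    tail: fraction trimmed from each end (0.0 keeps the full min-max range).
    """
    columns = list(zip(*presence_rows))
    return [(quantile(col, tail), quantile(col, 1.0 - tail)) for col in columns]


def envelope_score(site, envelope):
    """1 if every variable at the site lies inside its range, otherwise 0."""
    inside = all(low <= value <= high for value, (low, high) in zip(site, envelope))
    return 1 if inside else 0


# Hypothetical data: 100 presence records of maximum temperature (degrees C).
# 99 records span 20 to 35 degrees and a single record reaches 40 degrees.
max_temps = [(20 + 15 * i / 98,) for i in range(99)] + [(40.0,)]

full = fit_envelope(max_temps, tail=0.0)       # full min-max envelope
trimmed = fit_envelope(max_temps, tail=0.025)  # central 95% of records

print(envelope_score((38.0,), full))     # 38 degrees against the 20-40 range
print(envelope_score((38.0,), trimmed))  # 38 degrees against the trimmed range
```

With the full range, a 38 degree site falls inside the envelope; once the extreme 2.5% of records at each end is trimmed, the upper limit drops below 35 degrees and the same site falls outside it. Because `fit_envelope` builds one range per column, the same functions cover several variables at once.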

Statistical regression models

These models produce estimates of the effect of different environmental variables on the distribution of a species. These models use all the data available to estimate the parameters of the environmental variables, and construct a function that best describes the effect of these predictors on species occurrence. The suitability of a particular model is often defined by specific model assumptions.

Flexible Discriminant Analysis	A classification model based on a mixture of linear regression models, which uses optimal scoring to transform the response variable so that the data are in a better form for linear separation, and multivariate adaptive regression splines to generate the discriminant surface.
Generalised Linear Model	A regression model for data with non-normal distribution, fitted with maximum likelihood estimation.
Generalised Additive Model	A multiple regression model that uses smoothed functions of the environmental variables to model non-linear relationships between the response and the predictors.
Multivariate Adaptive Regression Splines	A regression model that builds multiple linear regression models across the range of predictor values by partitioning the data and running a linear regression model on each partition. This allows complex relationships between the response and predictor variables to be modelled.
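To make "fitted with maximum likelihood estimation" concrete, here is a minimal sketch of a Generalised Linear Model for presence/absence data with a logit link and a single predictor. It is an assumption-laden illustration, not the platform's implementation: the data are invented, the predictor is assumed to be standardised, and plain gradient ascent on the log-likelihood stands in for the fitting routines that statistical packages normally use.

```python
import math


def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, learning_rate=0.5, steps=2000):
    """Estimate intercept and slope by gradient ascent on the log-likelihood."""
    intercept, slope = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_intercept = 0.0
        grad_slope = 0.0
        for x, y in zip(xs, ys):
            residual = y - sigmoid(intercept + slope * x)
            grad_intercept += residual
            grad_slope += residual * x
        intercept += learning_rate * grad_intercept / n
        slope += learning_rate * grad_slope / n
    return intercept, slope


# Hypothetical standardised temperature values with presence (1) / absence (0).
temperature = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
observed = [0, 0, 0, 1, 0, 1, 1, 1, 1]

intercept, slope = fit_logistic(temperature, observed)
cool_site = sigmoid(intercept + slope * -2.0)  # predicted probability, cool site
warm_site = sigmoid(intercept + slope * 2.0)   # predicted probability, warm site
print(round(cool_site, 3), round(warm_site, 3))
```

The fitted slope is the estimated effect of the predictor on the log-odds of occurrence, which is the kind of parameter estimate these regression models report.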

Machine learning models

These models typically use one part of the dataset to ‘learn’ and describe the dataset (training) and the other part to assess the accuracy of the model.

Artificial Neural Network	A 'black box' model that predicts species occurrence probabilities as a weighted combination of features, which are calculated in a hidden layer from linear combinations of the predictor variables.
Boosted Regression Tree / General Boosting Model	Predicts species occurrence probabilities based on a combination of decision trees and boosting. It uses a stagewise procedure to iteratively fit random subsets of the data that are weighted in such a way that new trees take into account the error of previously built trees.
Classification Tree	Predicts species occurrence by repeatedly splitting the dataset into mutually exclusive groups based on a threshold value of one of the environmental variables.
Maxent	Predicts species occurrences by finding the distribution that is most spread out, or closest to uniform, while taking into account the limits of the environmental variables of known locations.
Random Forest	Grows many decision trees based on random subsets of the data and averages the predictions of these trees to estimate the importance of each environmental variable.
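The idea of learning on one part of the data and assessing accuracy on the other can be illustrated with the simplest possible classification tree: a single split on one environmental variable. The sketch below is hypothetical throughout. The rainfall values are invented, the split is assumed to predict presence above the threshold, and an interleaved train/test split is used only so the example is reproducible.

```python
def best_threshold(values, labels):
    """Find the split 'value > threshold' with the fewest training errors.

    Needs at least two distinct values; candidate thresholds are the
    midpoints between consecutive distinct values.
    """
    ordered = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]
    best_t, best_errors = None, None
    for t in candidates:
        errors = sum((v > t) != (y == 1) for v, y in zip(values, labels))
        if best_errors is None or errors < best_errors:
            best_t, best_errors = t, errors
    return best_t


def accuracy(values, labels, threshold):
    """Fraction of records whose predicted class matches the observed class."""
    correct = sum((v > threshold) == (y == 1) for v, y in zip(values, labels))
    return correct / len(values)


# Hypothetical data: annual rainfall (mm) with presence above 1000 mm.
rainfall = [100 * i for i in range(1, 21)]
presence = [1 if r > 1000 else 0 for r in rainfall]

# Hold out every fourth record for testing and train on the rest.
test_idx = [i for i in range(len(rainfall)) if i % 4 == 0]
train_idx = [i for i in range(len(rainfall)) if i % 4 != 0]

train_x = [rainfall[i] for i in train_idx]
train_y = [presence[i] for i in train_idx]
test_x = [rainfall[i] for i in test_idx]
test_y = [presence[i] for i in test_idx]

threshold = best_threshold(train_x, train_y)
test_accuracy = accuracy(test_x, test_y, threshold)
print(threshold, test_accuracy)
```

A full classification tree repeats this search within each resulting group, and methods such as Random Forest or Boosted Regression Trees combine many such trees.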

Geographical models

These models only use the geographic location of known occurrences of a species to predict the likelihood of presence in other locations, and do not rely on the values of environmental variables.

Circles	Predicts that a species is present at sites within a certain radius around observed occurrences, and absent beyond that radius.
Convex Hull	Predicts that a species is present at sites inside the minimum spatial convex hull around observed occurrences, and absent outside that hull.
Geographical Distance	Predicts species occurrences based on the assumption that the closer to a known presence, the more likely it is to find the species.
Inverse-Distance Weighted Model	Predicts species occurrence probabilities for unknown locations as the average of values at the nearby known locations weighted by their inverse distance from the unknown location.
Voronoi Hull	Predicts that a species is present inside Voronoi hulls around observed occurrences, which consist of all points whose distance to the known location is less than or equal to its distance to any other known location, and absent outside those hulls.
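Two of the geographical models above, Circles and the Inverse-Distance Weighted Model, are simple enough to sketch directly. The code below is an illustration only: it assumes projected (planar) coordinates so that straight-line distance is meaningful, and the function names, the `power` parameter, and the sample points are hypothetical rather than taken from the platform.

```python
import math


def distance(a, b):
    """Straight-line distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def circles_predict(site, presences, radius):
    """1 if the site is within the radius of any known presence, otherwise 0."""
    return 1 if any(distance(site, p) <= radius for p in presences) else 0


def idw_predict(site, known_points, known_values, power=1.0):
    """Average of known values weighted by inverse distance to the site.

    power=1.0 weights by plain inverse distance; larger powers give
    nearby points more influence.
    """
    numerator = 0.0
    denominator = 0.0
    for point, value in zip(known_points, known_values):
        d = distance(site, point)
        if d == 0:
            return float(value)  # the site coincides with a known location
        weight = 1.0 / d ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator


# Hypothetical known locations: a presence at (0, 0) and an absence at (10, 0).
points = [(0.0, 0.0), (10.0, 0.0)]
values = [1, 0]

near_presence = idw_predict((2.0, 0.0), points, values)
inside = circles_predict((3.0, 4.0), [(0.0, 0.0)], radius=5.0)
outside = circles_predict((3.0, 4.0), [(0.0, 0.0)], radius=4.9)
print(near_presence, inside, outside)
```

A site 2 units from the presence and 8 units from the absence gets weights of 1/2 and 1/8, so its predicted value is 0.8, reflecting the closer presence record.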
