## What is dispersal modelling?

In a biosecurity context, dispersal modelling involves developing computerised simulations of the movement or spread of invasive species. You tell the model where your population starts and, for each time step, how much the population spreads and/or grows. A simulation might run for 100 or 1,000 time steps (you decide how many time steps there are and how long each one is: 1 day, 1 year etc.).

An organism that is a biosecurity risk may spread through Australia using a variety of vectors which include things like wind, water, animals, gravity, or humans.

A dispersal model can be useful to predict the likely spread of an invasive species for several different purposes. First, it can be used to predict the most likely locations to which an already established invasive species will spread from known locations. Second, if using a precautionary approach to optimise surveillance for a pest that has not arrived, dispersal modelling can build on your risk map to highlight where pests are most likely to spread if they arrive. Dispersal modelling for either purpose can be useful to understand the full impact a biosecurity threat might have in the future, and how long it might take to spread to the point of having significant economic and environmental impacts.

Dispersal modelling to predict where pests might spread can be done to prioritise surveillance or to make a business case for local eradication efforts. It can also be very useful to help make a better estimate as to where a reported pest incursion might have already spread. In this article, we cover the kinds of dispersal models available, highlight a variety of references to learn more, and review the practical steps that can be taken on Biosecurity Commons to simulate the spread of a pest or disease. We ask readers to remember, as with any model, that high-quality data and understanding will yield a higher-quality dispersal map with greater precision and accuracy.

Dispersal models can be used to create visual outputs and results regarding the potential spread of invasive species, which can then be used in a framework for biosecurity surveillance, monitoring and management.

An excellent overview of biosecurity workflows is provided by Bradhurst *et al.* 2021, and a good review of dispersal and population models is available in Jongejans *et al.* 2008.

## Types of dispersal models

**Diffusion**


Perhaps the simplest model of a pest’s dispersal or spread is a diffusion model, where you start at a point and the species spreads, grows or moves in all directions at the same rate. If starting from a point location where one pest or a group of pests is located, then at our first time period (t_{1}) the species has spread evenly out from that central point, forming a circular edge of invasion. Let's say the species spreads 100 m in one month, so the velocity of spread is 100 m per month: at t_{1} (first month) the species will have spread 100 m in all directions from the starting location, at t_{2} (second month) 200 m, and at t_{3} (after three months) 300 m. This constant velocity of circular spread might be called radial linear expansion.
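The arithmetic of radial linear expansion can be sketched in a few lines of Python. This is only an illustration of the worked example above, not how Biosecurity Commons implements diffusion; the function names `invasion_radius` and `invaded_area` are hypothetical.

```python
import math


def invasion_radius(velocity_m_per_step: float, time_step: int) -> float:
    """Radius of the circular invasion front after `time_step` steps."""
    return velocity_m_per_step * time_step


def invaded_area(radius_m: float) -> float:
    """Area (square metres) enclosed by a circular invasion front."""
    return math.pi * radius_m ** 2


# Worked example from the text: 100 m per month, three monthly time steps.
for t in (1, 2, 3):
    radius = invasion_radius(100.0, t)
    print(f"t_{t}: radius = {radius:.0f} m, area = {invaded_area(radius):.0f} m^2")
```

Note that while the radius grows linearly with time, the invaded area grows with the square of time, which is one reason early detection matters.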

*Figure 1. A = a simple diffusion model where an organism spreads at the same rate from a point and equally in all directions. B = the uneven spread of muskrat in France from Andow et al. 1990.*

Invasion fronts are rarely circular, and perfect circles don’t tend to exist in nature. There are all kinds of reasons a species might spread faster or slower depending on local conditions in the areas it is spreading into. For example, changing elevation, variation in temperature and spatial variation in population growth rate are just a few of the reasons spread may speed up or slow down in different areas. Further, there are often barriers to the spread of an organism that may be impossible to get around or that may significantly slow the spread. The population might also reproduce more rapidly in some locations than others, which might then cause different rates of spread into adjacent areas (see population models below). Trying to anticipate the different rates of spread and the resulting shape of the invasion front is the challenge.

**Gravity models**

A gravity model in its simplest form requires two or more discrete locations (nodes), and the model measures the flow of pests between these nodes over time. The rate of flow is generally captured as the increase in the population of pests over time (from t_{1} to t_{2} to t_{3} etc.) at each node based on the movement between nodes. The rate of population increase at any node is usually captured in one of three ways. First, most gravity models assume that the flow of individuals between nodes decreases as the distance between them increases. Second, another possible assumption is that nodes with large populations will export more individuals to surrounding nodes. Third, one might assume flow is regulated by both population size and distance between nodes. This is done by taking the product of (multiplying) the populations at two nodes and then dividing that product by the distance between those two nodes. Nodes might be patches of habitat, farms, or national park trailheads. Additional complexity can be added to these basic models.
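The third option (population product divided by distance) can be sketched as follows. The node names, coordinates and populations are made up for illustration, and the result is best read as a relative flow weight rather than a count of individuals; this is not the Biosecurity Commons implementation.

```python
import math
from itertools import combinations

# Hypothetical nodes: coordinates in km and current pest population.
nodes = {
    "farm_a": {"xy": (0.0, 0.0), "pop": 200},
    "farm_b": {"xy": (3.0, 4.0), "pop": 50},
    "trailhead": {"xy": (6.0, 8.0), "pop": 10},
}


def gravity_flow(pop_i: float, pop_j: float, distance_km: float) -> float:
    """Relative flow between two nodes: product of populations over distance."""
    return pop_i * pop_j / distance_km


flows = {}
for i, j in combinations(nodes, 2):
    d = math.dist(nodes[i]["xy"], nodes[j]["xy"])  # straight-line distance
    flows[(i, j)] = gravity_flow(nodes[i]["pop"], nodes[j]["pop"], d)

for pair, flow in flows.items():
    print(pair, flow)
```

In this toy network the two large, nearby farms exchange far more pests than either does with the small, distant trailhead, which is the behaviour the gravity analogy is meant to capture.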

*Figure 3. A simple depiction of a gravity model where the volume of flow of individual pests between locations (circles) is often based on the size of the population at each location and/or the distance between any two locations.*


**Dispersal kernels**

A dispersal kernel is used to determine how far each individual (usually) might disperse from a specific location during the time period that you specify in your model. For example, when your model starts (is initialised) you may have 10 individuals in one location, and you may specify that 50% of those individuals are going to disperse in the next time period. Let us imagine the time period is one year long, and our invasive species is the rabbit. How far will 5 rabbits disperse by the end of the first time period (t_{1})? That distance is going to vary depending on the rabbit, but we read that “rabbits usually do not travel vast distances, but movements in excess of 20 km have occasionally been recorded”. We want our model to randomly select how far each of our five rabbits will travel, but with the understanding that many of the rabbits will not disperse too far, and only rarely will the odd rabbit go as far as 20 km. To represent that probability, you will need to choose a probability density function that matches the expected distances that, say, 100 rabbits might travel. One out of 100 might go 20 km, but most will only go, say, 1 to 3 kilometres. There is a pretty good chance that if we ran our simulation model once, our five rabbits would be predicted to disperse between 1 and 3 kilometres in the first time period. However, if we ran our simulation (rolled the dice) 20 times (iterations), that would be 100 rabbit dispersals in total, so we would expect about one rabbit to be predicted to have travelled ~20 km.

Selecting a dispersal kernel is usually done based on existing literature, expert advice or by plotting available data. The different probability density functions you can select for your dispersal kernel specify different expected shapes (Figure 4) of the histogram of how far an individual rabbit (in our example) might travel while dispersing over the specified time period. These shapes indicate how many rabbits are expected to stay close to where they started, and in most distributions most rabbits stay close. However, some important questions change the shape of our probability density function. First, do only 2 of 100 rabbits disperse 10 to 20 kilometres away, or do 8 of 100 (a thicker-tailed distribution)? The Cauchy and Weibull distributions, for example, have thicker tails and would likely be a better option if 8 or more rabbits travel a relatively long distance. Second, in the centre of the distribution, do nearly all rabbits stay very close (Cauchy), or do quite a few rabbits go a little further away (Gaussian)? The most common way to estimate the shape of this distribution is to collect a lot of data on how far rabbits disperse and plot those data, but there are occasionally well-established expected distributions for some data, so it is worth reviewing the literature. Check out this useful guide on how to know which probability density function to choose. You can quickly generate online interactive views of how the plot of selected PDFs might change as the parameters that define the shape of the functions are changed here or here.

*Figure 4. Example of three probability density functions available to define a dispersal kernel on Biosecurity Commons: Cauchy, Exponential and Gaussian. All three indicate that each time your simulation model runs most individuals disperse only a short distance at that time, but occasionally an individual might disperse a relatively long distance away from the original location. A Cauchy distribution indicates that a greater proportion of individuals are likely to disperse only a short distance, but the tails of the distribution are thicker (often called heavy-tailed distributions). In other words, while relatively few individuals disperse a long distance, relatively more individuals disperse a long way in a Cauchy distribution than in a Gaussian (normal) distribution.*
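To get a feel for what a heavier tail means in practice, the sketch below draws many dispersal distances from three kernels and counts how often an individual travels 10 km or more. The scale of 2 km, the 10 km threshold and the function names are assumptions chosen for illustration; no maximum distance is applied here, and the parameterisation is not necessarily the one used on Biosecurity Commons.

```python
import math
import random


def sample_distance_km(kernel: str, scale_km: float, rng: random.Random) -> float:
    """Draw one non-negative dispersal distance from the named kernel."""
    if kernel == "cauchy":
        # Inverse-CDF draw from a half-Cauchy distribution.
        return scale_km * math.tan(math.pi / 2 * rng.random())
    if kernel == "exponential":
        # Here `scale_km` is the mean dispersal distance.
        return rng.expovariate(1.0 / scale_km)
    if kernel == "gaussian":
        # Half-normal: absolute value of a zero-centred normal draw.
        return abs(rng.gauss(0.0, scale_km))
    raise ValueError(f"unknown kernel: {kernel}")


def tail_fraction(kernel: str, scale_km: float = 2.0, threshold_km: float = 10.0,
                  n: int = 20_000, seed: int = 42) -> float:
    """Share of simulated individuals that disperse at least `threshold_km`."""
    rng = random.Random(seed)
    far = sum(sample_distance_km(kernel, scale_km, rng) >= threshold_km
              for _ in range(n))
    return far / n


fractions = {k: tail_fraction(k) for k in ("cauchy", "exponential", "gaussian")}
for kernel, frac in fractions.items():
    print(f"{kernel:12s} share dispersing >= 10 km: {frac:.4f}")
```

With the same scale, the Cauchy kernel sends a far larger share of individuals beyond 10 km than the exponential kernel, and the Gaussian kernel sends almost none, which is the practical difference between heavy and thin tails.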

**Population Models**

The platform provides three different options for modelling the population of an invasive species. The simplest option is the presence-only model, which ignores population structure and produces a binary output indicating only the presence or absence of an invasive species at a location. The other two options in the platform allow for the simulation of dispersal using a simple unstructured population, or for the more complex simulation of a stage (or age) structured population.

If an organism's population is growing in each cell (grid model) or each node (network model) while it spreads, then it might be worth exploring the addition of a population growth model to the dispersal model, because as a population increases in one area the probability of it spreading to adjacent areas often increases. If we were to assume reproductive rates are constant (rarely true) and that available resources (food, habitat) were constant (also rarely true), then we might expect the growth of a population in a new area to follow a ‘sigmoidal’ (S-shaped) curve.

For example, imagine two invasive birds that breed once a year move into a new area, and let us pretend any female of this species produces only two young each year, always one male and one female, which can breed after their first year. After the first year (t_{1}), there would be four birds with two reproductive females: the female parent and the daughter. However, in year 2 (t_{2}) the young may have dispersed within the location of interest and might not breed because they don’t find a mate. There are other reasons populations grow slowly at first, but slow early growth is common. This is why the bottom of the ‘S’ curve is flatter: in the early time steps (in our case years) the population tends not to grow at its maximum possible rate. When the population is large enough, 4000 birds will produce 4000 young for a total population of 8000 at a future time period i (t_{i}). This exponential rate of population growth may continue until there is not enough food or nesting sites to allow the population to continue to double each year. The point at which a population stops growing is called ‘carrying capacity’ and is denoted by the term K in modelling formulas. The maximum size of a population that an area can support could relate, for example, to both increases in the rate of deaths (mortality) related to a lack of food and reduced reproduction due to a lack of breeding sites.

*Figure 2. Depiction of the ‘sigmoidal’ (S-shaped) model of population growth (from the bioninja) where in a new location a population is slow to grow because there are few reproductive individuals, but the rate of increase in the population size eventually speeds up to exponential growth before resources in an area are insufficient to support a continued increase in individuals (at carrying capacity, K). The wavy line around carrying capacity over time relates to variation in both births and deaths as well as changes in local conditions.*

Nature almost never permits a population to grow at a perfect ‘sigmoidal’ growth rate from year to year because breeding or death rates can vary from year to year based on changing environmental conditions (good wet season vs drought) or disease etc. Still, sigmoidal growth is often an accurate depiction of what happens from year to year on average and might apply best to bacteria or viruses.
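A minimal sketch of the idealised S-shaped curve, using the textbook discrete logistic equation N(t+1) = N(t) + r·N(t)·(1 − N(t)/K). The growth rate r = 1 (doubling each year when the population is small, as in the bird example) and K = 10,000 are assumed values, and the sketch deliberately leaves out year-to-year variation and the mate-finding delay described above.

```python
def logistic_growth(n0: float, r: float, k: float, steps: int) -> list[float]:
    """Population sizes for time steps 0..steps under discrete logistic growth."""
    sizes = [n0]
    for _ in range(steps):
        n = sizes[-1]
        # Growth slows as the population approaches carrying capacity k.
        sizes.append(n + r * n * (1.0 - n / k))
    return sizes


# Assumed values: two founding birds, doubling when rare, K = 10,000.
sizes = logistic_growth(n0=2.0, r=1.0, k=10_000.0, steps=30)
for year in (0, 5, 10, 15, 20, 30):
    print(f"year {year:2d}: {sizes[year]:8.0f} birds")
```

The early years look flat because doubling a handful of birds adds few individuals, the middle years are near-exponential, and growth levels off as the population approaches K.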

An unstructured population model is one in which the rates of survival and fertility are equal among all individuals. In the available Biosecurity Commons workflow, users can specify ‘lambda’, the population growth rate (e.g. 1.2 = 20% growth per time step), and optionally add an estimate of carrying capacity (K) for each location of interest by uploading a raster that matches the study region for a grid model, or a csv file if using a network model. The carrying capacity often varies from location to location based on the availability of resources for the organism of interest.
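The sketch below shows one simple way lambda and a per-location carrying capacity could interact: each location's population is multiplied by lambda and then capped at that location's K. The hard cap and the function name `grow_unstructured` are assumptions for illustration; the platform may apply carrying capacity differently.

```python
def grow_unstructured(populations, lam, capacities=None):
    """Apply one time step of growth to a list of per-location populations.

    `lam` is the multiplicative growth rate (1.2 means 20% growth per step).
    `capacities`, if given, holds one carrying capacity K per location and
    is applied here as a simple ceiling (an assumption, not the platform rule).
    """
    grown = [n * lam for n in populations]
    if capacities is None:
        return grown
    return [min(n, k) for n, k in zip(grown, capacities)]


# Three hypothetical cells: current population and carrying capacity.
populations = [10.0, 50.0, 0.0]
capacities = [100.0, 55.0, 20.0]
next_step = grow_unstructured(populations, lam=1.2, capacities=capacities)
print(next_step)
```

The second cell would grow to 60 individuals but is held at its carrying capacity of 55, and the empty cell stays empty until dispersal brings individuals in.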

Alternatively, users can select a structured population model, which requires users to specify a transition matrix that indicates how many individuals will be added to the population at each time step (fecundity) and the probability of surviving to the next time step. For example, if we have an invasive tree and we are measuring our time steps in years, we might assume that all incursions into adjacent areas are seeds that have blown or been transported into a new cell. A diffusion model might estimate this number. Let us say at t_{1} 100 seeds invade a new cell (adjacent area) and no new seeds are produced that year in this new cell (F_{1} = 0). Then at t_{2} only 20% (0.2) survive to become sprouts and zero new seeds are produced (F_{2} = 0). This continues until finally at year 20+ lots of new seeds are produced (so the transition matrix would have 20 columns and 20 rows by this point and F_{20} might be a pretty big number). It is possible that only a couple of those original 100 seeds survived long enough to produce seeds of their own. As with the unstructured population model, you can also specify the carrying capacity (K) for each location. This tree example is just one simplified example of how you might specify a transition matrix in Biosecurity Commons, where in this case annual fecundity would make up the top row, and the diagonal would consist of the probabilities of going from one time period (state) to the next (in our case each year). Often the stages would be something more sensible like seed, sapling, young tree, breeding tree, and senescent tree.
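A stage-structured projection is a matrix multiplied by a vector of stage counts. The sketch below uses three hypothetical stages (seed, sapling, adult) rather than 20 yearly ages, with made-up fecundity and survival values. It follows the common textbook layout in which fecundity is the top row and the probability of moving to the next stage sits just below the main diagonal; check the platform's own convention before entering a matrix.

```python
def project(matrix, stage_counts):
    """One time step: multiply the transition matrix by the stage vector."""
    return [sum(a * n for a, n in zip(row, stage_counts)) for row in matrix]


# Columns = stage this year, rows = stage next year (seed, sapling, adult).
transition = [
    [0.0, 0.0, 50.0],  # fecundity: seeds produced per adult tree
    [0.2, 0.0, 0.0],   # 20% of seeds survive to become saplings
    [0.0, 0.5, 0.9],   # 50% of saplings mature; 90% of adults survive
]

counts = [100.0, 0.0, 0.0]  # 100 seeds arrive in a new cell
history = [counts]
for _ in range(3):
    counts = project(transition, counts)
    history.append(counts)

for year, (seeds, saplings, adults) in enumerate(history):
    print(f"year {year}: seeds={seeds:.0f} saplings={saplings:.0f} adults={adults:.0f}")
```

As in the tree example, most of the original seeds are lost along the way, and new seeds only appear once a few individuals reach the reproductive stage.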

**Further reading on different model types**

Table 1. List of literature that discusses or exemplifies the different model types available in Biosecurity Commons

## Types of dispersal pathway (Wilson *et al.* 2009)

**Figure 5.** Descriptions and characteristics of different types of dispersal pathway (Wilson *et al.* 2009), though the authors note that in real systems most human-assisted invasions are likely facilitated by multiple pathways.

## Which model for which kind of pathway

It is not uncommon for multiple pathways, vectors and models to be employed in an overall analysis of national invasion. While the best modelling approach will relate to the organism’s movement patterns, the number of locations the organism has established itself, the organism’s ecology, and the available data or knowledge, some pathways are likely to be well suited for a subset of models.

For example, **leading-edge dispersal**, where you are just interested in tracking the spreading presence of the invasive organism, is probably best represented by a gridded diffusion model (see below for the variety of ways that might be done). Complexity could be added to the diffusion model by considering the population in each grid cell using a structured or unstructured model, by adding a direction function, or by adding an attractors layer.

**Corridors** are probably easiest to represent with a “Network Model”, where the presence of a corridor connects separate patches. Estimates of how attractive each patch is, and perhaps even weights capturing the relative ease of moving through different kinds of corridors, could then be added in a “Dispersal Gravity” model, which would also reduce predicted movement between more distant patches. If the aim is to understand how rapidly organisms might spread through corridors, it may be worth considering a grid approach where areas outside the suitable patches or the corridors are masked out (excluded) from the overall grid. Then a dispersal kernel might be best if the spread is uneven, while a diffusion model might be best if the spread is taking place along a continuous front.

**Jumps** are infrequent events where an individual completes a long-distance dispersal, often allowing the individual to cross an unsuitable patch or area. Birds, large animals, or individuals carried by floods or strong currents are examples where jumps are perhaps more likely. If there is a discrete area of concern that an organism might spread to, such as a farm or a national park, a network model might be suitable, but jumps can also speed up the rate of spatial spread in continuous grids. Jumps in either network or grid models can be predicted with simulations using a long-tailed or heavy-tailed dispersal kernel.

**Extreme long-distance dispersal** is usually very rare and covers huge distances: think of a bird with a virus being blown to a new area by a hurricane, or something floating across oceans on rafts of organic or inorganic debris. Modelling these events is hard unless you have a decent sample size of these rare dispersals in order to decipher some patterns. Again, these are very hard to predict, but specifying a “long-tailed distribution” might shed some light on the possibilities, especially if there are other variables aside from a dispersal kernel being used in a model.

Both jumps and extreme long-distance dispersal can be somewhat more common through human vectors. For example, a truck driver might get some seeds on their socks, and then drop a few seeds when they stop on the other side of a nearby lake (jump). However, if they then drive from Victoria to Broome those seeds might also get to Broome, resulting in a dispersal distance that would not occur naturally (extreme long-distance). A network model might be one way to explore these kinds of possibilities, where roads are used as the corridors connecting patches where truck drivers stop and where the seeds could grow. These kinds of dispersal events might be best added to a model of more common dispersal patterns by, for example, creating a separate model with a long-tailed or heavy-tailed dispersal kernel. These rare events are often of interest because long-distance dispersal events can lead to the unexpected colonisation of new locations. Surveillance budgets are not unlimited, but with care and many simulation iterations, likely long-distance dispersal locations might be identified.

**Mass dispersal** can be very challenging to contain and many models might be required to understand potential dispersal of an organism that is arriving at multiple locations. The spread of the European black rat in Australia is a good example of an invasive species that likely came from multiple European locations and arrived at multiple Australian ports before spreading further in Australia through a variety of vectors.

**Cultivation** is the most common way invasive weeds are introduced to new areas in Australia, often as ornamental or practical garden plants or less commonly for agricultural purposes. A network model might be a useful way to predict where new exotic plants might be spread through sales, but a variety of models might be most appropriate for subsequent spread beyond gardens and farms.

## Which model to use for different vectors?

A literature review will likely help determine the kinds of vectors used by the organism of interest as well as the kinds of models that have already been used successfully for similar species and/or vectors. Nonetheless, some models are more commonly used for some vectors. For example, either a wind vector or a water vector in areas of continuous suitable habitat tends to be modelled using a grid model combined with either a diffusion model or a dispersal kernel. If patches of suitable areas or lakes are discrete and spread out but can be reached by wind or by streams connecting lakes, a network model with a gravity model might be more appropriate. When animals or humans are carriers of the pests and those carriers are in high densities, a grid-based model with a diffusion or dispersal model might be appropriate. If these carriers are travelling long distances, then it might be more appropriate to use a network model where the nodes are the locations where the carriers stop in a way that might disperse the pest. Again, remember that the best possible model might not be possible to construct if there is insufficient knowledge or data.

## Step 1 – Review available evidence, knowledge and data

Keep in mind it is easy to get a result in the dashboard tools available on Biosecurity Commons, but if you review the literature, speak with experts throughout the process, and take time to input the best available data, the resulting final dispersal model should help build an understanding of the areas to which an invasive species is most likely to spread. A good model might also simply confirm what experts expect, with perhaps a few new things to consider, but the result can also provide a credible business case for further work. Cleaning and collating data is often one of the most important steps in any model because quality data that is fit for purpose always improves results. However, you can spend more time on quality than is required by your model. For example, if planning to use a grid model with grid cells that are 1km by 1km in size, taking time to ensure every data point is spatially accurate to within 10m is unlikely to help your model much, although it might be useful for other models. If you are not sure, ask an expert about where extra effort with input data will have the most impact on your results.

## Step 2 – Select ‘Model type’ (grid, network, or spatially implicit)

**Grid Model**

The default grid for the ‘**Study Region**’ is one that covers the whole of Australia in 1km by 1km grid cells projected in GDA94 / Australian Albers (for more information on projections see here & here). You can upload any grid you want as long as it is in GeoTiff format and has embedded information on the coordinate reference system (CRS), extent, and resolution. You can also load any of the grids available on Biosecurity Commons, including those that have been shared with you or that you uploaded previously. Make this decision carefully, as the total size of the study area and the size of cells (resolution) or the number of patches can have important impacts on the results and the computation time.

The ‘**Aggregate Factor**’ and **‘Inner Radius’** are available for when the number of grid cells in your study region is large and/or dispersal distances are long. If your study region does include many grid cells, the aggregate factor and inner radius save computational time and memory. This two-step process allows the user to set a circle of cells around each occupied cell that maintains the original resolution of the study region (default 1km²), while each cell beyond a specified radial distance (the inner radius) is made larger. How much larger depends on the aggregate factor setting. The default value is 5, so the default resolution would go from 1km² cells to 5 by 5 km cells, or 25km². If a dispersal event at any time step extends past the inner radius, then a 1km² cell is selected at random from within the 25km² cell that was selected as a dispersal location at that time step. Setting the distance of the inner radius to what you consider the boundary between local dispersal and long-distance dispersal is a reasonable approach, as long-distance dispersal events are likely relatively rare anyway.
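The two-step idea can be sketched as below. This is one interpretation of the description above and not the platform's actual algorithm: cells are (row, column) indices on the fine grid, the coarse grid is assumed to be aligned with the fine grid's origin, and the function name `resolve_destination` is hypothetical.

```python
import math
import random


def resolve_destination(source, destination, inner_radius_cells,
                        aggregate_factor, rng):
    """Return the fine-resolution cell that receives a dispersal event.

    Within the inner radius the destination is used as-is. Beyond it, the
    destination is treated as identifying an aggregated (coarse) cell, and
    one fine cell inside that coarse cell is picked at random.
    """
    if math.dist(source, destination) <= inner_radius_cells:
        return destination
    f = aggregate_factor
    coarse_row, coarse_col = destination[0] // f, destination[1] // f
    return (coarse_row * f + rng.randrange(f),
            coarse_col * f + rng.randrange(f))


rng = random.Random(0)
# With 1 km cells, a 50,000 m inner radius is 50 cells.
near = resolve_destination((0, 0), (3, 4), 50, 5, rng)
far = resolve_destination((0, 0), (102, 207), 50, 5, rng)
print(near, far)
```

With an aggregate factor of 5, each coarse cell stands in for 25 fine cells, so far fewer candidate destinations need to be evaluated beyond the inner radius, at the cost of some positional precision for long-distance events.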

If using a diffusion model, we highly recommend you keep your ‘inner radius’ longer than the maximum dispersal distance at each time step.

In some grid models, these two settings could reduce the precision of estimated dispersal locations, but if they are not set to extreme values, any loss of resolution can be offset by running a relatively large number of iterations. The default ‘Inner Radius’ is 50,000m, which when combined with the default ‘Aggregate Factor’ of 5 might take a long time to run for all of Australia at the default 1km² resolution. If you are unsure about the trade-off between the speed of your model and its precision, but you think it might be important, either check the literature or run the model multiple times, changing only these two values, to see how sensitive the results and computation time are to those changes (a sensitivity analysis).

**Network Model**

Here the only required input is a list of the coordinates (latitude and longitude, in WGS 84 decimal degrees) of each patch or node in the network (e.g. farm, national park, trail head, port etc.). These coordinates can be uploaded as a csv file with the column headings ‘lon’ and ‘lat’, with no missing rows of data. Once uploaded, you can see the locations you uploaded on a map of Australia.
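Before uploading, it can help to check the csv locally. The sketch below is a stand-alone check using only the Python standard library; the function name `load_nodes` and the example coordinates are illustrative, and the platform may apply additional rules of its own.

```python
import csv
import io


def load_nodes(csv_text: str) -> list[tuple[float, float]]:
    """Parse 'lon'/'lat' columns and reject missing or impossible values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames is None or not {"lon", "lat"} <= set(reader.fieldnames):
        raise ValueError("csv must have 'lon' and 'lat' column headings")
    nodes = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        try:
            lon, lat = float(row["lon"]), float(row["lat"])
        except (TypeError, ValueError):
            raise ValueError(f"line {line_no}: missing or non-numeric coordinate")
        if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
            raise ValueError(f"line {line_no}: coordinate out of range")
        nodes.append((lon, lat))
    return nodes


# Two illustrative nodes in WGS 84 decimal degrees (longitude, latitude).
example = "lon,lat\n144.96,-37.81\n122.24,-17.96\n"
print(load_nodes(example))
```

A common mistake this catches is swapping latitude and longitude: a longitude such as 144.96 placed in the ‘lat’ column falls outside the valid latitude range and is rejected.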

**Spatially Implicit model**

There are no additional steps here. These kinds of models are great for exploring scenarios or theory, but do not require spatial data or produce mapped outputs.

## Step 3 – Initialize – select starting values for your model

Which cells are initially populated with the pest?

For a grid model this requires a raster (a GeoTiff file if uploading your own) with values in the grid indicating where the organism is already present, or with population estimates for each cell where it is present at the start of the simulation. You can set the initializer type to either “initial_layer”, which uses this raster, or “random”, which selects one location at random as the starting point.

For a network model there are again the options to specify your input as an “initial_layer” or as “random”, and in this case you would upload a csv.

No options are available at this step for a spatially implicit model.

## Step 4 – Population growth options

How does the within-cell, within patch or node, or total abundance of the pest change over time?

**Grid model**

- **‘Presence_only’** allows a user to upload a grid with values between zero and one that represent the probability that a pest would find suitable conditions at each location (the ‘**Establishment Probability Raster**’). This could be the “Pest Suitability” layer generated as part of a risk map for the species, or it could be an independent species distribution model. The user can also select the ‘**Spread Delay**’ (the number of time steps of delay).
- **‘unstructured’** also uses an ‘**Establishment Probability Raster**’ (see above). Here a user can also add a value for *Lambda* (the population growth rate), as well as a raster that provides an estimate of the carrying capacity for each cell in the grid.
- **‘Stage_structured’** also uses an ‘**Establishment Probability Raster**’ (see above). The user can then fill in a **stage-based transition matrix**. ‘**Capacity Stages**’ asks which stage(s) have capacity-limited growth; for example, there is no capacity limit to the number of seeds that may accumulate within, or move into, a cell, but there would be a capacity limit for the number of full-grown trees that could fit in a cell. ‘**Incursion Stages**’ asks which stage(s) allow for incursions from individuals of that age/stage; in plants, for example, the seed stage would get a tick, but other stages would only get a tick if seedlings, saplings, trees etc. were being transplanted to new locations.

**Network model**

- **‘Presence_only’** again allows a user to upload a csv with values between zero and one (the ‘**Establishment Probability Points**’, which are like an Establishment Probability Raster but with location data; see above). The user can also select the ‘**Spread Delay**’ (the number of time steps of delay).
- **‘unstructured’** also uses ‘**Establishment Probability Points**’ (see above). Here a user can also add a value for *Lambda* (the population growth rate), as well as a csv (“Capacity Points”) that provides an estimate of the carrying capacity for each node or patch in the network.
- **‘Stage_structured’** also uses ‘**Establishment Probability Points**’ (see above). The user can then fill in a **stage-based transition matrix**. ‘**Capacity Stages**’ asks which stage(s) have capacity-limited growth (see above). ‘**Incursion Stages**’ asks which stage(s) allow for incursions from individuals of that age/stage (see above).

**Spatially implicit model**

- The **‘Presence_only’** model type allows the user to select the number of time steps of delay.
- The **‘unstructured’** population option here only allows the addition of *Lambda* (the population growth rate).
- **‘Stage_structured’** offers the first step of adding a stage-based transition matrix, where the number of columns indicates the number of stages, the fecundity for each stage makes up the top row, and the probability of transitioning to the next stage makes up the diagonal. ‘**Capacity Stages**’ asks which stage(s) have capacity-limited growth (see above). ‘**Incursion Stages**’ asks which stage(s) allow for incursions from individuals of that age/stage (see above).

## Step 5 – Dispersal models

When and how might the pest population spread between cells, nodes or patches, or overall?

## Kernel dispersal

Biosecurity Commons allows users to select different distributions to capture how far an organism might disperse at any time step.

**Grid Model**

**Proportion** requires the user to specify how much spread there is from one cell to adjacent cell(s).

- Most important to set this for **unstructured or staged population** models. This number sets the proportion of a cell’s population that disperses at each time step.
- If using a **presence-only population model**, this number scales the number of times an organism will disperse to adjacent cells at any time step. In other words, this is the fraction of within-range cells that an organism will disperse to at each time step.
- If using an unstructured or stage-structured population model, this is simply the proportion of the population that disperses at each time step.

**Events** requires the user to select the mean number of events; the actual number used is drawn from a Poisson distribution with the mean you specify.

- This is the mean number of dispersal events (dispersal destination locations) from each occupied location at each time step. For example, if you select a mean of 3 events, then for each cell you will usually get 2 to 4 dispersal events that go to an adjacent cell, but occasionally you might get zero or 6 leaving from a cell, and rarely you might get 10 or more (as expected under the Poisson distribution).
- If this value is set as NULL and you are using an **unstructured or staged population** model, a destination is selected for each individual within the source cell’s population.
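To see how much the number of events varies around a mean of 3, the sketch below draws event counts with a small hand-written Poisson sampler (Knuth's multiplication method, used here only because the Python standard library has no Poisson generator). It illustrates the distribution itself, not the platform's code.

```python
import math
import random
from collections import Counter


def sample_poisson(mean: float, rng: random.Random) -> int:
    """Draw one Poisson-distributed count using Knuth's method."""
    limit = math.exp(-mean)
    count, product = 0, 1.0
    while True:
        product *= rng.random()
        if product <= limit:
            return count
        count += 1


rng = random.Random(2024)
draws = [sample_poisson(3.0, rng) for _ in range(10_000)]
tally = Counter(draws)

share_2_to_4 = sum(tally[k] for k in (2, 3, 4)) / len(draws)
share_zero = tally[0] / len(draws)
share_10_plus = sum(v for k, v in tally.items() if k >= 10) / len(draws)
print(f"2 to 4 events: {share_2_to_4:.3f}")
print(f"zero events:   {share_zero:.3f}")
print(f"10 or more:    {share_10_plus:.4f}")
```

Most draws fall between 2 and 4 events, a few per cent give no events at all, and 10 or more events occur only about once in a thousand draws.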

**Distance Function** Here you select one of the available kernel probability density functions (PDFs) and define the shape of the function that determines how the likelihood of dispersal declines with distance. Note: there is more than one way to parameterise the shape of a PDF; the parameters used here are likely easier to estimate than some of the alternatives. Online interactive plotting tools can quickly show how the plot of a selected PDF changes as its parameters change.

- Beta - If a **beta-probability-distribution** is selected, the user defines the shape of the beta-PDF using the parameters alpha and beta. The lower bound is always assumed to be zero in a distance kernel, but the upper bound can change, so you will need to enter alpha, beta and the upper bound, in this case the maximum dispersal distance (“Max Distance”).
- Cauchy - The shape of the Cauchy distribution on Biosecurity Commons is specified by entering the “Max Distance” you would expect an organism to disperse at any time step and a “Scale” parameter, which is half the width of the probability distribution at the point where it reaches half its maximum height.
- Exp - The exponential probability distribution is defined in Biosecurity Commons by the “Max Distance” you would expect an organism to disperse at any time step and the “Mean” distance you would expect any dispersal event to travel.
- Gaussian - Also known as the ‘normal distribution’, this is defined in Biosecurity Commons by the “Max Distance” and the “SD” (standard deviation).
- Log-normal - Defined by the “Max Distance”, the “Mean”, and the “SD” (standard deviation).
- Weibull - Has a shape defined by the “Max Distance”, a “Weibull shape factor” (also called beta or the Weibull slope), and a “Scale”.
- Lookup - A table of values defining the probability of distances. This may be used to represent a user-defined kernel function.

If you are struggling to decide which probability density function to choose for your dispersal kernel, try reviewing the literature to find a dispersal example that is close to yours (geographically, by organism, by vector, etc.), consult an expert who may have knowledge not in the literature, and/or consult a statistician. You may also plot any existing dispersal distance data, but keep in mind that if the available data are biased the plot may not be representative of your specific case.
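The following Python sketch compares how three of these kernel shapes behave when distances are capped at a maximum. It is a simplified illustration only: the densities are unnormalised, distances are drawn from a coarse grid, and treating “Max Distance” as a hard cut-off is an assumption rather than a description of the Biosecurity Commons implementation. All function names and parameter values are hypothetical.

```python
import math
import random


def exponential_density(d, mean):
    """Relative likelihood of dispersing distance d under an exponential kernel."""
    return math.exp(-d / mean)


def gaussian_density(d, sd):
    """Relative likelihood under a Gaussian kernel centred on the source."""
    return math.exp(-0.5 * (d / sd) ** 2)


def cauchy_density(d, scale):
    """Relative likelihood under a Cauchy kernel; equals 0.5 when d == scale."""
    return 1.0 / (1.0 + (d / scale) ** 2)


def sample_distances(density, max_distance, n, step=10.0, seed=1):
    """Draw n distances from a grid spanning 0..max_distance, weighted by density."""
    rng = random.Random(seed)
    grid = [i * step for i in range(int(max_distance / step) + 1)]
    weights = [density(d) for d in grid]
    return rng.choices(grid, weights=weights, k=n)


MAX_DISTANCE = 5000.0  # metres, illustrative value

exp_draws = sample_distances(lambda d: exponential_density(d, mean=500.0), MAX_DISTANCE, n=1000)
cauchy_draws = sample_distances(lambda d: cauchy_density(d, scale=500.0), MAX_DISTANCE, n=1000)

print("exponential: mean", round(sum(exp_draws) / len(exp_draws)), "max", max(exp_draws))
print("cauchy:      mean", round(sum(cauchy_draws) / len(cauchy_draws)), "max", max(cauchy_draws))
```

The Cauchy kernel has a much heavier tail, so for similar parameter values it produces noticeably more long-distance dispersal events than the exponential kernel. This is one reason the choice of function matters.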

If you are unfamiliar with probability density functions (PDFs), useful topics for further reading include: the probability density function, heavy-tailed distributions, the beta distribution, long-tailed distributions, spatial population models with different PDFs, and “All dispersal functions are wrong, but many are useful” (Bullock *et al.* 2018).

**Direction Function** If using a direction function in a dispersal model to specify the probability of dispersal happening in one direction over another (e.g., for wind dispersal where there is a dominant wind direction), the user can define the probabilities of directional dispersal by selecting either a **beta-probability-distribution** or a table of values defining the probability of directions. If a beta-probability-distribution is selected, the user defines the shape of the beta-PDF using the parameters alpha and beta. The lower bound is always assumed to be zero in a directional kernel, but the upper bound can change. If we were to model the wind direction measured in degrees, starting from the default lower bound of zero, the user might input an upper bound of 360. If the wind blew predominantly from the south, we might set alpha and beta both to 2. If the wind tended to be mostly from the west, we might set alpha = 8 and beta = 2; or from the east, alpha = 2 and beta = 8. Fortunately, recommended beta-PDF parameters for the direction of vectors like wind are also in the literature.

**Attractors** There are three types of attractors: a source (a location generating many organisms ready to disperse elsewhere), a destination (an attractive location where many organisms are more likely to end up), or both a source and a destination. Attractors that are fixed in space could include farms, and the scale of the attractor might relate to the number of deliveries that could carry the invasive pest (this would be a fixed destination attractor).

**Permeability Raster** is an optional grid (raster) where each cell value represents how relatively easy or hard it is to pass into or through that cell, or between patches in a network model. The permeability of each cell is scaled between zero and one. A value of 0 indicates that the cell, or the path between patches, cannot be passed through. If an organism was expected to travel 1000 m but passed through a permeability cell with a value of 0.5, then that organism will only travel 500 m.
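One possible reading of the 1000 m example is that crossing a cell "costs" more of the organism's travel budget when permeability is low. The Python sketch below implements that reading for a straight path through a row of cells. The function name, the cell size and the cost rule (cell size divided by permeability) are assumptions made for illustration, not a description of the platform's algorithm.

```python
def travelled_distance(intended_m, cell_size_m, permeabilities):
    """Distance actually covered along a path of cells with the given permeabilities.

    Assumption: crossing a cell uses up cell_size_m / permeability of the
    intended travel distance, and a permeability of 0 blocks the path.
    """
    budget = intended_m
    travelled = 0.0
    for permeability in permeabilities:
        if permeability <= 0:
            break  # impassable cell: the organism stops here
        cost = cell_size_m / permeability
        if budget >= cost:
            budget -= cost
            travelled += cell_size_m
        else:
            travelled += budget * permeability  # partial crossing of the last cell
            break
    return travelled


half_permeable = travelled_distance(1000.0, 100.0, [0.5] * 20)
fully_permeable = travelled_distance(1000.0, 100.0, [1.0] * 20)
blocked = travelled_distance(1000.0, 100.0, [1.0, 1.0, 0.0, 1.0])
print(half_permeable, fully_permeable, blocked)
```

Under these assumptions, an organism expected to travel 1000 m covers 500 m through cells of permeability 0.5, the full 1000 m through cells of permeability 1, and stops at the edge of the first cell with permeability 0.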

**Network Model**

- **Proportion** (same as in the grid model above), except locations here are network patches or nodes.
- **Events** (same as in the grid model above), except locations here are network patches or nodes.
- **Distance Function** (same as in the grid model above)
- **Direction Function** (same as in the grid model above)
- **Attractors** (same as in the grid model above)
- **Network Weights** is the only model option that is different from the grid model above. Network weights can be uploaded as a CSV file with node indices and path weights between each pair of connected nodes.

**Spatially implicit model**

- **Proportion** (same as in the grid model above), except locations are not explicit.
- **Events** (same as in the grid model above), except locations are not explicit.
- **Distance Function** (same as in the grid model above)
- **Direction Function** (same as in the grid model above)
- **Attractors** (same as in the grid model above), except attractors would be indexed to the same location indices; spatial CSVs or GeoTiffs would not be used here.
- **Network Weights** (same as in the network model above), but weights would need to be applied to location indices.

## Dispersal diffusion

**Grid Model**

**Diffusion Rate** The default value here is NULL. Entering a number sets the speed of diffusion in meters per time step, that is, the speed at which an organism spreads to adjacent cells when other model settings result in dispersal from that cell in a given time step.

**Proportion** requires the user to specify how much spread there is from one cell to adjacent cell(s).

- If using a **presence-only population model**, this number scales the number of times an organism will disperse to adjacent cells at any time step. For example, a value of 0.01 would indicate that dispersal out of any cell at any time step to some adjacent cell will only occur 1% of the time. A value such as 5 would indicate that at each time step an occupied cell will have an organism spread to 5 previously unoccupied cells.
- If using an unstructured or stage-structured population model, this is simply the proportion of the population that disperses at each time step.

**Direction Function** If using a direction function in a presence-only model to specify the probability of dispersal happening in one direction over another (e.g., for wind dispersal where there is a dominant wind direction), the user can define the probabilities of directional dispersal by selecting either a **beta-probability-distribution** or a table of values defining the probability of directions. If a beta-probability-distribution is selected, the user defines the shape of the beta-PDF using the parameters alpha and beta. The lower bound is always assumed to be zero in a directional kernel, but the upper bound can change. If we were to model the wind direction measured in degrees, starting from the default lower bound of zero, the user might input an upper bound of 360. If the wind blew predominantly from the south, we might set alpha and beta both to 2. If the wind tended to be mostly from the west, we might set alpha = 8 and beta = 2; or from the east, alpha = 2 and beta = 8. Fortunately, recommended beta-PDF parameters for the direction of vectors like wind are also in the literature.

**Attractors** Two types of attractors are those that are fixed in space and those that vary as conditions change after each time step. Attractors are further divided into three types according to whether they are a source (a location generating many organisms ready to disperse elsewhere), a destination (an attractive location where many organisms are more likely to end up), or both a source and a destination. Attractors that are fixed in space could include farms, and the scale of the attractor might relate to the number of deliveries that could carry the invasive pest (this would be a fixed destination attractor).

**Permeability Raster** is an optional grid (raster) where each cell value represents how relatively easy or hard it is to pass into or through that cell.
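To see what the alpha and beta values in the wind example imply, the Python sketch below rescales a beta distribution to the range 0 to 360 degrees and reports its mean and most likely direction. It assumes compass-style bearings (0/360 = north, 90 = east, 180 = south, 270 = west); that convention and the function names are assumptions made for illustration.

```python
import random

UPPER_BOUND = 360.0  # degrees; the lower bound is assumed to be zero


def beta_mean_direction(alpha, beta, upper=UPPER_BOUND):
    """Mean of a beta distribution rescaled to the range 0..upper."""
    return upper * alpha / (alpha + beta)


def beta_mode_direction(alpha, beta, upper=UPPER_BOUND):
    """Most likely direction; this formula is valid when alpha > 1 and beta > 1."""
    return upper * (alpha - 1) / (alpha + beta - 2)


def sample_directions(alpha, beta, n, upper=UPPER_BOUND, seed=42):
    """Draw n directions (in degrees) from the rescaled beta distribution."""
    rng = random.Random(seed)
    return [upper * rng.betavariate(alpha, beta) for _ in range(n)]


examples = {"south": (2, 2), "west": (8, 2), "east": (2, 8)}
for label, (a, b) in examples.items():
    draws = sample_directions(a, b, n=2000)
    print(
        label,
        "mean:", round(beta_mean_direction(a, b)),
        "mode:", round(beta_mode_direction(a, b)),
        "sample mean:", round(sum(draws) / len(draws)),
    )
```

Under these assumptions, alpha = beta = 2 is centred on 180 degrees, while alpha = 8, beta = 2 has a mean of 288 degrees and alpha = 2, beta = 8 a mean of 72 degrees. The west and east examples therefore concentrate probability in the western and eastern halves of the compass rather than exactly on 270 or 90 degrees.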

**Network Model**

- **Diffusion Rate** (same as the grid model above, except the locations are patches or nodes in the network)
- **Proportion** (same as the grid model above, except the locations are patches or nodes in the network)
- **Direction Function** (same as in the grid model above)
- **Attractors** (same as in the grid model above)
- **Permeability Raster** is an optional grid (raster) where each cell value represents how relatively easy or hard it is to pass into or through that cell.
- **Network Weights** is the only model option that is different from the grid model above. Network weights can be uploaded as a CSV file with ‘lat’, ‘lon’ and weights. The weights relate to the relative influence of each patch or node in the network.

**Spatially implicit model**

- **Diffusion Rate** (same as the grid model above, except spatially implicit)
- **Proportion** (same as the grid model above, except spatially implicit)
- **Direction Function** (same as in the grid model above)
- **Attractors** (same as in the grid model above), except attractors would be indexed to the same location indices; spatial CSVs or GeoTiffs would not be used here.

## Dispersal gravity

**Grid Model**

**Proportion** requires the user to specify how much spread there is from one cell to adjacent cell(s).

- If using a **presence-only population model**, this number scales the number of times an organism will disperse to adjacent cells at any time step. For example, a value of 0.01 would indicate that dispersal out of any cell at any time step to some adjacent cell will only occur 1% of the time. A value such as 5 would indicate that at each time step an occupied cell will have an organism spread to 5 previously unoccupied cells. For presence-only grids select either ‘Proportion’ or ‘Events’, not both.
- If using an unstructured or stage-structured population model, this is simply the proportion of the population that disperses at each time step.

**Events** requires the user to select the mean number of events; the actual number used is drawn from a Poisson distribution with the mean you specify.

- If using a **presence-only population model**, this value indicates the mean number of dispersal events (dispersal destination locations) for each location at each time step, and possible destinations are chosen by random sampling. For example, if you select a mean of 3 events, then for each cell you will usually get 2 to 4 dispersal events that go to an adjacent cell, but occasionally you might get zero or 6 leaving from a cell, and rarely you might get 10 or more (as expected from the Poisson distribution). For presence-only grids select either ‘Proportion’ or ‘Events’, not both.
- If using an unstructured or stage-structured population model, a destination is selected for each individual within the source cell’s population.

**Direction Function** If using a direction function in a presence-only model to specify the probability of dispersal happening in one direction over another (e.g., for wind dispersal where there is a dominant wind direction), the user can define the probabilities of directional dispersal by selecting either a **beta-probability-distribution** or a table of values defining the probability of directions. If a beta-probability-distribution is selected, the user defines the shape of the beta-PDF using the parameters alpha and beta. The lower bound is always assumed to be zero in a directional kernel, but the upper bound can change. If we were to model the wind direction measured in degrees, starting from the default lower bound of zero, the user might input an upper bound of 360. If the wind blew predominantly from the south, we might set alpha and beta both to 2. If the wind tended to be mostly from the west, we might set alpha = 8 and beta = 2; or from the east, alpha = 2 and beta = 8. Fortunately, recommended beta-PDF parameters for the direction of vectors like wind are also in the literature.

**Attractors** Two types of attractors are those that are fixed in space and those that vary as conditions change after each time step. Attractors are further divided into three types according to whether they are a source (a location generating many organisms ready to disperse elsewhere), a destination (an attractive location where many organisms are more likely to end up), or both a source and a destination. Attractors that are fixed in space could include farms, and the scale of the attractor might relate to the number of deliveries that could carry the invasive pest (this would be a fixed destination attractor).

**Permeability Raster** is an optional grid (raster) where each cell value represents how relatively easy or hard it is to pass into or through that cell.

**Beta** describes an exponential decay distance function, where the value of beta defines the rate at which an attractor’s influence on organisms declines with distance. The formula for beta in this context is: beta = log(proportion) / distance, where proportion (a value between 0 and 1) is the proportion of the decay function’s starting value that remains, and distance is the distance away from the source at which that proportion is reached. A common use of this decay function is to set the proportion to 0.5, in what is known as a half-decay function. For example, log(0.5)/200 km gives a negative exponential decay function that has fallen to half of its starting value at 200 km (equivalently, half the area under the curve lies within 200 km). By contrast, log(0.95)/200 km gives a much slower decay, with the function still at 95% of its starting value at 200 km.

**Figure 6.** Examples of exponential decay functions for the influence of attractors in dispersal models. A. beta = log(0.5)/100 km, where half the area under the curve is < 100 km away from the attractor; B. beta = log(0.5)/300 km, where half the area under the curve is < 300 km away from the attractor.
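The beta formula can be checked with a few lines of Python. The sketch below assumes that "log" means the natural logarithm and that the decay function has the form exp(beta × distance); both are assumptions made for illustration, and the function names are hypothetical.

```python
import math


def gravity_beta(proportion, distance):
    """beta = log(proportion) / distance, using the natural logarithm (assumed)."""
    return math.log(proportion) / distance


def attraction_weight(distance, beta):
    """Assumed decay form: weight = exp(beta * distance), with beta negative."""
    return math.exp(beta * distance)


half_decay_200 = gravity_beta(0.5, 200.0)  # distances in km
slow_decay_200 = gravity_beta(0.95, 200.0)

for d in (0, 100, 200, 400):
    print(
        d, "km:",
        round(attraction_weight(d, half_decay_200), 3),
        round(attraction_weight(d, slow_decay_200), 3),
    )
```

Under these assumptions the half-decay function gives a weight of 0.5 at 200 km and 0.25 at 400 km, while the log(0.95)/200 version is still at 0.95 at 200 km.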

**Distance Scale** The default scale used in Biosecurity Commons is in meters, but if you wanted to use kilometres you would enter 0.001 here.

**Network Model**

- **Proportion** (same as the grid model above, except the locations are patches or nodes in the network)
- **Events** (same as the grid model above, except the locations are patches or nodes in the network)
- **Direction Function** (same as in the grid model above)
- **Attractors** (same as in the grid model above)
- **Network Weights** is the only model option that is different from the grid model above. Network weights can be uploaded as a CSV file with ‘lat’, ‘lon’ and weights. The weights relate to the relative influence of each patch or node in the network.
- **Beta** (same as in the grid model above)
- **Distance Scale** (same as in the grid model above)

**Spatially implicit model**

- **Proportion** (same as the grid model above, except spatially implicit)
- **Events** (same as the grid model above, except spatially implicit)
- **Direction Function** (same as in the grid model above)
- **Attractors** (same as in the grid model above)
- **Beta** (same as in the grid model above)
- **Distance Scale** (same as in the grid model above)

## Step 6 – Simulator

Temporal resolution: daily, weekly, monthly or annually.

- Time Steps (how many discrete time steps do you want the simulation to run for?)
- Step Units (what is the unit of measure for the step duration, e.g. years?)
- Step Duration (how many time steps fit into one step unit?)
- Collation Steps (how many time steps do you want between each collated result?)
- Replicates (the number of times to repeat the entire simulation. Keep in mind that most of the decisions on where organisms disperse to in each simulation are based on probabilities, so results will change from one simulation to the next. To get an overview of the patterns of possible dispersal you will almost always run simulations multiple times, and 1,000+ times if you have probabilities with wide-tailed or long-tailed distributions.)

To get an idea of how far an organism might spread you will often select multiple time steps, and you will almost always select multiple replicates.

Here is one example of why you might alter some of the other values in this section. Let’s say you select 120 time steps. If you leave Step Units as ‘years’ and leave Step Duration at 1, the model will run for 120 years. If you instead change the Step Duration to 12, each time step will be a month long and the model will run for 10 years. With these same settings, if you then set Collation Steps to 60 you will get one result at five years and one at ten years, but all the other steps will remain unseen. If Collation Steps were set to 1, you would get 120 results (probably too many), but if you set Collation Steps to the same number as Time Steps you would get just the end result, in our case after 10 years. That result would then be averaged across the number of replicates you selected.
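The arithmetic in this example can be reproduced with a small Python sketch. The function name is hypothetical, and the interpretation of Step Duration as "time steps per step unit" follows the worked example above rather than any platform documentation.

```python
def simulation_summary(time_steps, step_duration, collation_steps):
    """Return the total run length in step units and the time steps that are collated.

    Assumption: step_duration is the number of time steps that fit into one
    step unit (e.g. 12 monthly steps per year), as in the example above.
    """
    total_units = time_steps / step_duration
    collated = list(range(collation_steps, time_steps + 1, collation_steps))
    return total_units, collated


years, collated_steps = simulation_summary(time_steps=120, step_duration=12, collation_steps=60)
print(years, collated_steps)

_, every_step = simulation_summary(120, 12, 1)
_, final_only = simulation_summary(120, 12, 120)
print(len(every_step), final_only)
```

With 120 monthly time steps the run covers 10 years. Collating every 60 steps reports steps 60 and 120 (five and ten years), collating every step gives 120 results, and collating every 120 steps gives only the final result.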

## Literature cited:

Adeva, J. G., Botha, J. H., & Reynolds, M. (2012). A simulation modelling approach to forecast establishment and spread of Bactrocera fruit flies. *Ecological Modelling, 227, 93-108.*

Andow, D. A., Kareiva, P. M., Levin, S. A., & Okubo, A. (1990). Spread of invading organisms. *Landscape Ecology*, *4*(2/3), 177-188.

Baxter, P., Woodley, A., & Hamilton, G. (2017). Modelling the spatial spread risk of plant pests and pathogens for strategic management decisions. In *Proceedings of the 22nd International Congress on Modelling and Simulation (MODSIM2017)* (pp. 209-215). Modelling and Simulation Society of Australia and New Zealand Inc.(MSSANZ).

Beverton, R. J., & Holt, S. J. (2012). *On the dynamics of exploited fish populations* (Vol. 11). Springer Science & Business Media.

Bianchi, F. J. J. A., Schellhorn, N. A., & Van Der Werf, W. (2009). Predicting the time to colonization of the parasitoid Diadegma semiclausum: the importance of the shape of spatial dispersal kernels for biological control. *Biological Control, 50(3), 267-274.*

Bossenbroek, J. M., Kraft, C. E., & Nekola, J. C. (2001). Prediction of long‐distance dispersal using gravity models: zebra mussel invasion of inland lakes. *Ecological Applications, 11(6), 1778-1788.*

Bradhurst, R., Spring, D., Stanaway, M., Milner, J., & Kompas, T. (2021). A generalised and scalable framework for modelling incursions, surveillance and control of plant and environmental pests. *Environmental Modelling & Software*, *139*, 105004.

Bullock, J. M., Hooftman, D. A., Tamme, R., Götzenberger, L., Pärtel, M., Mallada González, L., & White, S. M. (2018). All dispersal functions are wrong, but many are useful: A response to Cousens et al. *Journal of Ecology*, *106*(3), 907-910.

Carrasco, L. R., Harwood, T. D., Toepfer, S., MacLeod, A., Levay, N., Kiss, J., ... & Knight, J. D. (2010). Dispersal kernels of the invasive alien western corn rootworm and the effectiveness of buffer zones in eradication programmes in Europe. *Annals of Applied Biology, 156(1), 63-77.*

Carrasco, L. R., Mumford, J. D., MacLeod, A., Harwood, T., Grabenweger, G., Leach, A. W., ... & Baker, R. H. A. (2010). Unveiling human-assisted dispersal mechanisms in invasive alien insects: integration of spatial stochastic simulation and phenology models. *Ecological Modelling, 221(17), 2068-2075.*

Clark, J. S., Fastie, C., Hurtt, G., Jackson, S. T., Johnson, C., King, G. A., ... & Wyckoff, P. (1998). Reid's paradox of rapid plant migration: dispersal theory and interpretation of paleoecological records. *BioScience, 48(1), 13-24.*

Clark, J. S., Silman, M., Kern, R., Macklin, E., & HilleRisLambers, J. (1999). Seed dispersal near and far: patterns across temperate and tropical forests. *Ecology, 80(5), 1475-1494.*

Crespo-Pérez, V., Rebaudo, F., Silvain, J. F., & Dangles, O. (2011). Modeling invasive species spread in complex landscapes: the case of potato moth in Ecuador. *Landscape ecology, 26(10), 1447-1461.*

Dana, E. D., Jeschke, J. M., & García-de-Lomas, J. (2014). Decision tools for managing biological invasions: existing biases and future needs. *Oryx, 48(1), 56-63.*

Faulkner, K. T., Robertson, M. P., & Wilson, J. R. (2020). Stronger regional biosecurity is essential to prevent hundreds of harmful biological invasions. *Global Change Biology, 26(4), 2449-2462.*

Fisher, R. A. (1937). The wave of advance of advantageous genes. *Annals of eugenics, 7(4), 355-369.*

Fordham, D. A., Haythorne, S., Brown, S. C., Buettel, J. C., & Brook, B. W. (2021). poems: R package for simulating species' range dynamics using pattern‐oriented validation. *Methods in Ecology and Evolution, 12(12), 2364-2371.*

Gilbert, M., Grégoire, J. C., Freise, J. F., & Heitland, W. (2004). Long‐distance dispersal and human population density allow the prediction of invasive patterns in the horse chestnut leafminer Cameraria ohridella. *Journal of Animal Ecology, 73(3), 459-468.*

Gippet, J. M., Liebhold, A. M., Fenn-Moltu, G., & Bertelsmeier, C. (2019). Human-mediated dispersal in insects. *Current opinion in insect science, 35, 96-102.*

Gotelli, N. J. (2008). *A primer of ecology* (Vol. 494). Sunderland, MA: Sinauer Associates.

Hastings, A., Cuddington, K., Davies, K. F., Dugaw, C. J., Elmendorf, S., Freestone, A., ... & Thomson, D. (2005). The spatial spread of invasions: new developments in theory and evidence. *Ecology Letters, 8(1), 91-101.*

Jones, C. M., Jones, S., Petrasova, A., Petras, V., Gaydos, D., Skrip, M. M., ... & Meentemeyer, R. K. (2021). Iteratively forecasting biological invasions with PoPS and a little help from our friends. *Frontiers in Ecology and the Environment*, *19*(7), 411-418.

Jongejans, E., Skarpaas, O., & Shea, K. (2008). Dispersal, demography and spatial population models for conservation and control management. *Perspectives in Plant Ecology, Evolution and Systematics, 9(3-4), 153-170.*

Katul, G. G., Porporato, A., Nathan, R., Siqueira, M., Soons, M. B., Poggi, D., ... & Levin, S. A. (2005). Mechanistic analytical models for long-distance seed dispersal by wind. *The American Naturalist, 166(3), 368-381.*

Kehlenbeck, H., Robinet, C., Van Der Werf, W., Kriticos, D., Reynaud, P., & Baker, R. (2012). Modelling and mapping spread in pest risk analysis: a generic approach. *EPPO Bulletin, 42(1), 74-80.*

Kot, M., & Schaffer, W. M. (1986). Discrete-time growth-dispersal models. *Mathematical Biosciences, 80(1), 109-136.*

Kot, M., Lewis, M. A., & van den Driessche, P. (1996). Dispersal data and the spread of invading organisms. *Ecology, 77(7), 2027-2042.*

Lefkovitch, L. P. (1965). The study of population growth in organisms grouped by stages. *Biometrics*, 1-18.

Leslie, P. H. (1945). On the use of matrices in certain population mathematics. *Biometrika, 33(3), 183-212.*

Lustig, A., Worner, S. P., Pitt, J. P., Doscher, C., Stouffer, D. B., & Senay, S. D. (2017). A modeling framework for the establishment and spread of invasive species in heterogeneous environments. *Ecology and evolution, 7(20), 8338-8348.*

Malchow, A. K., Bocedi, G., Palmer, S. C., Travis, J. M., & Zurell, D. (2021). RangeShiftR: an R package for individual‐based simulation of spatial eco‐evolutionary dynamics and species' responses to environmental changes. *Ecography, 44(10), 1443-1452.*

Muirhead, J. R., Leung, B., van Overdijk, C., Kelly, D. W., Nandakumar, K., Marchant, K. R., & MacIsaac, H. J. (2006). Modelling local and long‐distance dispersal of invasive emerald ash borer Agrilus planipennis (Coleoptera) in North America. *Diversity and Distributions, 12(1), 71-79.*

Nathan, R., Schurr, F. M., Spiegel, O., Steinitz, O., Trakhtenbrot, A., & Tsoar, A. (2008). Mechanisms of long-distance seed dispersal. *Trends in ecology & evolution, 23(11), 638-647.*

Nenzén, H. K., Swab, R. M., Keith, D. A., & Araújo, M. B. (2012). demoniche–an R‐package for simulating spatially‐explicit population dynamics. *Ecography, 35(7), 577-580.*

Neubert, M. G., & Caswell, H. (2000). Demography and dispersal: calculation and sensitivity analysis of invasion speed for structured populations. *Ecology, 81(6), 1613-1628.*

Newton, I. (2007). Population limitation in birds: the last 100 years. *British birds, 100, 518-539.*

Rasmussen, R., & Hamilton, G. (2012). An approximate Bayesian computation approach for estimating parameters of complex environmental processes in a cellular automata. *Environmental Modelling & Software, 29(1), 1-10.*

Ricker, W. E. (1958). Handbook of computations for biological statistics of fish populations. *Can Fish Res Board Bull, 119, 300.*

Robinet, C., Kehlenbeck, H., Kriticos, D. J., Baker, R. H., Battisti, A., Brunel, S., ... & van der Werf, W. (2012). A suite of models to support the quantitative assessment of spread in pest risk analysis. *PLoS ONE*.

Ruxton, G. D., & Rohani, P. (1999). Fitness‐dependent dispersal in metapopulations and its consequences for persistence and synchrony. *Journal of Animal Ecology, 68(3), 530-539.*

Schneider, D. W., Ellis, C. D., & Cummings, K. S. (1998). A transportation model assessment of the risk to native mussel communities from zebra mussel spread. *Conservation biology, 12(4), 788-800.*

Shaw, M. W. (1995). Simulation of population expansion and spatial pattern when individual dispersal distributions do not decline exponentially with distance. *Proceedings of the Royal Society of London. Series B: Biological Sciences, 259(1356), 243-248.*

Skarpaas, O., & Shea, K. (2007). Dispersal patterns, dispersal mechanisms, and invasion wave speeds for invasive thistles. *The American Naturalist, 170(3), 421-430.*

Skellam, J. G. (1951). Random dispersal in theoretical populations. *Biometrika, 38(1/2), 196-218.*

Suarez, A. V., Holway, D. A., & Case, T. J. (2001). Patterns of spread in biological invasions dominated by long-distance jump dispersal: insights from Argentine ants. *Proceedings of the National Academy of Sciences, 98(3), 1095-1100.*

Visintin, C., Briscoe, N. J., Woolley, S. N., Lentini, P. E., Tingley, R., Wintle, B. A., & Golding, N. (2020). steps: Software for spatially and temporally explicit population simulations. *Methods in Ecology and Evolution, 11(4), 596-603.*

Wilson, J. R., Dormontt, E. E., Prentis, P. J., Lowe, A. J., & Richardson, D. M. (2009). Something in the way you move: dispersal pathways affect invasion success. *Trends in ecology & evolution, 24(3), 136-144.*