Program EXPANSION: explanations and documentation
The R-script EXPANSION estimates a population's expansion rate from spatio-temporal data on the occurrences of the population. Please note that this page describes version 1.4 of the program, which has been superseded by versions 3.x.
Introduction and installation
The R-script does not require any previous knowledge of R, but presupposes that R has been installed on your computer. Point-by-point instructions follow:
· R is an open and free programming language and environment. It can be downloaded from http://www.r-project.org. Follow the installation instructions at that site to install it.
· After you have installed and started R, the EXPANSION script can be loaded in one of two ways:
  · write load(url("http://www.evol.no/hanno/12/expans.rtx")) directly in your R console (this requires your computer to be online); or
  · use your browser to navigate to http://www.evol.no/hanno/12/expans.rtx and save this file to your hard disk; later, write load("...") in your R console, where "..." specifies the file location [for example, load("c:/aliens/expans.rtx"); this requires your computer to be online only when downloading the file for the first time, after which it can be loaded locally from your computer].
· Now you can run the script by writing expansion(...), where "..." represents the parameters, which are explained in detail below.
Please note that this R-script is not part of any R package. Therefore, no R help will be available for this function. Please refer to this site instead.
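The two loading routes can be combined in a small convenience wrapper. The following is only a sketch and not part of EXPANSION: the helper name load_expansion and the example path are assumptions made for illustration. It loads a local copy of the script file if one exists and otherwise falls back to the URL given above.

```r
# Hypothetical helper, not part of EXPANSION: prefer a local copy of the
# script file and fall back to the online location if none is found.
load_expansion <- function(local_path,
                           remote = "http://www.evol.no/hanno/12/expans.rtx") {
  if (file.exists(local_path)) {
    # Local copy available: no internet connection needed
    loaded <- load(local_path, envir = globalenv())
  } else {
    # No local copy: read the file directly from the web
    loaded <- load(url(remote), envir = globalenv())
  }
  invisible(loaded)  # names of the objects that were loaded
}

# Example call (the path is an assumption; adjust it to your own system):
# load_expansion("c:/aliens/expans.rtx")
```

The function returns the names of the loaded objects invisibly, so assigning the result to a variable shows what the file contained.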
The program requires a dataset containing the spatio-temporal information about the observed occurrences of the population. The dataset is specified using the parameter data. A program call thus has the form expansion(data=...), where "..." may be any of the following three objects: (1) a character string specifying the location of a data file; (2) a data frame; (3) a matrix. If you are an R beginner, you should choose the first option (as the two latter methods presuppose that your data have already been read into R). The formatting required for the data file according to option 1 is explained below. All three options require that the data are organised into columns that are named precisely as specified in the following paragraphs.
· One column has to contain years and have the name t. Years have to be integers.
· The geographic positions of observations can be specified using one to six columns, depending on the coordinate system used:
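Coordinate systems
Positions of observed occurrences may be specified in one of five different formats, using one of three different coordinate systems:
· Latitude and longitude: (1) lat AND lon; OR (2) lat AND lam AND las AND lon AND lom AND los
· MGRS coordinates (Military Grid Reference System): (3) mgrs; OR (4) zone AND band AND id AND east AND north
· UTM coordinates (Universal Transverse Mercator): (5) zone AND east AND north
The variable names have the following meaning and formatting:
· lat, lon: latitude and longitude in degrees (decimal degrees in format 1, whole degrees in format 2);
· lam, lom: minutes of latitude and longitude;
· las, los: seconds of latitude and longitude;
· mgrs: the complete MGRS reference as one character string;
· zone: number of the UTM zone;
· band: letter of the latitude band;
· id: two-letter identifier of the 100-km square;
· east, north: easting and northing in metres (within the 100-km square in format 4, full UTM values in format 5).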
For use in Norway, positions can also be provided in other formats, e.g. in terms of midpoints of municipalities. The explanation of these options is given in Norwegian only.
Example
The coordinates of Tromsø (69°39'5.0"N 18°57'19.0"E) can thus be specified in the following ways:
(1) {lat=69.65139; lon=18.95528}
(2) {lat=69; lam=39; las=5.0; lon=18; lom=57; los=19.0}
(3) {mgrs="34WDC2058828390"}
(4) {zone=34; band="W"; id="DC"; east=20588; north=28390}
(5) {zone=34; east=420588; north=7728390}
[The precision of the coordinates in this example is (1) roughly 1.1 m (NS) / 0.4 m (EW), (2) roughly 3.1 m (NS) / 1.1 m (EW), (3–5) exactly 1 m.]
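The link between formats (1) and (2), and the precision figures quoted for them, can be checked with a few lines of R. This is a minimal sketch: the function names dms_to_deg and step_size are made up for illustration, and the distance calculation assumes a spherical Earth with a mean radius of 6371 km, which is accurate enough for rough figures only.

```r
# Convert degrees, minutes and seconds to decimal degrees
dms_to_deg <- function(degrees, minutes, seconds) {
  degrees + minutes / 60 + seconds / 3600
}

lat <- dms_to_deg(69, 39, 5.0)    # Tromsø, format (2) -> format (1)
lon <- dms_to_deg(18, 57, 19.0)
round(c(lat = lat, lon = lon), 5)

# Ground distance covered by one step of the last digit, in metres.
# Assumption: spherical Earth with a mean radius of 6371 km.
metres_per_degree <- 2 * pi * 6371000 / 360
step_size <- function(step_deg, lat_deg) {
  ns <- step_deg * metres_per_degree      # north-south
  ew <- ns * cos(lat_deg * pi / 180)      # east-west shrinks with latitude
  round(c(NS = ns, EW = ew), 1)
}

step_size(1e-5, lat)        # format (1): five decimals of a degree
step_size(0.1 / 3600, lat)  # format (2): one decimal of an arc second
```

Under these assumptions the two calls give about 1.1 m / 0.4 m and 3.1 m / 1.1 m, in line with the figures in the example.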
NB
· Please note that the MGRS system and the UTM system are often confused (the former is based on the latter). However, the two require different formatting. While Tromsø's UTM coordinates are 34 420588 7728390, Tromsø's MGRS coordinates are 34WDC2058828390. The northing of UTM coordinates uses signs in order to distinguish between the Northern (positive sign) and the Southern Hemisphere (negative sign). Positive signs may be omitted.
· If the data do not follow the standards for UTM or MGRS (as appropriate), the program may misinterpret them.
· The variable names have to follow the conventions detailed above.
· Leading zeros may create trouble for northings and eastings in the MGRS system. To make sure that leading zeros do not "disappear", please save east and north as character strings rather than numbers.
· The precision of the positions does not matter (it may of course matter for the results, but not for the interpretation of the coordinates).
· Different observations in one dataset may use different coordinate systems.
· If more than one coordinate system is used, UTM coordinates are ignored wherever MGRS coordinates are supplied; and MGRS coordinates are ignored wherever latitude and longitude are supplied.
· The order of observations does not matter.
· The order of columns does not matter.
· Additional columns are ignored.
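The relation between formats (3), (4) and (5), and the advice on leading zeros, can be illustrated with the Tromsø coordinates from the example above. This is only a sketch: the second observation and both years are invented for illustration, and the call to expansion() is guarded so that it runs only if the script has been loaded.

```r
# UTM coordinates of Tromsø, format (5)
utm_east  <- 420588
utm_north <- 7728390

# Within a 100-km square, MGRS keeps the last five digits of the UTM values
east  <- sprintf("%05d", as.integer(utm_east  %% 100000))
north <- sprintf("%05d", as.integer(utm_north %% 100000))

# The format (4) components pasted together give the format (3) string
mgrs <- paste0(34, "W", "DC", east, north)
print(mgrs)

# Invented observations; east and north are stored as character strings
obs <- data.frame(
  t     = c(2001L, 2003L),
  zone  = c(34L, 34L),
  band  = c("W", "W"),
  id    = c("DC", "DC"),
  east  = c(east, "00450"),
  north = c(north, "03120"),
  stringsAsFactors = FALSE
)

# Converted to numbers, the second easting would lose its leading zeros
as.numeric(obs$east)

# Run the analysis only if the EXPANSION script has been loaded
if (exists("expansion")) expansion(data = obs)
```

Keeping east and north as character strings means that a value such as "00450" still has five digits when it reaches the program.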
Formatting of data files
If the data are read from an external file, please follow these formatting rules:
· The data have to be organised column-wise, i.e. the file has to consist of one column per variable (year and, for instance, latitude and longitude) and one row per observation.
· The first row has to contain the variable names (see above for the variable names that have to be used).
· All rows have to have the same number of separators.
· Missing values are tolerated if specified by omission ("") or spaces (" "). (Other symbols, such as "?" or "NA", will generate error messages.)
· Semicolons (;) or commas (,) are accepted as separators between columns (i.e., between the elements of a row), but please don't use both. Such files can be produced by all spreadsheet applications. (Choose "save as comma delimited file" or something similar. Usual filename extensions of such formats are ".CSV" or ".SDV".)
· The symbol used as separator must not occur in other places. Nor may apostrophes (') be used anywhere in the data file. Please make sure to remove or replace these symbols.
· The parameter data is used to specify the location of the data file. The location should be specified as a character string containing the file name and complete location within quotation marks, e.g. expansion(data="c:/aliens/data/species6.sdv"). Please note the use of slash (/) instead of backslash (\).
· Periods (.) are accepted as decimal marks. Commas (,) may also be used as decimal marks, but only if semicolons (;) are used as separators.
· Spaces between (outside) elements, and quotation marks (") enclosing elements (on both sides), are tolerated.
Example: t;lat;lon
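A file that follows the rules listed above could be written from within R as sketched below. The observations and the file name are invented for illustration, and the checks at the end merely mirror two of the rules (equal number of separators per row, no apostrophes); they are not part of EXPANSION.

```r
# Invented observations for illustration; the last row has no position
obs <- data.frame(
  t   = c(2001L, 2003L, 2005L),
  lat = c(69.65139, 69.70, NA),
  lon = c(18.95528, 19.10, NA)
)

# Hypothetical file name in the temporary directory of the R session
path <- file.path(tempdir(), "species_example.csv")

# Semicolons as separators, header row, periods as decimal marks,
# no quotation marks, missing values written as empty fields
write.table(obs, file = path, sep = ";", dec = ".", na = "",
            quote = FALSE, row.names = FALSE)

# Use forward slashes in the path, as the program expects
path <- normalizePath(path, winslash = "/", mustWork = TRUE)

file_lines <- readLines(path)
print(file_lines)

# Every row must contain the same number of separators
n_sep <- lengths(regmatches(file_lines,
                            gregexpr(";", file_lines, fixed = TRUE)))
stopifnot(length(unique(n_sep)) == 1)

# No apostrophes anywhere in the file
stopifnot(!any(grepl("'", file_lines, fixed = TRUE)))

# Run the analysis only if the EXPANSION script has been loaded
if (exists("expansion")) expansion(data = path)
```

Writing the file with quote=FALSE and na="" keeps it close to what a spreadsheet application produces when saving a delimited file.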
Parameters
Use of the parameter data is explained above. The remaining parameters are optional and usually not required. An overview follows:
· map (logical variable indicating whether the observations should be shown on a map; the default is map=TRUE; to switch off map view, write map=FALSE),
· rmax (initial estimate of the maximum possible distance of expansion; the default is infinity; other values can be specified in kilometres),
· quiet (turns off messages and warnings if TRUE; the default is quiet=FALSE),
· front (integer specifying how the expansion front should be defined; the options available so far are:
  front=0, which estimates expansion from the entire population's average distance from the first observation at any time;
  front=1, which defines the expansion front as the average of the distances that are at least as large as the previous year's expansion front;
  front=2, which defines the expansion front as the average of the distances that are at least as large as the single largest distance in the previous year;
  front=3, which defines the expansion front as the largest distance that is at least as large as the maximum distance observed in the previous year;
  the default is front=2, which is a rather robust definition under many conditions),
· new.obs (logical variable indicating whether occurrences are only reported in the year of their first observation and should be assumed to be present also in later years; if so, new.obs=TRUE, which is the default; the parameter is ignored if front > 0),
· type (integer or vector of integers between 0 and 3, indicating the functional form that is fitted to the data; type=0 fits a linear model, type=1 a truncated model, type=2 an asymptotic model, and type=3 a sigmoid model; if a vector is provided, the respective models are tested in turn, and the results for the model with the lowest AIC are presented; the default is type=0),
· output (logical variable indicating whether the function should produce an output consisting of a list of model estimates; defaults to output=FALSE),
· outdata (logical variable indicating whether the function should produce an output consisting of a data frame with the locations transformed to x and y distances in kilometres after applying an azimuthal projection; defaults to outdata=FALSE),
· save (logical variable or character string indicating whether the function should save the data to a file after transforming them to latitudes and longitudes; using this option, it is sufficient to transform MGRS or UTM coordinates once, while later calls can directly load the transformed data; this is done if save is either TRUE or a file name; the default is save=FALSE),
· det (logical value indicating whether details from all models should be displayed when more than one model is tested; the default is det=FALSE).
The remaining parameters
· allow overriding some default settings (xy: logical value indicating whether the data are already transformed to an azimuthal projection and expressed in kilometres in x and y direction, using the columns x and y; phi0: latitude of the centre of the azimuthal projection; lambda0: longitude of the centre of the azimuthal projection; language: allows switching between English and Norwegian output),
· affect the graphical representation of the course of expansion (alpha: alpha level of the confidence intervals displayed, defaults to 0.05; header: header for the graph; xlab: legend for the x axis; ylab: legend for the y axis; ylim: factor by which the y axis is stretched, defaults to 1.5; hmax: number of years shown in addition to the ones with data, defaults to 20; ...: further graphical parameters if desired), or
· allow parameterisation in Norwegian (where kart = map, hold.munn = quiet, tittel = header, ny.obs = new.obs, typ = type, utmat = output, utdata = outdata, lagre = save, and spraak = language).
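As an illustration of how several optional parameters might be combined, the sketch below collects them in a list and passes them to expansion() with do.call. The variable name call_args is an assumption for illustration, the file location is the one used as an example earlier on this page, and the call is guarded so that nothing runs unless the script is loaded and the file exists.

```r
# Hypothetical combination of optional parameters
call_args <- list(
  data  = "c:/aliens/data/species6.sdv",  # example file location from this page
  map   = FALSE,   # do not show the observations on a map
  front = 2,       # the default front definition, stated explicitly
  type  = 0:3,     # test all four models; the one with the lowest AIC is shown
  det   = TRUE,    # display details from every model tested
  quiet = FALSE    # keep messages and warnings
)

# Run the analysis only if the script is loaded and the data file exists
if (exists("expansion") && file.exists(call_args$data)) {
  do.call(expansion, call_args)
}
```

Passing a vector to type, together with det=TRUE, is the combination to try when it is unclear which functional form suits the data.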
Output
The function does not return a value. Its output is displayed directly on the screen instead. The output starts with a short summary of the input data (which can be suppressed by setting quiet=TRUE). The remainder consists of:
· estimates of the expansion rate (v ± 95% confidence intervals) and of the standard deviation of the spread distance s, based on the assumption of no observation error (i.e., all variation is assumed to be due to process noise);
· estimates (± 95% confidence intervals) based on the assumption of no process noise (i.e., all variation is assumed to be due to observation error).
Depending on the model chosen, the following parameters may be estimated:
· the expansion rate v in kilometres per year,
· the time t0 of first introduction as a year,
· the maximum expansion distance K in kilometres,
· the time (year) tx specifying the inflection point of a sigmoid curve,
· the standard deviation s of the spread distance,
· the parameter b, which describes the increase of the variance observed with time,
· Akaike's Information Criterion AIC.
NB: The output does not use thousands separators. Periods (or commas) in numbers thus signify decimal marks.
About the program
The R-script EXPANSION has been written by Hanno Sandvik with contributions by Jarle Tufto at the Centre for Biodiversity Dynamics (CBD), Norwegian University of Science and Technology (NTNU). The description on this page refers to version 1.4 (June 2012) and is retained for documentation purposes only. It has been superseded by version 2.0 as of December 2016. In case of questions or comments, please contact Hanno Sandvik.