Title: | Density and Abundance from Distance-Sampling Surveys |
---|---|
Description: | Distance-sampling (<doi:10.1007/978-3-319-19219-2>) estimates density and abundance of survey targets (e.g., animals) when detection probability declines with distance. Distance-sampling is popular in ecology, especially when survey targets are observed from aerial platforms (e.g., airplane or drone), surface vessels (e.g., boat or truck), or along walking transects. Distance-sampling includes line-transect studies that measure observation distances as the closest approach of the sample route (transect) to the target (i.e., perpendicular off-transect distance), and point-transect studies that measure observation distances from stationary observers to the target (i.e., radial distance). The routines included here fit smooth (parametric) curves to histograms of observation distances and use those functions to compute effective sampling distances, density of targets in the surveyed area, and abundance of targets in a surrounding study area. Curve shapes include the half-normal, hazard rate, and negative exponential functions. Physical measurement units are required and used throughout to ensure density is reported correctly. The help files are extensive and have been vetted by multiple authors. |
Authors: | Trent McDonald [cre, aut], Jason Carlisle [aut], Aidan McDonald [aut] (point transect methods), Ryan Nielson [ctb] (smoothed likelihood), Ben Augustine [ctb] (maximization method), James Griswald [ctb] (maximization method), Patrick McKann [ctb] (maximization method), Lacey Jeroue [ctb] (vignettes), Hoffman Abigail [ctb] (vignettes), Kleinsausser Michael [ctb] (vignettes), Joel Reynolds [ctb] (Gamma likelihood), Pham Quang [ctb] (Gamma likelihood), Earl Becker [ctb] (Gamma likelihood), Aaron Christ [ctb] (Gamma likelihood), Brook Russelland [ctb] (Gamma likelihood), Stefan Emmons [ctb] (Automated tests), Will McDonald [ctb] (Automated tests), Reid Olson [ctb] (Automated tests and bug fixes) |
Maintainer: | Trent McDonald <[email protected]> |
License: | GNU General Public License |
Version: | 4.0.3 |
Built: | 2025-04-01 20:24:50 UTC |
Source: | https://github.com/tmcd82070/rdistance |
Rdistance
contains functions and associated routines to analyze
distance-sampling data collected on point or line transects.
Some of Rdistance
's features include:
Accommodation of both point and line transect analyses
in one routine (dfuncEstim
).
Regression-like formula for inclusion of distance function
covariates (dfuncEstim
).
Automatic bootstrap confidence intervals
(abundEstim
).
Availability of both study-area and site-level abundance
estimates (help("predict.dfunc")
).
Classical, parametric distance functions
(halfnorm.like
, hazrate.like
,
negexp.like
), and
expansion functions (cosine.expansion
,
hermite.expansion
, simple.expansion
).
Automated distance function fits and selection
autoDistSamp
.
print
, plot
, predict
, coef
,
and summary
methods for distance function objects and
abundance classes.
Distance-sampling is a popular method for abundance estimation in
ecology. Line transect surveys are conducted by traversing
randomly placed transects in a study area with the objective of
sighting animals and estimating density or abundance. Data collected
during line transect surveys consists of sighting records for
targets, usually either individuals or groups of individuals. Among
the collected data, off-transect distances are recorded or computed from
other information (see perpDists
). Off-transect distances
are the perpendicular distances from the transect to the location of
the initial sighting cue.
When groups are the target, the number of individuals in the group is
recorded.
Point transect surveys are similar except that observers stop one or more times along the transect to observe targets. This is a popular method for avian surveys where detections are often auditory cues, but is also appropriate when automated detectors are placed along a route. Point transect surveys collect distances from the observer to the target and are sometimes called radial distances.
A fundamental characteristic of both line and point-based distance sampling analyses is that probability of detecting a target declines as off-transect or radial distances increase. Targets far from the observer are usually harder to detect than closer targets. In most classical line transect studies, targets on the transect (off-transect distance = 0) are assume to be sighted with 100% probability. This assumption allows estimation of the proportion of targets missed during the survey, and thus it is possible to adjust the actual number of sighted targets for the proportion of targets missed. Some studies utilize two observers searching the same areas to estimate the proportion of individuals missed and thereby eliminating the assumption that all individuals on the line have been observed.
A detailed comparison of
Rdistance
to other options for distance sampling analysis (e.g.,
Program DISTANCE, R package Distance
, and R package unmarked
)
is forthcoming. While some of the functionality in Rdistance
is not unique, our aim is to provide an easy-to-use, rigorous,
and flexible analysis option for distance-sampling data.
We understand that beginning
users often need software that is both easy to use and easy to understand,
and that advanced users often require greater flexibility and customization.
Our aim is to meet the demands of both user groups. Rdistance
is
under active development, so please contact us with issues, feature
requests, etc. through the package's GitHub website
(https://github.com/tmcd82070/Rdistance).
Rdistance
contains four example data sets: two collected using
line-transect methods (i.e., sparrowDetectionData
and
sparrowSiteData
) and two collected using point-transect
methods (i.e.,
thrasherDetectionData
and thrasherSiteData
).
Main author and maintainer: Trent McDonald <[email protected]>
Coauthors: Ryan Nielson, Jason Carlisle, and Aidan McDonald
Contributors: Ben Augustine, James Griswald, Joel Reynolds, Pham Quang, Earl Becker, Aaron Christ, Brook Russelland, Patrick McKann, Lacey Jeroue, Abigail Hoffman, Michael Kleinsasser, and Ried Olson
Buckland, S.T., Anderson, D.R., Burnham, K.P. and Laake, J.L. 1993. Distance Sampling: Estimating Abundance of Biological Populations. Chapman and Hall, London.
Useful links:
Report bugs at https://github.com/tmcd82070/Rdistance/issues
Estimate abundance (or density) from an estimated detection function and supplemental information on observed group sizes, transect lengths, area surveyed, etc. Computes confidence intervals on abundance (or density) using a the bias corrected bootstrap method.
abundEstim( object, area = NULL, propUnitSurveyed = 1, ci = 0.95, R = 500, plot.bs = FALSE, showProgress = TRUE )
abundEstim( object, area = NULL, propUnitSurveyed = 1, ci = 0.95, R = 500, plot.bs = FALSE, showProgress = TRUE )
object |
An Rdistance model frame or fitted distance function,
normally produced by a call to |
area |
A scalar containing the total area of inference. Usually, this is
study area size. If |
propUnitSurveyed |
A scalar or vector of real numbers between 0 and 1.
The proportion of the default sampling unit that
was surveyed. If both sides of line transects were observed,
|
ci |
A scalar indicating the confidence level of confidence intervals.
Confidence intervals are computed using a bias corrected bootstrap
method. If |
R |
The number of bootstrap iterations to conduct when |
plot.bs |
A logical scalar indicating whether to plot individual bootstrap iterations. |
showProgress |
A logical indicating whether to show a text-based
progress bar during bootstrapping. Default is |
The abundance estimate for line-transect surveys (if no covariates are included in the detection function and both sides of the transect are observed) is
where n is total number of sighted individuals
(i.e., sum(groupSizes(dfunc))
), L is the total length of
surveyed transect (i.e., sum(effort(dfunc))
),
and ESW is effective strip width
computed from the estimated distance function (i.e., ESW(dfunc)
).
If only one side of transects were observed, the "2" in the denominator
is not present (or, replaced with a "1").
The abundance estimate for point transect surveys (if no covariates are included) is
where n is total number of sighted individuals (i.e., sum(groupSizes(dfunc))
),
P is the total number of surveyed points (i.e., sum(effort(dfunc))
),
and ESR is effective search radius
computed from the estimated distance function (i.e., ESR(dfunc)
).
Setting plot.bs=FALSE
and showProgress=FALSE
suppresses all intermediate output.
Estimation of site-specific density (e.g., on every transect) is accomplished by
predict(x, type = "density")
, which returns a
tibble containing density and abundance on the area surveyed by every
transect.
An Rdistance 'abundance estimate' object, which is a list of
class c("abund", "dfunc")
, containing all the components of a "dfunc"
object (see dfuncEstim
), plus the following:
estimates |
A tibble containing fitted coefficients in the distance function, density in the area(s) surveyed, abundance on the study area, the number of groups seen between w.lo and w.hi, the number of individuals seen between w.lo and w.hi, study area size, surveyed area, average group size, and average effective detection distance. |
B |
If confidence intervals were requested, a tibble
containing all bootstrap values of coefficients,
density, abundance, groups seen, individuals seen,
study area size, surveyed area size, average group size,
and average effective detection distance. The number of rows is always
|
ci |
Confidence level of the confidence intervals |
Rdistance's nested data frames (produced by RdistDf
)
contain all information required to estimate bootstrap CIs.
To compute bootstrap CIs, Rdistance resamples, with replacement,
the rows of the $data
component contained in Rdistance
fitted models. Rdistance assumes each row of $data
contains one information on on transect.
The $data
component also contains
information on which observations go into the
detection functions, which should be counted as detected targets,
and which count toward transect length.
After resampling rows of $data
, Rdistance
refits the distance function using non-missing distances,
recomputes the detected number of targets using non-missing
group sizes on transects with non-missing length,
and re-computes total transect length from transects
with non-missing lengths.
By default, R
= 500 bootstrap iterations are
performed, after which bias
corrected confidence intervals are computed (Manly, 1997, section 3.4).
The distance function is not re-selected during bootstrap resampling. The model of the input object is re-fitted every iteration.
During bootstrap iterations, the distance function can fail.
An iteration can fail for a two reasons:
(1) no detections on the iteration, and (2) a bad configuration
of distances that push the distance function's parameters to their
limits. When an iteration fails, Rdistance
skips the iteration and effectively ignores the
failed iterations.
If the proportion of failed iterations is small
(less than 20
is probably valid and no warning is issued. If the proportion of
non-convergent iterations
is not small (exceeds 20
The warning can be modified
by re-setting the Rdistance_maxBSFailPropForWarning
option.
Setting options(Rdistance_masBSFailPropForWarning = 1.0)
will turn
off the warning.
Setting options(Rdistance_masBSFailPropForWarning = 0.0)
will
warn if any iteration failed. Results (density and effective
sampling distance)
from all successful iterations are contained in the
non-NA rows of data frame 'B' in the output object.
Transect lengths can be missing in the RdistDf object. Missing length transects are equivalent to 0 [m] transects and do not count toward total surveyed units nor to group sizes on these transects count toward total detected individuals. Use NA-length transects to include their associated distances when estimating the distance function, but not when estimating abundance. For example, this allows estimation of abundance on one study area using off-transect distances from another. This allows sightability to be estimated using two or more similar targets (e.g., two similar species), but abundance to be estimated separate for each target type. Include NA-length transects by including the "extra" distance observations in the detection data frame, with valid site IDs, but set the length of those site IDs to NA in the site data frame.
Point transects do not have a physical measurement for length. The "length" of point transects is the number of points on the transect. Point transects can contain only one point. Rdistance treats transects of points as independent and bootstrap resamples them to estimate variance. The number of points on each point transect must exist in the RdistDf and cannot have physical measurement units (it is a count, not a distance).
Manly, B.F.J. (1997) Randomization, bootstrap, and Monte-Carlo methods in biology, London: Chapman and Hall.
Buckland, S.T., D.R. Anderson, K.P. Burnham, J.L. Laake, D.L. Borchers, and L. Thomas. (2001) Introduction to distance sampling: estimating abundance of biological populations. Oxford University Press, Oxford, UK.
dfuncEstim
, autoDistSamp
,
predict.dfunc
with 'type = "density"'.
# Load example sparrow data (line transect survey type) # sparrowDf <- RdistDf(sparrowSiteData, sparrowDetectionData) data(sparrowDf) # Fit half-normal detection function dfunc <- sparrowDf |> dfuncEstim(formula=dist ~ groupsize(groupsize) , likelihood="halfnorm" , w.hi=units::set_units(150, "m") ) # Estimate abundance - Convenient for programming abundDf <- estimateN(dfunc , area = units::set_units(4105, "km^2") ) # Same - Nicer output # Set ci=0.95 (or another value) to estimate bootstrap CI's on ESW, density, and abundance. fit <- abundEstim(dfunc , area = units::set_units(4105, "km^2") , ci = NULL )
# Load example sparrow data (line transect survey type) # sparrowDf <- RdistDf(sparrowSiteData, sparrowDetectionData) data(sparrowDf) # Fit half-normal detection function dfunc <- sparrowDf |> dfuncEstim(formula=dist ~ groupsize(groupsize) , likelihood="halfnorm" , w.hi=units::set_units(150, "m") ) # Estimate abundance - Convenient for programming abundDf <- estimateN(dfunc , area = units::set_units(4105, "km^2") ) # Same - Nicer output # Set ci=0.95 (or another value) to estimate bootstrap CI's on ESW, density, and abundance. fit <- abundEstim(dfunc , area = units::set_units(4105, "km^2") , ci = NULL )
Computes AICc, AIC, or BIC for estimated distance functions.
## S3 method for class 'dfunc' AIC(object, ..., criterion = "AICc")
## S3 method for class 'dfunc' AIC(object, ..., criterion = "AICc")
object |
An Rdistance model frame or fitted distance function,
normally produced by a call to |
... |
Included for compatibility with generic |
criterion |
String specifying the criterion to compute. Either "AICc", "AIC", or "BIC". |
Regular Akaike's information criterion
(https://en.wikipedia.org/wiki/Akaike_information_criterion) () is
where is the maximized value of the log likelihood
(the minimized value of the negative log
likelihood) and
is the
number of coefficients estimated in the detection function. For
dfunc
objects, =
obj$loglik + 2*length(coef(obj))
.
A correction
for small sample size, , is
where is sample
size or number of detected groups for distance analyses. By default, this function
computes
.
converges quickly to
as
increases.
The Bayesian Information Criterion (BIC) is
.
A scalar, the requested fit statistic for object
.
Burnham, K. P., and D. R. Anderson, 2002. Model selection and multi-model inference: A practical information-theoretic approach, Second ed. Springer-Verlag. ISBN 0-387-95364-7.
McQuarrie, A. D. R., and Tsai, C.-L., 1998. Regression and time series model selection. World Scientific. ISBN 981023242X
data(sparrowDf) dfunc <- sparrowDf |> dfuncEstim(dist~1) # Fit statistics AIC(dfunc) # AICc AIC(dfunc, criterion="AIC") # AIC AIC(dfunc, criterion="BIC") # BIC
data(sparrowDf) dfunc <- sparrowDf |> dfuncEstim(dist~1) # Fit statistics AIC(dfunc) # AICc AIC(dfunc, criterion="AIC") # AIC AIC(dfunc, criterion="BIC") # BIC
Perform automated likelihood, expansion, and series selection for a classic distance sampling analysis. Estimate abundance using the best fitting likelihood, expansion, and series.
autoDistSamp( data, formula, likelihoods = c("halfnorm", "hazrate", "negexp"), w.lo = units::set_units(0, "m"), w.hi = NULL, expansions = 0:3, series = c("cosine"), x.scl = w.lo, g.x.scl = 1, warn = TRUE, outputUnits = NULL, area = NULL, propUnitSurveyed = 1, ci = 0.95, R = 500, plot.bs = FALSE, showProgress = TRUE, plot = TRUE, criterion = "AICc" )
autoDistSamp( data, formula, likelihoods = c("halfnorm", "hazrate", "negexp"), w.lo = units::set_units(0, "m"), w.hi = NULL, expansions = 0:3, series = c("cosine"), x.scl = w.lo, g.x.scl = 1, warn = TRUE, outputUnits = NULL, area = NULL, propUnitSurveyed = 1, ci = 0.95, R = 500, plot.bs = FALSE, showProgress = TRUE, plot = TRUE, criterion = "AICc" )
data |
An |
formula |
A standard formula object. For example, |
likelihoods |
String vector specifying the likelihoods to fit.
See 'likelihood' parameter of |
w.lo |
Lower or left-truncation limit of the distances in distance data.
This is the minimum possible off-transect distance. Default is 0. If
|
w.hi |
Upper or right-truncation limit of the distances
in |
expansions |
A scalar specifying the number of terms
in |
series |
If |
x.scl |
The x coordinate (a distance) at which the
detection function will be scaled. |
g.x.scl |
Height of the distance function at coordinate |
warn |
A logical scalar specifying whether to issue
an R warning if the estimation did not converge or if one
or more parameter estimates are at their boundaries.
For estimation, |
outputUnits |
A string specifying the symbolic measurement
units for results. Valid units are listed in |
area |
A scalar containing the total area of inference. Usually, this is
study area size. If |
propUnitSurveyed |
A scalar or vector of real numbers between 0 and 1.
The proportion of the default sampling unit that
was surveyed. If both sides of line transects were observed,
|
ci |
A scalar indicating the confidence level of confidence intervals.
Confidence intervals are computed using a bias corrected bootstrap
method. If |
R |
The number of bootstrap iterations to conduct when |
plot.bs |
A logical scalar indicating whether to plot individual bootstrap iterations. |
showProgress |
A logical indicating whether to show a text-based
progress bar during bootstrapping. Default is |
plot |
Logical scalar specifying whether to plot models during model selection.
If |
criterion |
A string specifying the criterion to use when assessing model fit.
The best fitting model, as defined by this routine, has the lowest value
of this criterion. This must be one of "AICc" (the default),
"AIC", or "BIC". See |
During distance function selection, all combinations of likelihoods, series, and
number of expansions is fitted. For example, if likelihoods
has 3 elements,
series
has 2 elements, and expansions
has 4 elements,
this routine fits a total of 3 (likelihoods) * 2 (series) * 4 (expansions)
= 24 models. Default parameters fit 9 detection functions, i.e.,
all combinations of "halfnorm", "hazrate", and "negexp" likelihoods
and 0 through 3 expansions. Other combinations are specified through
values of likelihoods
, series
, and expansions
.
Suppress all intermediate output using plot.bs=FALSE
,
showProgress=FALSE
, and plot=FALSE
.
The returned abundance estimate object contains
an additional component, the fitting table (a list of models fitted and
criterion values) in component $fitTable
.
An Rdistance 'abundance estimate' object, which is a list of
class c("abund", "dfunc")
, containing all the components of a "dfunc"
object (see dfuncEstim
), plus the following:
estimates |
A tibble containing fitted coefficients in the distance function, density in the area(s) surveyed, abundance on the study area, the number of groups seen between w.lo and w.hi, the number of individuals seen between w.lo and w.hi, study area size, surveyed area, average group size, and average effective detection distance. |
B |
If confidence intervals were requested, a tibble
containing all bootstrap values of coefficients,
density, abundance, groups seen, individuals seen,
study area size, surveyed area size, average group size,
and average effective detection distance. The number of rows is always
|
ci |
Confidence level of the confidence intervals |
# Load example sparrow data (line transect survey type) data(sparrowDf) autoDistSamp(data = sparrowDf , formula = dist ~ groupsize(groupsize) , likelihoods = c("halfnorm","negexp") , expansions = 0 , plot = FALSE , ci = NULL , area = units::set_units(1, "hectare") ) ## Not run: autoDistSamp(data = sparrowDf , formula = dist ~ 1 + groupsize(groupsize) , ci = 0.95 , area = units::set_units(1, "hectare") ) ## End(Not run)
# Load example sparrow data (line transect survey type) data(sparrowDf) autoDistSamp(data = sparrowDf , formula = dist ~ groupsize(groupsize) , likelihoods = c("halfnorm","negexp") , expansions = 0 , plot = FALSE , ci = NULL , area = units::set_units(1, "hectare") ) ## Not run: autoDistSamp(data = sparrowDf , formula = dist ~ 1 + groupsize(groupsize) , ci = 0.95 , area = units::set_units(1, "hectare") ) ## End(Not run)
Calculate bias-corrected confidence intervals for bootstrap data using methods in Manly textbook.
bcCI(x.bs, x, ci = 0.95)
bcCI(x.bs, x, ci = 0.95)
x.bs |
A vector of bootstrap estimates of some quantity. |
x |
A scalar of the original estimate of the quantity. |
ci |
A scalar of the desired confidence interval coverage. |
A named vector containing the lower and upper endpoints of the bias-corrected bootstrap confidence interval.
Check that number of integration intervals is odd and sufficiently large.
checkNEvalPts(nEvalPts)
checkNEvalPts(nEvalPts)
nEvalPts |
An integer to check. |
The first element of nEvalPts is returned if it is acceptable. If nEvalPts is not acceptable, an error is thrown.
Check for the presence of physical measurement units on key
columns of
an RdistDf
data frame.
checkUnits(ml)
checkUnits(ml)
ml |
An |
The input ml
list, with units of various
quantities converted to common units. If a check fails, for example,
a quantity does not have units, an error is thrown.
Extract distance model coefficients from an estimated detection function object.
## S3 method for class 'dfunc' coef(object, ...)
## S3 method for class 'dfunc' coef(object, ...)
object |
An Rdistance model frame or fitted distance function,
normally produced by a call to |
... |
Ignored |
The estimated coefficient vector for the detection function. Length and interpretation of values vary depending on the form of the detection function and expansion terms.
data(sparrowDfuncObserver) # pre-estimated dfunc # Same as sparrowDfuncObserver$par coef(sparrowDfuncObserver) ## Not run: data(sparrowDf) dfunc <- sparrowDf |> dfuncEstim(dist~bare + observer, w.hi=units::set_units(150, "m")) coef(dfunc) ## End(Not run)
data(sparrowDfuncObserver) # pre-estimated dfunc # Same as sparrowDfuncObserver$par coef(sparrowDfuncObserver) ## Not run: data(sparrowDf) dfunc <- sparrowDf |> dfuncEstim(dist~bare + observer, w.hi=units::set_units(150, "m")) coef(dfunc) ## End(Not run)
Add ANSI color to a string using the
crayon
package, if the R environment accepts color.
This function is needed because of the need to determine whether
output can be colorized. This determination is left up to
crayon::has_color()
.
In addition, for Rdistance results, we want to only colorize
numbers, not the reporting units. Everything between the last set
of square brackets ([...]
) is NOT colorized.
colorize(STR, col = NULL, bg = NULL)
colorize(STR, col = NULL, bg = NULL)
STR |
The string to colorize. |
col |
A string specifying the desired foreground color.
This is passed straight to |
bg |
A string specifying the desired background color. Must be one of "bgBlack", "bgRed", "bgGreen", "bgYellow", "bgBlue" "bgMagenta", "bgCyan", or "bgWhite". By default, no background is applied. |
If color is not allowed in the terminal, the input string is returned unperturbed. If color is allowed, the input string is returned with color and background ANSI code surrounding the initial part of the string from character 1 to the character before the [ in the last pair of [].
crayon::style
Computes the cosine expansion terms used to modify the shape of distance likelihood functions.
cosine.expansion(x, expansions)
cosine.expansion(x, expansions)
x |
A numeric vector of distances at which to evaluate
the expansion series. For distance analysis, |
expansions |
A scalar specifying the number of expansion terms to compute. Must be one of the integers 1, 2, 3, 4, or 5. |
There are, in general, several expansions that can be called cosine. The cosine expansion used here is:
First term:
Second term:
Third term:
Fourth term:
Fifth term:
The maximum number of expansion terms computed is 5.
A matrix of size length(x)
X expansions
. The columns of this matrix are the cosine expansions of
x
. Column 1 is the first expansion term of x
, column 2 is the second expansion term of x
, and so on
up to expansions
.
dfuncEstim
, hermite.expansion
, simple.expansion
, and the discussion
of user defined likelihoods in dfuncEstim
.
x <- seq(0, 1, length = 200) cos.expn <- cosine.expansion(x, 5) plot(range(x), range(cos.expn), type="n") matlines(x, cos.expn, col=rainbow(5), lty = 1)
x <- seq(0, 1, length = 200) cos.expn <- cosine.expansion(x, 5) plot(range(x), range(cos.expn), type="n") matlines(x, cos.expn, col=rainbow(5), lty = 1)
Fits a detection function to off-transect distances collected by multiple observers.
dE.multi( data, formula, likelihood = "halfnorm", w.lo = units::set_units(0, "m"), w.hi = NULL, expansions = 0, series = "cosine", x.scl = units::set_units(0, "m"), g.x.scl = 1, warn = TRUE, outputUnits = NULL )
dE.multi( data, formula, likelihood = "halfnorm", w.lo = units::set_units(0, "m"), w.hi = NULL, expansions = 0, series = "cosine", x.scl = units::set_units(0, "m"), g.x.scl = 1, warn = TRUE, outputUnits = NULL )
data |
An |
formula |
A standard formula object. For example, |
likelihood |
String specifying the likelihood to fit. Built-in likelihoods at present are "halfnorm", "hazrate", and "negexp". |
w.lo |
Lower or left-truncation limit of the distances in distance data.
This is the minimum possible off-transect distance. Default is 0. If
|
w.hi |
Upper or right-truncation limit of the distances
in |
expansions |
A scalar specifying the number of terms
in |
series |
If |
x.scl |
The x coordinate (a distance) at which the
detection function will be scaled. |
g.x.scl |
Height of the distance function at coordinate |
warn |
A logical scalar specifying whether to issue
an R warning if the estimation did not converge or if one
or more parameter estimates are at their boundaries.
For estimation, |
outputUnits |
A string specifying the symbolic measurement
units for results. Valid units are listed in |
An object of class 'dfunc'. Objects of class 'dfunc' are lists containing the following components:
par |
The vector of estimated parameter values. Length of this vector for built-in likelihoods is one (for the function's parameter) plus the number of expansion terms plus one if the likelihood is 'hazrate' (which has two parameters). |
varcovar |
The variance-covariance matrix for coefficients
of the distance function, estimated by the inverse of the fit's Hessian
evaluated at the estimates. Rdistance estimates the
Hessian as the second derivative of the log likelihood surface
at the final estimates, where second derivatives are estimated by
numeric differentiation (see |
loglik |
The maximized value of the log likelihood. |
convergence |
The convergence code. This code
is returned by |
likelihood |
The name of the likelihood. This is
the value of the argument |
w.lo |
Left-truncation value used during the fit. |
w.hi |
Right-truncation value used during the fit. |
mf |
A modelframe of detections within the strip
or circle used in the fit. Column 'dist' contains the
observed distances.
Column 'offset(...)' contains group sizes associated with
the values of 'dist'. Group
sizes are only used in |
model.frame |
A |
siteID.cols |
A vector containing the transect ID column names in |
expansions |
The number of expansion terms used during estimation. |
series |
The type of expansion used during estimation. |
call |
The original call of this function. |
call.x.scl |
The input or user requested distance at which the distance function is scaled. |
call.g.x.scl |
The |
call.observer |
The value of input parameter |
fit |
The fitted object returned by |
factor.names |
The names of any factors in |
pointSurvey |
The input value of |
formula |
The formula specified for the detection function. |
control |
A list containing values of the 'control' parameters
set by |
outputUnits |
The measurement units used for output. All distance measurements are converted to these units internally. |
x.scl |
The actual distance at which
the distance function is scaled to some value.
i.e., this is the actual x at
which g(x) = |
g.x.scl |
The actual height of the distance function
at a distance of |
Fits a detection function to off-transect distances collected by a single observer.
dE.single( data, formula, likelihood = "halfnorm", w.lo = units::set_units(0, "m"), w.hi = NULL, expansions = 0, series = "cosine", x.scl = w.lo, g.x.scl = 1, warn = TRUE, outputUnits = NULL )
dE.single( data, formula, likelihood = "halfnorm", w.lo = units::set_units(0, "m"), w.hi = NULL, expansions = 0, series = "cosine", x.scl = w.lo, g.x.scl = 1, warn = TRUE, outputUnits = NULL )
data |
An |
formula |
A standard formula object. For example, |
likelihood |
String specifying the likelihood to fit. Built-in likelihoods at present are "halfnorm", "hazrate", and "negexp". |
w.lo |
Lower or left-truncation limit of the distances in distance data.
This is the minimum possible off-transect distance. Default is 0. If
|
w.hi |
Upper or right-truncation limit of the distances
in |
expansions |
A scalar specifying the number of terms
in |
series |
If |
x.scl |
The x coordinate (a distance) at which the
detection function will be scaled. |
g.x.scl |
Height of the distance function at coordinate |
warn |
A logical scalar specifying whether to issue
an R warning if the estimation did not converge or if one
or more parameter estimates are at their boundaries.
For estimation, |
outputUnits |
A string specifying the symbolic measurement
units for results. Valid units are listed in |
Optimization and estimation controls can be modified using options()
.
See RdistanceControls
.
An object of class 'dfunc'. Objects of class 'dfunc' are lists containing the following components:
par |
The vector of estimated parameter values. Length of this vector for built-in likelihoods is one (for the function's parameter) plus the number of expansion terms plus one if the likelihood is 'hazrate' (which has two parameters). |
varcovar |
The variance-covariance matrix for coefficients
of the distance function, estimated by the inverse of the fit's Hessian
evaluated at the estimates. Rdistance estimates the
Hessian as the second derivative of the log likelihood surface
at the final estimates, where second derivatives are estimated by
numeric differentiation (see |
loglik |
The maximized value of the log likelihood. |
convergence |
The convergence code. This code
is returned by |
likelihood |
The name of the likelihood. This is
the value of the argument |
w.lo |
Left-truncation value used during the fit. |
w.hi |
Right-truncation value used during the fit. |
mf |
A modelframe of detections within the strip
or circle used in the fit. Column 'dist' contains the
observed distances.
Column 'offset(...)' contains group sizes associated with
the values of 'dist'. Group
sizes are only used in |
model.frame |
A |
siteID.cols |
A vector containing the transect ID column names in |
expansions |
The number of expansion terms used during estimation. |
series |
The type of expansion used during estimation. |
call |
The original call of this function. |
call.x.scl |
The input or user requested distance at which the distance function is scaled. |
call.g.x.scl |
The |
call.observer |
The value of input parameter |
fit |
The fitted object returned by |
factor.names |
The names of any factors in |
pointSurvey |
The input value of |
formula |
The formula specified for the detection function. |
control |
A list containing values of the 'control' parameters
set by |
outputUnits |
The measurement units used for output. All distance measurements are converted to these units internally. |
x.scl |
The actual distance at which
the distance function is scaled to some value.
i.e., this is the actual x at
which g(x) = |
g.x.scl |
The actual height of the distance function
at a distance of |
To specify non-unity group sizes, use groupsize()
on the RHS of formula
. When group sizes are not all 1, they must appear in a column
of the 'detections' list-column of data
.
For example, d ~ habitat + groupsize(number)
specifies
distances in column d
, one covariate
named habitat
, and that column number
contains the number of individuals
associated with each detection. If group sizes are not specified,
all group sizes are assumed to be 1.
Factor contrasts in Rdistance
are specified
the same way as in lm
or glm
.
By default, Rdistance
uses
contrasts in getOption("contrasts")
. To change contrasts, use a statement
like options(contrasts = c(unordered = "contr.SAS",
ordered = "contr.poly"))
. Or, to set contrasts for a
specific factor in the input data frame, use
contrasts(df$A) <- "contr.sum"
or similar.
See contrasts
or the contrasts.arg
of model.matrix
.
Rdistance
accommodates two kinds of transects: continuous and point.
Detections can occur at any point on continuous transects.
Rdistance
calls these 'line-transects' even though routes are not
necessarily a straight line.
On point transects, detections occur at a series of stops
(points). Rdisance
calls these point-transects. Transects are the basic
sampling unit in both cases. Rdistance
assumes each row of data
contains information from one transect. See RdistDf
for
more details.
As of Rdistance
version 3.0.0, measurement units are
require on all physical distances.
Requiring units ensures that internal calculations and results
(e.g., ESW and abundance) are correct
and that output units are clear.
Physical distances are required on
off-transect distances, radial distances, truncation distances
(w.lo
, unless it is zero; and w.hi
, unless it is NULL),
scale locations (x.scl
, unless it is zero),
line-transect lengths, and study area size. All units are
1-dimensional except those on study area, which are 2-dimensional.
Physical measurement units can vary. For example,
off-transect distances can be meters ("m"), w.hi
can be inches ("in"),
and w.lo
can be kilometers ("km"). Internally, all distances are
converted to the units specified by outputUnits
(or the units of input distances if
outputUnits
is NULL), and
all output is reported
in units of outputUnits
. Valid conversions must exist between
units or an error is thrown. For example, meters cannot be converted
into hectares.
Measurement units can be assigned using
units()<-
after attaching the units
package or with x <- units::set_units(x, "<units>")
.
See units::valid_udunits()
for a list of valid symbolic units.
If measurements are truly unit-less, or measurement units are unknown,
set options(Rdist_requireUnits = FALSE)
. This suppresses
all unit checks and conversions. Users are on their own
to make sure inputs are scaled correctly and that output units are known.
Buckland, S.T., D.R. Anderson, K.P. Burnham, J.L. Laake, D.L. Borchers, and L. Thomas. (2001) Introduction to distance sampling: estimating abundance of biological populations. Oxford University Press, Oxford, UK.
abundEstim
, autoDistSamp
.
Likelihood-specific help files (e.g., halfnorm.like
).
# Load example sparrow data (line transect survey type) data(sparrowDf) dfunc <- dfuncEstim(data = sparrowDf , formula = dist ~ 1) dfunc plot(dfunc)
# Load example sparrow data (line transect survey type) data(sparrowDf) dfunc <- dfuncEstim(data = sparrowDf , formula = dist ~ 1) dfunc plot(dfunc)
Fits a detection function using maximum likelihood.
dfuncEstim(data, ...)
dfuncEstim(data, ...)
data |
An |
... |
Arguments passed on to
|
Optimization and estimation controls can be modified using options()
.
See RdistanceControls
.
An object of class 'dfunc'. Objects of class 'dfunc' are lists containing the following components:
par |
The vector of estimated parameter values. Length of this vector for built-in likelihoods is one (for the function's parameter) plus the number of expansion terms plus one if the likelihood is 'hazrate' (which has two parameters). |
varcovar |
The variance-covariance matrix for coefficients
of the distance function, estimated by the inverse of the fit's Hessian
evaluated at the estimates. Rdistance estimates the
Hessian as the second derivative of the log likelihood surface
at the final estimates, where second derivatives are estimated by
numeric differentiation (see |
loglik |
The maximized value of the log likelihood. |
convergence |
The convergence code. This code
is returned by |
likelihood |
The name of the likelihood. This is
the value of the argument |
w.lo |
Left-truncation value used during the fit. |
w.hi |
Right-truncation value used during the fit. |
mf |
A modelframe of detections within the strip
or circle used in the fit. Column 'dist' contains the
observed distances.
Column 'offset(...)' contains group sizes associated with
the values of 'dist'. Group
sizes are only used in |
model.frame |
A |
siteID.cols |
A vector containing the transect ID column names in |
expansions |
The number of expansion terms used during estimation. |
series |
The type of expansion used during estimation. |
call |
The original call of this function. |
call.x.scl |
The input or user requested distance at which the distance function is scaled. |
call.g.x.scl |
The |
call.observer |
The value of input parameter |
fit |
The fitted object returned by |
factor.names |
The names of any factors in |
pointSurvey |
The input value of |
formula |
The formula specified for the detection function. |
control |
A list containing values of the 'control' parameters
set by |
outputUnits |
The measurement units used for output. All distance measurements are converted to these units internally. |
x.scl |
The actual distance at which
the distance function is scaled to some value.
i.e., this is the actual x at
which g(x) = |
g.x.scl |
The actual height of the distance function
at a distance of |
To specify non-unity group sizes, use groupsize()
on the RHS of formula
. When group sizes are not all 1, they must appear in a column
of the 'detections' list-column of data
.
For example, d ~ habitat + groupsize(number)
specifies
distances in column d
, one covariate
named habitat
, and that column number
contains the number of individuals
associated with each detection. If group sizes are not specified,
all group sizes are assumed to be 1.
Factor contrasts in Rdistance
are specified
the same way as in lm
or glm
.
By default, Rdistance
uses
contrasts in getOption("contrasts")
. To change contrasts, use a statement
like options(contrasts = c(unordered = "contr.SAS",
ordered = "contr.poly"))
. Or, to set contrasts for a
specific factor in the input data frame, use
contrasts(df$A) <- "contr.sum"
or similar.
See contrasts
or the contrasts.arg
of model.matrix
.
As of Rdistance
version 3.0.0, measurement units are
require on all physical distances.
Requiring units ensures that internal calculations and results
(e.g., ESW and abundance) are correct
and that output units are clear.
Physical distances are required on
off-transect distances, radial distances, truncation distances
(w.lo
, unless it is zero; and w.hi
, unless it is NULL),
scale locations (x.scl
, unless it is zero),
line-transect lengths, and study area size. All units are
1-dimensional except those on study area, which are 2-dimensional.
Physical measurement units can vary. For example,
off-transect distances can be meters ("m"), w.hi
can be inches ("in"),
and w.lo
can be kilometers ("km"). Internally, all distances are
converted to the units specified by outputUnits
(or the units of input distances if
outputUnits
is NULL), and
all output is reported
in units of outputUnits
. Valid conversions must exist between
units or an error is thrown. For example, meters cannot be converted
into hectares.
Measurement units can be assigned using
units()<-
after attaching the units
package or with x <- units::set_units(x, "<units>")
.
See units::valid_udunits()
for a list of valid symbolic units.
If measurements are truly unit-less, or measurement units are unknown,
set options(Rdist_requireUnits = FALSE)
. This suppresses
all unit checks and conversions. Users are on their own
to make sure inputs are scaled correctly and that output units are known.
Buckland, S.T., D.R. Anderson, K.P. Burnham, J.L. Laake, D.L. Borchers, and L. Thomas. (2001) Introduction to distance sampling: estimating abundance of biological populations. Oxford University Press, Oxford, UK.
abundEstim
, autoDistSamp
.
Likelihood-specific help files (e.g., halfnorm.like
).
# Sparrow line transect example data(sparrowDetectionData) data(sparrowSiteData) sparrowDf <- RdistDf(sparrowSiteData, sparrowDetectionData) dfunc <- dfuncEstim(sparrowDf, formula = dist ~ 1 ) summary(dfunc) data(sparrowDfuncObserver) # pre-estimated object ## Not run: # Command to produce 'sparrowDfuncObserver' sparrowDfuncObserver <- sparrowDf |> dfuncEstim( formula = dist ~ observer ) ## End(Not run) sparrowDfuncObserver summary(sparrowDfuncObserver) plot(sparrowDfuncObserver)
# Sparrow line transect example data(sparrowDetectionData) data(sparrowSiteData) sparrowDf <- RdistDf(sparrowSiteData, sparrowDetectionData) dfunc <- dfuncEstim(sparrowDf, formula = dist ~ 1 ) summary(dfunc) data(sparrowDfuncObserver) # pre-estimated object ## Not run: # Command to produce 'sparrowDfuncObserver' sparrowDfuncObserver <- sparrowDf |> dfuncEstim( formula = dist ~ observer ) ## End(Not run) sparrowDfuncObserver summary(sparrowDfuncObserver) plot(sparrowDfuncObserver)
Utility function to produce error messages suitable for stop
dfuncEstimErrMessage(txt, attri)
dfuncEstimErrMessage(txt, attri)
txt |
A text string describing the error. |
attri |
An attribute to report. |
A string
Extract the observation distances (i.e., responses for an Rdistance model) from an Rdistance model frame.
distances(ml, na.rm = TRUE, ...)
distances(ml, na.rm = TRUE, ...)
ml |
Either a Rdistance 'model frame' or an Rdistance
'fitted object'. Both are of class "dfunc".
Rdistance 'model frames' are lists containing components
necessary to estimate a distance function, but no estimates.
Rdistance 'model frames' are typically
produced by calls to |
na.rm |
Whether to include or exclude missing distance values.
In |
... |
Ignored |
A vector containing observation distances contained in the Rdistance model frame.
data(sparrowDf) sparrowModel <- parseModel( sparrowDf, dist ~ observer ) stats::model.response(sparrowModel$mf) distances(sparrowModel) # same, but future-proof
data(sparrowDf) sparrowModel <- parseModel( sparrowDf, dist ~ observer ) stats::model.response(sparrowModel$mf) distances(sparrowModel) # same, but future-proof
Computes Effective Detection Radius (EDR) for estimated
detection functions on point transects.
See ESW
is for line transects.
EDR(object, newdata = NULL)
EDR(object, newdata = NULL)
object |
An Rdistance model frame or fitted distance function,
normally produced by a call to |
newdata |
A data frame containing new values of
the covariates at which to evaluate the distance functions.
If |
Effective Detection Radius is the integral under the detection function times distance.
If newdata
is present, the returned value is
a vector of effective sampling distances for values of the
covariates in newdata
with length equal to
the number of rows in newdata
.
If newdata
is NULL, the returned value is a vector of effective
sampling distances associated with covariate values in object
and has
the same number of detected groups. The returned vector
has measurement units, i.e., object$outputUnits
.
Rdistance uses Simpson's composite 1/3 rule to numerically
integrate under distance functions. The number of points evaluated
during numerical integration is controlled by
options(Rdistance_intEvalPts)
(default 101).
Option 'Rdistance_intEvalPts' must be odd because Simpson's rule
requires an even number of intervals (hence, odd number of points).
Lower values of 'Rdistance_intEvalPts' increase calculation speeds;
but, decrease accuracy.
'Rdistance_intEvalPts' must be >= 5. A warning is thrown if
'Rdistance_intEvalPts' < 29. Empirical tests by the author
suggest 'Rdistance_intEvalPts' values >= 30 are accurate
to several decimal points and that all 'Rdistance_intEvalPts' >= 101 produce
identical results in all but pathological cases.
dfuncEstim
, ESW
,
effectiveDistance
# Load example thrasher data (point transect survey type) data(thrasherDf) # Fit half-normal detection function dfunc <- thrasherDf |> dfuncEstim(formula=dist~bare) # Compute effective detection radius (EDR) EDR(dfunc) # vector length 192 effectiveDistance(dfunc) # same EDR(dfunc, newdata = data.frame(bare=30)) # vector length 1
# Load example thrasher data (point transect survey type) data(thrasherDf) # Fit half-normal detection function dfunc <- thrasherDf |> dfuncEstim(formula=dist~bare) # Compute effective detection radius (EDR) EDR(dfunc) # vector length 192 effectiveDistance(dfunc) # same EDR(dfunc, newdata = data.frame(bare=30)) # vector length 1
Computes Effective Strip Width (ESW) for line-transect detection functions, or the analogous Effective Detection Radius (EDR) for point-transect detection functions.
effectiveDistance(object, newdata = NULL)
effectiveDistance(object, newdata = NULL)
object |
An Rdistance model frame or fitted distance function,
normally produced by a call to |
newdata |
A data frame containing new values for
covariates at which either
ESW's or EDR's will be computed. If NULL and
|
Serves as a wrapper for
ESW
and EDR
.
If newdata
is present, the returned value is
a vector of effective sampling distances for values of the
covariates in newdata
with length equal to
the number of rows in newdata
.
If newdata
is NULL, the returned value is a vector of effective
sampling distances associated with covariate values in object
and has
the same number of detected groups. The returned vector
has measurement units, i.e., object$outputUnits
.
Extract effort information from an Rdistance data frame. Effort is length of line-transects or number of points on point-transects.
effort(x, ...)
effort(x, ...)
x |
Either an estimated distance function, output by
|
... |
Ignored |
A vector containing effort. If line-transects, return is length of transects, with units. If point-transects, return is number of points (integers, no units). Vector length is number of transects. If input is not an RdistDf or estimated distance function, return is NULL.
data(sparrowDf) effort(sparrowDf) fit <- dfuncEstim(sparrowDf, dist ~ 1) effort(fit)
data(sparrowDf) effort(sparrowDf) fit <- dfuncEstim(sparrowDf, dist ~ 1) effort(fit)
Constructs a string stating what is "unknown" that is suitable for use in warning and error functions.
errDataUnk(txt, attri)
errDataUnk(txt, attri)
txt |
Text. The "unknown" we are looking for. |
attri |
Attribute description we are looking for. |
A descriptive string, suitable for warning or error.
Estimate abundance from an Rdistance fitted model.
This function is called internally by abundEstim
. Most users will call
abundEstim
to estimate abundance unless they are running simulations or
bootstrapping.
estimateN(object, area = NULL, propUnitSurveyed = 1)
estimateN(object, area = NULL, propUnitSurveyed = 1)
object |
An Rdistance model frame or fitted distance function,
normally produced by a call to |
area |
A scalar containing the total area of inference. Usually, this is
study area size. If |
propUnitSurveyed |
A scalar or vector of real numbers between 0 and 1.
The proportion of the default sampling unit that
was surveyed. If both sides of line transects were observed,
|
The abundance estimate for line-transect surveys (if no covariates are included in the detection function and both sides of the transect are observed) is
where n is total number of sighted individuals
(i.e., sum(groupSizes(dfunc))
), L is the total length of
surveyed transect (i.e., sum(effort(dfunc))
),
and ESW is effective strip width
computed from the estimated distance function (i.e., ESW(dfunc)
).
If only one side of transects were observed, the "2" in the denominator
is not present (or, replaced with a "1").
The abundance estimate for point transect surveys (if no covariates are included) is
where n is total number of sighted individuals (i.e., sum(groupSizes(dfunc))
),
P is the total number of surveyed points (i.e., sum(effort(dfunc))
),
and ESR is effective search radius
computed from the estimated distance function (i.e., ESR(dfunc)
).
Setting plot.bs=FALSE
and showProgress=FALSE
suppresses all intermediate output.
Estimation of site-specific density (e.g., on every transect) is accomplished by
predict(x, type = "density")
, which returns a
tibble containing density and abundance on the area surveyed by every
transect.
A list containing the following components:
density |
Estimated density in the surveyed area. |
abundance |
Estimated abundance on the study area. Equals density if area is not specified. |
n.groups |
The number of detected groups (not individuals, unless all group sizes = 1). |
n.seen |
The number of individuals (sum of group sizes). |
area |
Total area of inference. Study area size |
surveyedUnits |
Number of surveyed sites. This is total transect length for line-transects or number of points for point-transects. This total transect length does not include transects with missing lengths. |
propUnitSurveyed |
Proportion of the standard survey unit that was observed |
avg.group.size |
Average group size on non-NA transects |
w |
Strip width. |
pDetection |
Probability of detection. |
For line-transects that do not involve covariates, object$density is object$n.seen / (2 * propUnitSurveyed * object$w * object$pDetection * object$surveyedUnits)
Returns effective strip width (ESW) for
line-transect detection functions.
See EDR
is for point transects.
ESW(object, newdata = NULL)
ESW(object, newdata = NULL)
object |
An Rdistance model frame or fitted distance function,
normally produced by a call to |
newdata |
A data frame containing new values for
covariates at which either
ESW's or EDR's will be computed. If NULL and
|
ESW is the area under
the scaled distance function between its
left-truncation limit (obj$w.lo
) and its right-truncation
limit (obj$w.hi
).
If detection does not decline with distance,
the detection function is flat (horizontal), and
area under the detection
function is .
If, in this case,
, effective sampling distance is
the half-width of the surveys,
If newdata
is present, the returned value is
a vector of effective sampling distances for values of the
covariates in newdata
with length equal to
the number of rows in newdata
.
If newdata
is NULL, the returned value is a vector of effective
sampling distances associated with covariate values in object
and has
the same number of detected groups. The returned vector
has measurement units, i.e., object$outputUnits
.
Rdistance uses Simpson's composite 1/3 rule to numerically
integrate under distance functions. The number of points evaluated
during numerical integration is controlled by
options(Rdistance_intEvalPts)
(default 101).
Option 'Rdistance_intEvalPts' must be odd because Simpson's rule
requires an even number of intervals (hence, odd number of points).
Lower values of 'Rdistance_intEvalPts' increase calculation speeds;
but, decrease accuracy.
'Rdistance_intEvalPts' must be >= 5. A warning is thrown if
'Rdistance_intEvalPts' < 29. Empirical tests by the author
suggest 'Rdistance_intEvalPts' values >= 30 are accurate
to several decimal points and that all 'Rdistance_intEvalPts' >= 101 produce
identical results in all but pathological cases.
dfuncEstim
, EDR
,
effectiveDistance
data(sparrowDf) dfunc <- sparrowDf |> dfuncEstim(formula=dist~bare) ESW(dfunc) # vector length 356 = number of groups ESW(dfunc, newdata = data.frame(bare = c(30,40))) # vector length 2
data(sparrowDf) dfunc <- sparrowDf |> dfuncEstim(formula=dist~bare) ESW(dfunc) # vector length 356 = number of groups ESW(dfunc, newdata = data.frame(bare = c(30,40))) # vector length 2
Compute "expansion" terms that modify the shape of a base distance function.
expansionTerms(a, d, series, nexp, w)
expansionTerms(a, d, series, nexp, w)
a |
A vector or matrix of (estimated) coefficients.
|
d |
A vector or 1-column matrix of
distances at which to evaluate
the expansion terms. |
series |
If |
nexp |
Number of expansion terms. Integer from 0 to 5. |
w |
Strip width, i.e., |
Expansion terms modify the "key" function of the
likelihood manipulatively. The modified distance function is,
key * expTerms
where key
is a vector of values in
the base likelihood function (e.g., halfnorm.like()$L.unscaled
or hazrate.like()$L.unscaled
) and expTerms
is the
matrix returned by this routine.
Let the number of expansions (nexp
) be (
> 0),
assume the raw cyclic expansion terms of
series
are for
the
expansion of distance
, and that
are (estimated)
coefficients for the expansion terms, then the likelihood contribution
for the
distance
is,
If nexp
equals 0, 1 is returned.
If nexp
is greater than 0, a matrix of
size X
containing expansion terms,
where
=
length(d)
and =
nrow(a)
. The
expansion series associated with row of
a
are in column of the return. i.e.,
element (
,
) of the return is
(see Details).
a1 <- c(log(40), 0.5, -.5) a2 <- c(log(40), 0.25, -.5) dists <- units::set_units(seq(0, 100, length = 100), "m") w = units::set_units(100, "m") expTerms1 <- expansionTerms(a1, dists, "cosine", 2, w) expTerms2 <- expansionTerms(a2, dists, "cosine", 2, w) plot(dists, expTerms2, ylim = c(0,2.5)) points(dists, expTerms1, pch = 16) # Same as above a <- rbind(a1, a2) expTerms <- expansionTerms(a, dists, "cosine", 2, w) matlines(dists, expTerms, lwd=2, col=c("red", "blue"), lty=1) # Showing key and expansions key <- halfnorm.like(log(40), dists, 1)$L.unscaled plot(dists, key, type = "l", col = "blue", ylim=c(0,1.5)) lines(dists, key * expTerms1, col = "red") lines(dists, key * expTerms2, col = "purple")
a1 <- c(log(40), 0.5, -.5) a2 <- c(log(40), 0.25, -.5) dists <- units::set_units(seq(0, 100, length = 100), "m") w = units::set_units(100, "m") expTerms1 <- expansionTerms(a1, dists, "cosine", 2, w) expTerms2 <- expansionTerms(a2, dists, "cosine", 2, w) plot(dists, expTerms2, ylim = c(0,2.5)) points(dists, expTerms1, pch = 16) # Same as above a <- rbind(a1, a2) expTerms <- expansionTerms(a, dists, "cosine", 2, w) matlines(dists, expTerms, lwd=2, col=c("red", "blue"), lty=1) # Showing key and expansions key <- halfnorm.like(log(40), dists, 1)$L.unscaled plot(dists, key, type = "l", col = "blue", ylim=c(0,1.5)) lines(dists, key * expTerms1, col = "red") lines(dists, key * expTerms2, col = "purple")
Extract the group size information from an Rdistance model frame.
groupSizes(ml, ...)
groupSizes(ml, ...)
ml |
Either a Rdistance 'model frame' or an Rdistance
'fitted object'. Both are of class "dfunc".
Rdistance 'model frames' are lists containing components
necessary to estimate a distance function, but no estimates.
Rdistance 'model frames' are typically
produced by calls to |
... |
Ignored |
A vector containing group sizes contained in the Rdistance model frame or fitted object.
data("sparrowDf") sparrowModel <- parseModel( sparrowDf, dist ~ observer ) stats::model.offset(sparrowModel$mf) groupSizes(sparrowModel) # same, but future-proof sparrowModel <- parseModel( sparrowDf , dist ~ observer + groupsize(groupsize) ) groupSizes(sparrowModel)
data("sparrowDf") sparrowModel <- parseModel( sparrowDf, dist ~ observer ) stats::model.offset(sparrowModel$mf) groupSizes(sparrowModel) # same, but future-proof sparrowModel <- parseModel( sparrowDf , dist ~ observer + groupsize(groupsize) ) groupSizes(sparrowModel)
Estimate distance function scaling factor , g(0) or g(x), for a specified distance function.
gxEstim(fit)
gxEstim(fit)
fit |
An estimated |
This routine scales sightability such that
g(x.scl
) = g.x.scl
, where g() is the sightability function.
Specification of x.scl
and g.x.scl
covers several estimation cases:
g(0) = 1 : (the default) Inputs are x.scl
= 0, g.x.scl
= 1.
If w.lo
> 0, x.scl
will be set to w.lo
so technically this case is g(w.low
) = 1.
User supplied probability at specified distance: Inputs are
x.scl
= a number greater than or equal
to w.lo
, g.x.scl
= a number between 0 and 1. This case
covers situations where sightability on the transect (distance 0) is
not perfect. This case assumes researchers have an independent
estimate of sightability at distance
x.scl
off the transect. For example, researchers could be
using multiple
observers to estimate that sightability at distance x.scl
is g.x.scl
.
Maximum sightability specified: Inputs
are x.scl
="max", g.x.scl
= a number
between 0 and 1. In this case,
g() is scaled such that its maximum value is g.x.scl
.
This routine computes the distance at which g() is maximum, sets
g()'s height there to g.x.scl
, and returns x.max
where
x.max is the distance at which g is maximized. This case covers the
common aerial survey situation where maximum sightability is slightly
off the transect, but the distance at which the maximum occurs
is unknown.
Double observer system: Inputs are
x.scl
="max", g.x.scl
= <a data frame>.
In this case, g(x) = h, where x is the distance that
maximizes g and h is the height of g() at x
computed from the double observer data frame (see below for
structure of the double observer data frame).
Distance of independence specified, height computed from double
observer system: Inputs are
x.scl
= a number greater than or equal to w.lo
g.x.scl
= a data frame. In this case, g(x.scl
) = h,
where h is computed from the double observer data frame
(see below for structure of the double observer data frame).
When x.scl
, g.x.scl
, or observer
are NULL, the routine
will look for and use $call.x.scl
, or $call.g.x.scl
, or
$call.observer
components of the fit
object for whichever
of these three parameters is missing. Later, different
values can be specified in a direct call to F.gx.estim
without having to re-estimate the distance function. Because of this feature,
the default values in dfuncEstim
are x.scl
= 0 and
g.x.scl
= 1 and observer
= "both".
A list comprised of the following components:
x.scl |
The value of x (distance) at which g() is evaluated. |
comp2 |
The estimated value of g() when evaluated at |
When g.x.scl
is a data frame, it is assumed to contain
the components $obsby.1
and $obsby.2
(no flexibility on names).
Each row in the data frame contains data from one sighted target.
The $obsby.1
and $obsby.2
components are
TRUE/FALSE (logical) vectors indicating whether
observer 1 (obsby.1
) or observer 2 (obsby.2
) spotted the target.
data(sparrowDf) fit <- dfuncEstim(sparrowDf, dist ~ groupsize(groupsize)) gxEstim(fit) fit <- dfuncEstim(sparrowDf, dist ~ groupsize(groupsize) , x.scl = units::set_units(50,"m") , g.x.scl = 0.75) gxEstim(fit) plot(fit) abline(h=0.75) abline(v=units::set_units(50,"m"))
data(sparrowDf) fit <- dfuncEstim(sparrowDf, dist ~ groupsize(groupsize)) gxEstim(fit) fit <- dfuncEstim(sparrowDf, dist ~ groupsize(groupsize) , x.scl = units::set_units(50,"m") , g.x.scl = 0.75) gxEstim(fit) plot(fit) abline(h=0.75) abline(v=units::set_units(50,"m"))
Evaluate the half-normal distance function, for sighting distances, potentially including covariates and expansion terms
halfnorm.like(a, dist, covars)
halfnorm.like(a, dist, covars)
a |
A vector or matrix of covariate
and expansion term
coefficients. Dimension is $k$ X $p$, where
$k$ (i.e., |
dist |
A numeric vector of length $n$ or a single-column matrix (dimension $n$X1) containing detection distances at which to evaluate the likelihood. |
covars |
A numeric vector of length $q$ or
matrix of dimension $n$X$q$ containing covariate values
associated with distances in argument |
The half-normal distance function is
where ,
is a vector of
covariate values associated with distance
(i.e., a row of
covars
), and
is a vector of the first $q$ (=
ncol(covars)
)
values in argument a
.
Some authors parameterize the halfnorm without
the "2" in the denominator of the exponent.
Rdistance
includes
"2" in this denominator to make
quantiles of the half normal agree with
the standard normal. This means that half-normal
coefficients in
Rdistance (i.e., $s = exp(x'a)$) can be
interpreted as normal standard errors. For example,
approximately 95% of distances should
occur between 0 and 2$s$.
A list containing the following two components:
L.unscaled: A matrix of size $n$X$k$X$b$
containing likelihood values evaluated at
distances in dist
.
Each row is associated with
a single distance, and each column is associated with
a single case (row of a
). This matrix is
"unscaled" because the underlying likelihood does
not integrate to one. Values in L.unscaled
are always greater than or equal to zero.
params: A $n$X$k$X$b$ array of the
likelihood's (canonical) parameters, First page contains
parameter values related to covariates (i.e., $s = exp(x'a)$),
while subsequent pages contain other parameters.
$b$ = 1 for halfnorm, negexp; $b$ = 2 for hazrate and
others.
Rows correspond to distances in dist
. Columns
correspond to rows from argument a
.
dfuncEstim
,
hazrate.like
,
negexp.like
d <- seq(0, 100, length=100) covs <- matrix(1,length(d),1) halfnorm.like(log(20), d, covs) plot(d, halfnorm.like(log(20), d, covs)$L.unscaled, type="l", col="red") lines(d, halfnorm.like(log(40), d, covs)$L.unscaled, col="blue") # Matrix inputs: d <- matrix(c(0,10,20), ncol = 1) # 3X1 covs <- matrix(c(rep(1,nrow(d)), rep(.5,nrow(d))), nrow = nrow(d)) # 3X2 coefs <- matrix(log(c(15,5,10,10)), nrow=2) # 2X2 L <- halfnorm.like( coefs, d, covs ) L$L.unscaled # 3X2 L$params # 3X2; exp(log(15)+0.5log(10)) and exp(log(5)+0.5log(10))
d <- seq(0, 100, length=100) covs <- matrix(1,length(d),1) halfnorm.like(log(20), d, covs) plot(d, halfnorm.like(log(20), d, covs)$L.unscaled, type="l", col="red") lines(d, halfnorm.like(log(40), d, covs)$L.unscaled, col="blue") # Matrix inputs: d <- matrix(c(0,10,20), ncol = 1) # 3X1 covs <- matrix(c(rep(1,nrow(d)), rep(.5,nrow(d))), nrow = nrow(d)) # 3X2 coefs <- matrix(log(c(15,5,10,10)), nrow=2) # 2X2 L <- halfnorm.like( coefs, d, covs ) L$L.unscaled # 3X2 L$params # 3X2; exp(log(15)+0.5log(10)) and exp(log(5)+0.5log(10))
Compute starting values and limits for the half normal distance function.
halfnorm.start.limits(ml)
halfnorm.start.limits(ml)
ml |
Either a Rdistance 'model frame' or an Rdistance
'fitted object'. Both are of class "dfunc".
Rdistance 'model frames' are lists containing components
necessary to estimate a distance function, but no estimates.
Rdistance 'model frames' are typically
produced by calls to |
A list containing the following components
start |
Vector of starting values for parameters of the likelihood and expansion terms. |
lowlimit |
Vector of lower limits for the likelihood parameters and expansion terms. |
uplimit |
Vector of upper limits for the likelihood parameters and expansion terms. |
names |
Vector of names for the likelihood parameters and expansion terms. |
The length of each vector in the return is:
(Num expansions) + 1 + 1*(like %in% c("hazrate")) + (Num Covars)
.
Computes the hazard rate distance function.
hazrate.like(a, dist, covars)
hazrate.like(a, dist, covars)
a |
A vector or matrix of covariate
and expansion term
coefficients. Dimension is $k$ X $p$, where
$k$ (i.e., |
dist |
A numeric vector of length $n$ or a single-column matrix (dimension $n$X1) containing detection distances at which to evaluate the likelihood. |
covars |
A numeric vector of length $q$ or
matrix of dimension $n$X$q$ containing covariate values
associated with distances in argument |
The hazard rate likelihood is
where determines location
(i.e., distance at which the function equals 1 - exp(-1) = 0.632),
and
determines slope of the function
at
(i.e., larger k equals steeper
slope at
). For distance analysis,
the valid range for both
and k is
.
A list containing the following two components:
L.unscaled: A matrix of size $n$X$k$X$b$
containing likelihood values evaluated at
distances in dist
.
Each row is associated with
a single distance, and each column is associated with
a single case (row of a
). This matrix is
"unscaled" because the underlying likelihood does
not integrate to one. Values in L.unscaled
are always greater than or equal to zero.
params: A $n$X$k$X$b$ array of the
likelihood's (canonical) parameters, First page contains
parameter values related to covariates (i.e., $s = exp(x'a)$),
while subsequent pages contain other parameters.
$b$ = 1 for halfnorm, negexp; $b$ = 2 for hazrate and
others.
Rows correspond to distances in dist
. Columns
correspond to rows from argument a
.
dfuncEstim
,
hazrate.like
,
negexp.like
d <- seq(0, 100, length=100) covs <- matrix(1,length(d),1) hazrate.like(c(log(20), 5), d, covs) # Changing location parameter plot(d, hazrate.like(c(log(20), 5), d, covs)$L.unscaled, type="l", col="red") lines(d, hazrate.like(c(log(40), 5), d, covs)$L.unscaled, col="blue") abline(h = 1 - exp(-1), lty = 2) abline(v = c(20,40), lty = 2) # Changing slope parameter plot(d, hazrate.like(c(log(50), 20), d, covs)$L.unscaled, type="l", col="red") lines(d, hazrate.like(c(log(50), 2), d, covs)$L.unscaled, col="blue") abline(h = 1 - exp(-1), lty = 2) abline(v = 50, lty = 2)
d <- seq(0, 100, length=100) covs <- matrix(1,length(d),1) hazrate.like(c(log(20), 5), d, covs) # Changing location parameter plot(d, hazrate.like(c(log(20), 5), d, covs)$L.unscaled, type="l", col="red") lines(d, hazrate.like(c(log(40), 5), d, covs)$L.unscaled, col="blue") abline(h = 1 - exp(-1), lty = 2) abline(v = c(20,40), lty = 2) # Changing slope parameter plot(d, hazrate.like(c(log(50), 20), d, covs)$L.unscaled, type="l", col="red") lines(d, hazrate.like(c(log(50), 2), d, covs)$L.unscaled, col="blue") abline(h = 1 - exp(-1), lty = 2) abline(v = 50, lty = 2)
Compute starting values and limits for the hazard rate distance function.
hazrate.start.limits(ml)
hazrate.start.limits(ml)
ml |
Either a Rdistance 'model frame' or an Rdistance
'fitted object'. Both are of class "dfunc".
Rdistance 'model frames' are lists containing components
necessary to estimate a distance function, but no estimates.
Rdistance 'model frames' are typically
produced by calls to |
A list containing the following components
start |
Vector of starting values for parameters of the likelihood and expansion terms. |
lowlimit |
Vector of lower limits for the likelihood parameters and expansion terms. |
uplimit |
Vector of upper limits for the likelihood parameters and expansion terms. |
names |
Vector of names for the likelihood parameters and expansion terms. |
The length of each vector in the return is:
(Num expansions) + 1 + 1*(like %in% c("hazrate")) + (Num Covars)
.
Computes the Hermite expansion terms used in the likelihood of a distance analysis. More generally, will compute a Hermite expansion of any numeric vector.
hermite.expansion(x, expansions)
hermite.expansion(x, expansions)
x |
In a distance analysis, |
expansions |
A scalar specifying the number of expansion terms to compute. Must be one of the integers 1, 2, 3, or 4. |
There are, in general, several expansions that can be called Hermite. The Hermite expansion used here is:
First term:
Second term:
Third term:
Fourth term:
The maximum number of expansion terms computed is 4.
A matrix of size length(x)
X expansions
. The columns of this matrix are the Hermite polynomial expansions of x
.
Column 1 is the first expansion term of x
, column 2 is the second expansion term of x
, and so on up to expansions
.
dfuncEstim
, cosine.expansion
, simple.expansion
, and the discussion
of user defined likelihoods in dfuncEstim
.
x <- seq(0, 1, length = 200) herm.expn <- hermite.expansion(x, 4) plot(range(x), range(herm.expn), type="n") matlines(x, herm.expn, col=rainbow(4), lty = 1)
x <- seq(0, 1, length = 200) herm.expn <- hermite.expansion(x, 4) plot(range(x), range(herm.expn), type="n") matlines(x, herm.expn, col=rainbow(4), lty = 1)
Utility function to detect whether a distance function has covariates beyond the intercept. If the model contains an intercept-only, effective distance is constant across detections and short-cuts can be implemented in code.
intercept.only(object)
intercept.only(object)
object |
An Rdistance model frame or fitted distance function,
normally produced by a call to |
TRUE if object
contains an intercept-only.
FALSE if object
contains at least one detection-level
or transect-level covariate in the detection function.
Determines whether a distance function is for a point survey or line survey.
is.points(x)
is.points(x)
x |
Either an estimated distance function, output by
|
TRUE if the model frame or fitted distance function contains point surveys. FALSE if the model frame or distance function contains line transect surveys.
Checks the validity of Rdistance
nested data frames.
Rdistance
data frames
are a particular implementation of rowwise tibbles
that contain detections in a list column, and extra attributes
specifying types.
is.RdistDf(df, verbose = FALSE)
is.RdistDf(df, verbose = FALSE)
df |
A data frame to check |
verbose |
If TRUE, an explanation of the check that fails is printed. Otherwise, no information on checks is provided. |
The following checks are performed (in this order):
attr(df, "detectionColumn")
exists and points to a valid
list-based column in the data frame.
attr(df, "obsType")
exists and is one of the valid values.
attr(df, "transType")
exists and is one of the valid values.
The data frame is either a 'rowwise_df' or 'grouped_df'
tibble
.
The data frame has only one row per group. One row per group
is implied by 'rowwise_df', but not a 'grouped_df', and both are allowed
in Rdistance
. One row per group ensures rows are uniquely identified
and hence represents one transect.
No column names in the list-column are duplicated in the non-list
columns of the data frame. This check ensures that tidyr::unnest
executes.
Other data checks, e.g., for measurement units, are performed
later in dfuncEstim
, after the model is specified.
TRUE or FALSE invisibly. TRUE means all checks passed. FALSE implies
at least one check failed. Use verbose
= TRUE to see which.
data(sparrowDf) is.RdistDf(sparrowDf) # Data frame okay, but no attributes data(sparrowDetectionData) data(sparrowSiteData) sparrowDf <- sparrowDetectionData |> dplyr::nest_by( siteID , .key = "distances") |> dplyr::right_join(sparrowSiteData, by = "siteID") is.RdistDf(sparrowDf, verbose = TRUE)
data(sparrowDf) is.RdistDf(sparrowDf) # Data frame okay, but no attributes data(sparrowDetectionData) data(sparrowSiteData) sparrowDf <- sparrowDetectionData |> dplyr::nest_by( siteID , .key = "distances") |> dplyr::right_join(sparrowSiteData, by = "siteID") is.RdistDf(sparrowDf, verbose = TRUE)
Determines whether a distance function is a non-parametric smooth or classic parameterized function.
is.smoothed(object)
is.smoothed(object)
object |
An Rdistance model frame or fitted distance function,
normally produced by a call to |
TRUE if the model frame or fitted distance function arises from a non-parametric density smoother. FALSE if the model frame or distance function is a parameterized function.
Tests whether a 'units' object is actually
unitless.
Unitless objects, such as ratios, should be assigned
units of '[1]'. Often they are, but
sometimes unitless ratios are assigned units like '[m/m]'.
The units
package should always convert '[m/m]' to
'[1]', but it does not always.
Sometimes units like '[m/m]' mess things up, so it is
better to remove them before calculations.
is.Unitless(obj)
is.Unitless(obj)
obj |
A numeric scalar or vector, with or without units. |
TRUE if obj
has units and they
are either '[1]' or the denominator units equal
the numerator units. Otherwise, return FALSE.
If obj
does not have units, this routine
returns TRUE.
a <- units::set_units(2, "m") b <- a / a is.Unitless(a) is.Unitless(b) is.Unitless(3)
a <- units::set_units(2, "m") b <- a / a is.Unitless(a) is.Unitless(b) is.Unitless(3)
Returns names of the likelihood parameters. This is
a helper function and is not
necessary for estimation. It is a nice to label some
outputs in Rdistance
with
parameter names like "sigma" or "knee", depending on the likelihood,
and this routine provides a way to do that.
likeParamNames(like.form)
likeParamNames(like.form)
like.form |
A text string naming the form of the likelihood. |
For user defined functions, ensure that the
user defined start-limits function named <likelihood>.start.limits
can be evaluated on a distance of 1, can accept 0 expansions,
a low limit of 0
a high limit of 1, and that it returns the parameter names as
the $names
component of the result. That is, the
code that returns user-defined parameter names is,
fn <- match.fun( paste0(like.form, ".start.limits"));
ans <- fn(1, 0, 0, 1);
ans$names
A vector of parameter names for that likelihood
Line plot method for objects of class 'dfunc
'
that adds distance functions to an existing plot.
## S3 method for class 'dfunc' lines(x, newdata = NULL, prob = NULL, ...)
## S3 method for class 'dfunc' lines(x, newdata = NULL, prob = NULL, ...)
x |
An estimated detection function object, normally
produced by calling |
newdata |
A data frame containing new values of
the covariates at which to evaluate the distance functions.
If |
prob |
Logical scalar for whether to scale the distance function
to be a density function (integrates to one). Default behavior is designed
to be compatible with the plot method for distance functions
( |
... |
Parameters passed to |
A data frame containing the x and y coordinates of the
plotted line(s) is returned invisibly. X coordinates in the
return are names x
. Y coordinates in the return are named
y1, y2, ..., yn
, i.e., one column per returned
distance function.
dfuncEstim
, plot.dfunc
,
print.abund
set.seed(87654) x <- rnorm(1000, mean=0, sd=20) x <- x[x >= 0] x <- units::set_units(x, "mi") Df <- data.frame(transectID = "A" , distance = x ) |> dplyr::nest_by( transectID , .key = "detections") |> dplyr::mutate(length = units::set_units(100,"km")) attr(Df, "detectionColumn") <- "detections" attr(Df, "obsType") <- "single" attr(Df, "transType") <- "line" attr(Df,'effortColumn') <- "length" is.RdistDf(Df) # TRUE dfunc <- Df |> dfuncEstim(distance ~ 1, likelihood="halfnorm") plot(dfunc, nbins = 40, col="lightgrey", border=NA, vertLines=FALSE) lines(dfunc, col="grey30", lwd=15) lines(dfunc, col="grey90", lwd=5, lty = 2) # Multiple lines data(sparrowDfuncObserver) obsLevs <- levels(sparrowDfuncObserver$data$observer) plot(sparrowDfuncObserver , vertLines = FALSE , lty = 0 , plotBars = FALSE , main="Detection by observer" , legend = FALSE) y <- lines(sparrowDfuncObserver , newdata = data.frame(observer = obsLevs) , col = palette.colors(length(obsLevs)) , lty = 1 , lwd = 4) head(y) # values returned, with distances as column
set.seed(87654) x <- rnorm(1000, mean=0, sd=20) x <- x[x >= 0] x <- units::set_units(x, "mi") Df <- data.frame(transectID = "A" , distance = x ) |> dplyr::nest_by( transectID , .key = "detections") |> dplyr::mutate(length = units::set_units(100,"km")) attr(Df, "detectionColumn") <- "detections" attr(Df, "obsType") <- "single" attr(Df, "transType") <- "line" attr(Df,'effortColumn') <- "length" is.RdistDf(Df) # TRUE dfunc <- Df |> dfuncEstim(distance ~ 1, likelihood="halfnorm") plot(dfunc, nbins = 40, col="lightgrey", border=NA, vertLines=FALSE) lines(dfunc, col="grey30", lwd=15) lines(dfunc, col="grey90", lwd=5, lty = 2) # Multiple lines data(sparrowDfuncObserver) obsLevs <- levels(sparrowDfuncObserver$data$observer) plot(sparrowDfuncObserver , vertLines = FALSE , lty = 0 , plotBars = FALSE , main="Detection by observer" , legend = FALSE) y <- lines(sparrowDfuncObserver , newdata = data.frame(observer = obsLevs) , col = palette.colors(length(obsLevs)) , lty = 1 , lwd = 4) head(y) # values returned, with distances as column
Find the x coordinate that maximizes g(x).
maximize.g(fit, covars = NULL)
maximize.g(fit, covars = NULL)
fit |
An estimated 'dfunc' object produced by |
covars |
Covariate values to calculate g(x). |
The value of x that maximizes g(x) in fit
.
## Not run: # Fake data set.seed(22223333) x <- rgamma(100, 10, 1) fit <- dfuncEstim( x, likelihood="Gamma", x.scl="max" ) maximize.g( fit ) # should be near 10. fit$x.scl # same thing ## End(Not run)
## Not run: # Fake data set.seed(22223333) x <- rgamma(100, 10, 1) fit <- dfuncEstim( x, likelihood="Gamma", x.scl="max" ) maximize.g( fit ) # should be near 10. fit$x.scl # same thing ## End(Not run)
Estimate parameters of a distance function using maximum likelihood.
mlEstimates(ml, strt.lims)
mlEstimates(ml, strt.lims)
ml |
Either a Rdistance 'model frame' or an Rdistance
'fitted object'. Both are of class "dfunc".
Rdistance 'model frames' are lists containing components
necessary to estimate a distance function, but no estimates.
Rdistance 'model frames' are typically
produced by calls to |
strt.lims |
A list containing start, low, and high limits for
parameters of the requested likelihood. This list is typically produced
by a call to |
An Rdistance fitted model object. This object contains the
raw object returned by the optimization routine (e.g., nlming
),
and additional components specific to Rdistance.
Extract the model matrix ("X" matrix) from an Rdistance model object.
## S3 method for class 'dfunc' model.matrix(object, ...)
## S3 method for class 'dfunc' model.matrix(object, ...)
object |
An Rdistance model frame or fitted distance function,
normally produced by a call to |
... |
Ignored |
A matrix containing covariates for fitting an Rdistance model.
data(sparrowDf) sparrowModel <- parseModel( sparrowDf, dist ~ observer ) model.matrix(sparrowModel)
data(sparrowDf) sparrowModel <- parseModel( sparrowDf, dist ~ observer ) model.matrix(sparrowModel)
Return number of covariates in a distance model
nCovars(X)
nCovars(X)
X |
The X matrix of covariates, or a vector. |
The reason this routine is needed is that sometimes we pass one row of covariates to a likelihood function. If so, it may come in as a normal vector, not a matrix. If a normal vector, ncol(X) does not work.
An integer scalar
# do not export
Computes the negative exponential distance function.
negexp.like(a, dist, covars)
negexp.like(a, dist, covars)
a |
A vector or matrix of covariate
and expansion term
coefficients. Dimension is $k$ X $p$, where
$k$ (i.e., |
dist |
A numeric vector of length $n$ or a single-column matrix (dimension $n$X1) containing detection distances at which to evaluate the likelihood. |
covars |
A numeric vector of length $q$ or
matrix of dimension $n$X$q$ containing covariate values
associated with distances in argument |
The negative exponential likelihood is
where is the
slope parameter.
A list containing the following two components:
L.unscaled: A matrix of size $n$X$k$X$b$
containing likelihood values evaluated at
distances in dist
.
Each row is associated with
a single distance, and each column is associated with
a single case (row of a
). This matrix is
"unscaled" because the underlying likelihood does
not integrate to one. Values in L.unscaled
are always greater than or equal to zero.
params: A $n$X$k$X$b$ array of the
likelihood's (canonical) parameters, First page contains
parameter values related to covariates (i.e., $s = exp(x'a)$),
while subsequent pages contain other parameters.
$b$ = 1 for halfnorm, negexp; $b$ = 2 for hazrate and
others.
Rows correspond to distances in dist
. Columns
correspond to rows from argument a
.
dfuncEstim
,
hazrate.like
,
negexp.like
d <- seq(0, 100, length=100) covs <- matrix(1,length(d),1) negexp.like(log(0.01), d, covs) # Changing slope parameter plot(d, negexp.like(log(0.1), d, covs)$L.unscaled, type="l", col="red") lines(d, negexp.like(log(0.05), d, covs)$L.unscaled, col="blue")
d <- seq(0, 100, length=100) covs <- matrix(1,length(d),1) negexp.like(log(0.01), d, covs) # Changing slope parameter plot(d, negexp.like(log(0.1), d, covs)$L.unscaled, type="l", col="red") lines(d, negexp.like(log(0.05), d, covs)$L.unscaled, col="blue")
Compute starting values and limits for the negative exponential distance function.
negexp.start.limits(ml)
negexp.start.limits(ml)
ml |
Either a Rdistance 'model frame' or an Rdistance
'fitted object'. Both are of class "dfunc".
Rdistance 'model frames' are lists containing components
necessary to estimate a distance function, but no estimates.
Rdistance 'model frames' are typically
produced by calls to |
A list containing the following components
start |
Vector of starting values for parameters of the likelihood and expansion terms. |
lowlimit |
Vector of lower limits for the likelihood parameters and expansion terms. |
uplimit |
Vector of upper limits for the likelihood parameters and expansion terms. |
names |
Vector of names for the likelihood parameters and expansion terms. |
The length of each vector in the return is:
(Num expansions) + 1 + 1*(like %in% c("hazrate")) + (Num Covars)
.
Return the negative log likelihood of observed detection distances given a likelihood and the estimated parameters.
nLL(a, ml)
nLL(a, ml)
a |
A vector of likelihood parameter values. Length and
meaning depend on |
ml |
Either a Rdistance 'model frame' or an Rdistance
'fitted object'. Both are of class "dfunc".
Rdistance 'model frames' are lists containing components
necessary to estimate a distance function, but no estimates.
Rdistance 'model frames' are typically
produced by calls to |
Expansion Terms: If ml$expansions
= k (k > 0),
the expansion function specified by ml$series
is
called (see for example cosine.expansion
).
Assuming is the
expansion term for the
distance and that
are (estimated) coefficients for the expansion terms,
the likelihood contribution for the
distance is,
A scalar, the negative of the log likelihood evaluated at
parameters a
.
See halfnorm.like
and links there;
dfuncEstim
set.seed(238642) d <- rnorm(1000, mean = 0, sd = 40) d <- units::set_units(d[0 <= d], "m") # Min info in model list to compute likelihood ml <- list( mf = model.frame(d ~ 1) , likelihood = "halfnorm" , expansions = 0 , w.lo = units::set_units(0, "m") , w.hi = units::set_units(125, "m") , outputUnits = units(units::set_units(1,"m")) , x.scl = units::set_units(0,"m") , g.x.scl = 1 , data = 1 ) attr(ml$data, "transType") <- "line" class(ml) <- "dfunc" nLL(log(40), ml) # Another way, b/c we have pnorm() ones <- matrix(1, nrow = length(d), ncol = 1) l <- halfnorm.like(log(40), d, ones) scaler <-(pnorm(units::drop_units(ml$w.hi) , units::drop_units(ml$w.lo) , sd = l$params) - 0.5) * sqrt(2*pi) * l$params -sum(log(l$L.unscaled/scaler)) # A third way, b/c we have pnorm() and dnorm(). l2 <- dnorm(units::drop_units(d), mean = 0, sd = 40) scaler2 <- pnorm(125, mean = 0, sd = 40) - 0.5 -sum(log(l2/scaler2))
set.seed(238642) d <- rnorm(1000, mean = 0, sd = 40) d <- units::set_units(d[0 <= d], "m") # Min info in model list to compute likelihood ml <- list( mf = model.frame(d ~ 1) , likelihood = "halfnorm" , expansions = 0 , w.lo = units::set_units(0, "m") , w.hi = units::set_units(125, "m") , outputUnits = units(units::set_units(1,"m")) , x.scl = units::set_units(0,"m") , g.x.scl = 1 , data = 1 ) attr(ml$data, "transType") <- "line" class(ml) <- "dfunc" nLL(log(40), ml) # Another way, b/c we have pnorm() ones <- matrix(1, nrow = length(d), ncol = 1) l <- halfnorm.like(log(40), d, ones) scaler <-(pnorm(units::drop_units(ml$w.hi) , units::drop_units(ml$w.lo) , sd = l$params) - 0.5) * sqrt(2*pi) * l$params -sum(log(l$L.unscaled/scaler)) # A third way, b/c we have pnorm() and dnorm(). l2 <- dnorm(units::drop_units(d), mean = 0, sd = 40) scaler2 <- pnorm(125, mean = 0, sd = 40) - 0.5 -sum(log(l2/scaler2))
Return the type of observations (single or multiple observers) represented in either a fitted distance function or Rdistance data frame.
observationType(x)
observationType(x)
x |
Either an estimated distance function, output by
|
This function is a simple helper function. If x
is an
estimated distance object, it polls the obsType
attribute of the object's Rdistance data frame. If x
is an Rdistance nested data frame, it
polls the obsType
attribute.
One of the following values: "single", "1given2", "2given1", or "both". If observation type has not been assigned, return is NULL.
An internal (un-exported) function to perform density and abundance calculations on one iteration of the bootstrap.
oneBsIter( indexDf, key, data, formula, likelihood, w.lo, w.hi, expansions, series, x.scl, g.x.scl, outputUnits, warn, area, propUnitSurveyed, pb, plot.bs, plotCovValues )
oneBsIter( indexDf, key, data, formula, likelihood, w.lo, w.hi, expansions, series, x.scl, g.x.scl, outputUnits, warn, area, propUnitSurveyed, pb, plot.bs, plotCovValues )
indexDf |
A data frame containing row indices to use
for subsetting the rows of |
key |
A data frame containing the current id of the
BS iteration. This is included for compatability with
|
data |
An Rdistance nested data frame containing
the data to bootstrap resample. Rows of this data frame,
equating to transects, are sampled using the indicies in
|
formula |
A standard formula object. For example, |
likelihood |
String specifying the likelihood to fit. Built-in likelihoods at present are "halfnorm", "hazrate", and "negexp". |
w.lo |
Lower or left-truncation limit of the distances in distance data.
This is the minimum possible off-transect distance. Default is 0. If
|
w.hi |
Upper or right-truncation limit of the distances
in |
expansions |
A scalar specifying the number of terms
in |
series |
If |
x.scl |
The x coordinate (a distance) at which the
detection function will be scaled. |
g.x.scl |
Height of the distance function at coordinate |
outputUnits |
A string specifying the symbolic measurement
units for results. Valid units are listed in |
warn |
A logical scalar specifying whether to issue
an R warning if the estimation did not converge or if one
or more parameter estimates are at their boundaries.
For estimation, |
area |
A scalar containing the total area of inference. Usually, this is
study area size. If |
propUnitSurveyed |
A scalar or vector of real numbers between 0 and 1.
The proportion of the default sampling unit that
was surveyed. If both sides of line transects were observed,
|
pb |
A progress bar created with |
plot.bs |
Logical. Whether to plot bootstrap estimate of detection function.
A plot must already exist because this uses |
plotCovValues |
Data frame containing values of covariates to plot,
if |
A data frame containing density and abundance and other relevant statistics for one iteration of the bootstrap.
Parse an 'Rdistance' formula and produce a list containing all model parameters.
parseModel( data, formula = NULL, likelihood = "halfnorm", w.lo = 0, w.hi = NULL, expansions = 0, series = "cosine", x.scl = 0, g.x.scl = 1, outputUnits = NULL )
parseModel( data, formula = NULL, likelihood = "halfnorm", w.lo = 0, w.hi = NULL, expansions = 0, series = "cosine", x.scl = 0, g.x.scl = 1, outputUnits = NULL )
data |
An |
formula |
A standard formula object. For example, |
likelihood |
String specifying the likelihood to fit. Built-in likelihoods at present are "halfnorm", "hazrate", and "negexp". |
w.lo |
Lower or left-truncation limit of the distances in distance data.
This is the minimum possible off-transect distance. Default is 0. If
|
w.hi |
Upper or right-truncation limit of the distances
in |
expansions |
A scalar specifying the number of terms
in |
series |
If |
x.scl |
The x coordinate (a distance) at which the
detection function will be scaled. |
g.x.scl |
Height of the distance function at coordinate |
outputUnits |
A string specifying the symbolic measurement
units for results. Valid units are listed in |
This routine is not intended to be called by the user. It is called
from the model estimation routines in Rdistance
.
An Rdistance model frame, which is an object of class "dfunc". Rdistance model frames are lists containing distance model components but not estimates. Model frames contain everything necessary to fit an Rdistance mode, such as covariates, minimum and maximum distances, the form of the likelihood, number of expansions, etc. Rdistance model frames contain a subset of fitted Rdistance model components.
[RdistDf()], which returns an Rdistance data frame; [dfuncEstim()], which returns an Rdistance fitted model.
data(sparrowSiteData) data(sparrowDetectionData) sparrowDf <- Rdistance::RdistDf(sparrowSiteData , sparrowDetectionData , by = NULL , pointSurvey = FALSE , observer = "single" , .detectionCol = "detections") ml <- Rdistance::parseModel(sparrowDf , formula = dist ~ 1 + observer + groupsize(groupsize) , likelihood = "halfnorm" , w.lo = 0 , w.hi = NULL , series = "cosine" , x.scl = 0 , g.x.scl = 1 , outputUnits = "m" ) class(ml) # 'dfunc', but no estimated coefficients print(ml) print.default(ml)
data(sparrowSiteData) data(sparrowDetectionData) sparrowDf <- Rdistance::RdistDf(sparrowSiteData , sparrowDetectionData , by = NULL , pointSurvey = FALSE , observer = "single" , .detectionCol = "detections") ml <- Rdistance::parseModel(sparrowDf , formula = dist ~ 1 + observer + groupsize(groupsize) , likelihood = "halfnorm" , w.lo = 0 , w.hi = NULL , series = "cosine" , x.scl = 0 , g.x.scl = 1 , outputUnits = "m" ) class(ml) # 'dfunc', but no estimated coefficients print(ml) print.default(ml)
Computes off-transect (also called 'perpendicular') distances from measures of sighting distance and sighting angle.
perpDists(sightDist, sightAngle, data)
perpDists(sightDist, sightAngle, data)
sightDist |
Character, name of column in |
sightAngle |
Character, name of column in |
data |
data.frame object containing sighting distance and sighting angle. |
If observers recorded sighting distance and sighting angle (as is often common in line transect surveys), use this function to convert
to off-transect distances, the required input data for dfunc.estim
.
A vector of off-transect (or perpendicular) distances. Units are the same as sightDist
.
Buckland, S.T., Anderson, D.R., Burnham, K.P. and Laake, J.L. 1993. Distance Sampling: Estimating Abundance of Biological Populations. Chapman and Hall, London.
# Load the example dataset of sparrow detections from package data(sparrowDetectionData) # Compute perpendicular, off-transect distances from the observer's sight distance and angle sparrowDetectionData$perpDist <- perpDists(sightDist="sightdist", sightAngle="sightangle", data=sparrowDetectionData)
# Load the example dataset of sparrow detections from package data(sparrowDetectionData) # Compute perpendicular, off-transect distances from the observer's sight distance and angle sparrowDetectionData$perpDist <- perpDists(sightDist="sightdist", sightAngle="sightangle", data=sparrowDetectionData)
Plot method for objects of class 'dfunc
'. Objects of
class 'dfunc
' are estimated distance functions produced by
dfuncEstim
.
## S3 method for class 'dfunc' plot(x, ...)
## S3 method for class 'dfunc' plot(x, ...)
x |
An estimated detection function object, normally
produced by calling |
... |
Arguments passed on to
|
If plotBars
is TRUE, a scaled histogram is plotted
and the estimated distance function
is plotted over the top of it. When bars are plotted,
this routine uses graphics::barplot
for setting up the initial plotting region and
most parameters to graphics::barplot
can
be specified (exceptions noted above in description of '...').
The form of the likelihood and any series
expansions is printed in the main title (overwrite this with
main="<my title>"
). Convergence of the distance
function is checked. If the distance function did not converge, a warning
is printed over the top of the histogram. If one or more parameter
estimates are at their limits (likely indicating non-convergence or poor
fit), another warning is printed.
The input distance function is returned, with two additional
components than can be used to reconstruct the plotted bars. (To
obtain values of the plotted distance functions, use predict
with type = "distances"
.)
The additional components are:
barHeights |
A vector containing the scaled bar heights drawn on the plot. |
barWidths |
A vector or scalar of the bar widths drawn on the plot, with measurement units. |
Re-plot the bars with barplot( return$barHeights,
width = return$barWidths )
.
dfuncEstim
, print.dfunc
,
print.abund
set.seed(87654) x <- rnorm(1000, mean=0, sd=20) x <- x[x >= 0] x <- units::set_units(x, "ft") Df <- data.frame(transectID = "A" , distance = x ) |> dplyr::nest_by( transectID , .key = "detections") |> dplyr::mutate(length = units::set_units(1,"mi")) attr(Df, "detectionColumn") <- "detections" attr(Df, "obsType") <- "single" attr(Df, "transType") <- "line" attr(Df, "effortColumn") <- "length" is.RdistDf(Df) # TRUE dfunc <- Df |> dfuncEstim(distance ~ 1, likelihood="halfnorm") plot(dfunc) plot(dfunc, nbins=25) # showing effects of plot params plot(dfunc , col=c("red","blue","orange") , border="black" , xlab="Off-transect distance" , ylab="Prob" , vertLines = FALSE , main="Showing plot params") plot(dfunc , col="purple" , density=30 , angle=c(-45,0,45) , cex.axis=1.5 , cex.lab=2 , ylab="Probability") plot(dfunc , col=c("grey","lightgrey") , border=NA) plot(dfunc , col="grey" , border=0 , col.dfunc="blue" , lty.dfunc=2 , lwd.dfunc=4 , vertLines=FALSE) plot(dfunc , plotBars=FALSE , cex.axis=1.5 , col.axis="blue") rug(distances(dfunc)) # Plot showing f(0) hist(distances(dfunc) , n = 40 , border = NA , prob = TRUE) x <- seq(dfunc$w.lo, dfunc$w.hi, length=200) g <- predict(dfunc, type="dfunc", distances = x, newdata = data.frame(a=1)) f <- g[,1] / ESW(dfunc)[1] # Check integration: sum(diff(x)*(f[-1] + f[-length(f)]) / 2) # Trapazoid rule; should be 1.0 lines(x, f) # hence, 1/f(0) = ESW # Covariates: detection by observer data(sparrowDfuncObserver) # pre-estimated model obsLevs <- levels(sparrowDfuncObserver$data$observer) plot(sparrowDfuncObserver , newdata = data.frame(observer = obsLevs) , vertLines = FALSE , col.dfunc = heat.colors(length(obsLevs)) , col = c("grey","lightgrey") , border=NA , main="Detection by observer")
set.seed(87654) x <- rnorm(1000, mean=0, sd=20) x <- x[x >= 0] x <- units::set_units(x, "ft") Df <- data.frame(transectID = "A" , distance = x ) |> dplyr::nest_by( transectID , .key = "detections") |> dplyr::mutate(length = units::set_units(1,"mi")) attr(Df, "detectionColumn") <- "detections" attr(Df, "obsType") <- "single" attr(Df, "transType") <- "line" attr(Df, "effortColumn") <- "length" is.RdistDf(Df) # TRUE dfunc <- Df |> dfuncEstim(distance ~ 1, likelihood="halfnorm") plot(dfunc) plot(dfunc, nbins=25) # showing effects of plot params plot(dfunc , col=c("red","blue","orange") , border="black" , xlab="Off-transect distance" , ylab="Prob" , vertLines = FALSE , main="Showing plot params") plot(dfunc , col="purple" , density=30 , angle=c(-45,0,45) , cex.axis=1.5 , cex.lab=2 , ylab="Probability") plot(dfunc , col=c("grey","lightgrey") , border=NA) plot(dfunc , col="grey" , border=0 , col.dfunc="blue" , lty.dfunc=2 , lwd.dfunc=4 , vertLines=FALSE) plot(dfunc , plotBars=FALSE , cex.axis=1.5 , col.axis="blue") rug(distances(dfunc)) # Plot showing f(0) hist(distances(dfunc) , n = 40 , border = NA , prob = TRUE) x <- seq(dfunc$w.lo, dfunc$w.hi, length=200) g <- predict(dfunc, type="dfunc", distances = x, newdata = data.frame(a=1)) f <- g[,1] / ESW(dfunc)[1] # Check integration: sum(diff(x)*(f[-1] + f[-length(f)]) / 2) # Trapazoid rule; should be 1.0 lines(x, f) # hence, 1/f(0) = ESW # Covariates: detection by observer data(sparrowDfuncObserver) # pre-estimated model obsLevs <- levels(sparrowDfuncObserver$data$observer) plot(sparrowDfuncObserver , newdata = data.frame(observer = obsLevs) , vertLines = FALSE , col.dfunc = heat.colors(length(obsLevs)) , col = c("grey","lightgrey") , border=NA , main="Detection by observer")
Plot method for parametric line and point transect distance functions.
## S3 method for class 'dfunc.para' plot( x, include.zero = FALSE, nbins = "Sturges", newdata = NULL, legend = TRUE, vertLines = TRUE, plotBars = TRUE, circles = FALSE, density = -1, angle = 45, xlab = NULL, ylab = NULL, border = TRUE, col = "grey85", col.dfunc = NULL, lty.dfunc = NULL, lwd.dfunc = NULL, ... )
## S3 method for class 'dfunc.para' plot( x, include.zero = FALSE, nbins = "Sturges", newdata = NULL, legend = TRUE, vertLines = TRUE, plotBars = TRUE, circles = FALSE, density = -1, angle = 45, xlab = NULL, ylab = NULL, border = TRUE, col = "grey85", col.dfunc = NULL, lty.dfunc = NULL, lwd.dfunc = NULL, ... )
x |
An estimated detection function object, normally
produced by calling |
include.zero |
Boolean value specifying whether to include 0 on the x-axis
of the plot. A value of TRUE will include 0 on the left hand end of the x-axis
regardless of the range of distances. A value of FALSE will plot only the
observation strip ( |
nbins |
Internally, this function uses |
newdata |
Data frame (similar to |
legend |
Logical scalar for whether to include a legend. If TRUE, a legend will be included on the plot detailing the covariate values used to generate the plot. |
vertLines |
Logical scalar specifying whether to plot vertical
lines at |
plotBars |
Logical scalar for whether to plot the histogram of distances behind the distance function. If FALSE, no histogram is plotted, only the distance function line(s). |
circles |
Logical scalar requesting the location of detection distances be plotted. If TRUE, open circles are plotted at predicted distance function heights associated with all detection distances. For computational simplicity, all distances are plotted for EVERY covariate class even though observed distances belong to only one covariate class. If FALSE, circles are not plotted. |
density |
If |
angle |
When |
xlab |
Label for the x-axis |
ylab |
Label for the y-axis |
border |
The color of bar borders when bars are plotted,
repeated as necessary to exceed the number of bars. A
value of NA or FALSE means no borders. If bars are shaded with lines
(i.e., |
col |
A vector of bar fill colors or line colors when bars are
drawn and |
col.dfunc |
Color of the distance function(s).
If only one distance function (one line) is being plotted,
the default color is "red".
If covariates or |
lty.dfunc |
Line type of the distance function(s).
If covariates or |
lwd.dfunc |
Line width of the distance function(s), replicated to the required length. Default is 2 for all lines. |
... |
When bars are plotted, this routine
uses |
The input distance function is returned, with two additional
components than can be used to reconstruct the plotted bars. (To
obtain values of the plotted distance functions, use predict
with type = "distances"
.)
The additional components are:
barHeights |
A vector containing the scaled bar heights drawn on the plot. |
barWidths |
A vector or scalar of the bar widths drawn on the plot, with measurement units. |
Re-plot the bars with barplot( return$barHeights,
width = return$barWidths )
.
# Example data set.seed(87654) x <- rnorm(1000, mean=0, sd=20) x <- x[x >= 0] x <- units::set_units(x, "ft") Df <- data.frame(transectID = "A" , distance = x ) |> dplyr::nest_by( transectID , .key = "detections") |> dplyr::mutate(length = units::set_units(1,"mi")) attr(Df, "detectionColumn") <- "detections" attr(Df, "obsType") <- "single" attr(Df, "transType") <- "line" attr(Df, "effortColumn") <- "length" is.RdistDf(Df) # TRUE # Estimation dfunc <- dfuncEstim(Df , formula = distance~1 , likelihood="halfnorm") plot(dfunc) plot(dfunc, nbins=25)
# Example data set.seed(87654) x <- rnorm(1000, mean=0, sd=20) x <- x[x >= 0] x <- units::set_units(x, "ft") Df <- data.frame(transectID = "A" , distance = x ) |> dplyr::nest_by( transectID , .key = "detections") |> dplyr::mutate(length = units::set_units(1,"mi")) attr(Df, "detectionColumn") <- "detections" attr(Df, "obsType") <- "single" attr(Df, "transType") <- "line" attr(Df, "effortColumn") <- "length" is.RdistDf(Df) # TRUE # Estimation dfunc <- dfuncEstim(Df , formula = distance~1 , likelihood="halfnorm") plot(dfunc) plot(dfunc, nbins=25)
An internal prediction method for computing density on the sampled transects.
predDensity(object, propUnitSurveyed = 1)
predDensity(object, propUnitSurveyed = 1)
object |
An Rdistance model frame or fitted distance function,
normally produced by a call to |
propUnitSurveyed |
A scalar or vector of real numbers between 0 and 1.
The proportion of the default sampling unit that
was surveyed. If both sides of line transects were observed,
|
A data frame containing the original data used to fit the distance function, plus an additional column containing the density of individuals on each transect.
data(sparrowDfuncObserver) predict(sparrowDfuncObserver, type="density")
data(sparrowDfuncObserver) predict(sparrowDfuncObserver, type="density")
An internal prediction function to predict a distance function. This version allows for matrix inputs and uses matrix operations, and is thus faster than earlier looping versions.
predDfuncs(object, params, distances, isSmooth)
predDfuncs(object, params, distances, isSmooth)
object |
An Rdistance model frame or fitted distance function,
normally produced by a call to |
params |
A matrix of distance function parameters. Rows are observations, columns contain the set of parameters (canonical and expansion) for each observation. |
distances |
A vector or 1-column matrix of
distances at which to evaluate
distance functions, when distance functions
are requested. |
isSmooth |
Logical. TRUE if the distance function is smoothed (and hence has no parameters). |
A matrix of distance function values, of size length(distances) X nrow(params). Each row of params is associated with a column, i.e., a different distance function. Distances are associated with rows, i.e., use matplot(d,return) to plot values on separate distance functions specified by rows of params.
Predict either likelihood parameters, distance functions, site-specific density, or site-specific abundance from estimated distance function objects.
## S3 method for class 'dfunc' predict( object, newdata = NULL, type = c("parameters"), distances = NULL, propUnitSurveyed = 1, area = NULL, ... )
## S3 method for class 'dfunc' predict( object, newdata = NULL, type = c("parameters"), distances = NULL, propUnitSurveyed = 1, area = NULL, ... )
object |
An Rdistance model frame or fitted distance function,
normally produced by a call to |
newdata |
A data frame containing new values of
the covariates at which to evaluate the distance functions.
If |
type |
The type of predictions desired.
If |
distances |
A vector or 1-column matrix of
distances at which to evaluate
distance functions, when distance functions
are requested. |
propUnitSurveyed |
A scalar or vector of real numbers between 0 and 1.
The proportion of the default sampling unit that
was surveyed. If both sides of line transects were observed,
|
area |
A scalar containing the total area of inference. Usually, this is
study area size. If |
... |
Included for compatibility with generic |
A matrix containing predictions:
If type
is "parameters", the returned matrix
contains likelihood parameters. The extent of the
first dimension (rows) in the returned matrix is equal to
either the number of detection distances
in the observed strip or number of rows in newdata
.
The returned matrix's second dimension (columns) is
the number of parameters in the likelihood
plus the number of expansion terms.
See the help for each likelihoods to interpret
returned parameter values. All parameters are returned
on the inverse-link scale; i.e., exponential for canonical
parameters and identity for expansion terms.
If type
is "dfuncs" or "dfunc", columns of the
returned matrix contains detection functions (i.e., g(x)).
The extent of the first
dimension (number of rows) is either the number of distances
specified in distances
or options()$Rdistance_intEvalPts
if distances
is
not specified.
The extent of the second dimension (number of columns) is:
the number of detections with non-missing distances:
if newdata
is NULL.
the number of rows in newdata
if
newdata
is specified.
All distance functions in columns of the return are scaled
to object$g.x.scale
at object$x.scl
. The returned matrix has
the following additional attributes:
attr(return, "distances")
is the vector of distances used to
predict the function in return
. Either the input distances
object
or the computed sequence of distances when distances
is NULL.
attr(return, "x0")
is the vector of distances at which each
distance function in return
was scaled. i.e., the vector of
x.scl
.
attr(return, "g.x.scl")
is the height of g(x) (the distance
function) at x0.
If type
is "density" or "abundance", the return is a
tibble containing density and abundance estimates by transect.
All transects in the input data (i.e., object$data
) are
included, even those with missing lengths.
Columns in the tibble are:
transect ID: the grouping factor of the original RdistDf object.
individualsSeen: sum of non-missing group sizes on that transect.
avgPdetect: average probability of detection over groups sighted on that transect.
effort: size of the area surveyed by that transect.
density: density of individuals in the area surveyed by the transect.
abundance: abundance of individuals in the area surveyed by the transect.
halfnorm.like
, negexp.like
,
hazrate.like
data("sparrowDf") # For dimension checks: nd <- getOption("Rdistance_intEvalPts") # No covariates dfuncObs <- sparrowDf |> dfuncEstim(formula = dist ~ 1 , w.hi = units::as_units(100, "m")) n <- nrow(dfuncObs$mf) p <- predict(dfuncObs) # parameters all(dim(p) == c(n, 1)) # values in newdata ignored because no covariates p <- predict(dfuncObs, newdata = data.frame(x = 1:5)) all(dim(p) == c(5, 1)) # Distance functions in columns, one per observation p <- predict(dfuncObs, type = "dfunc") all(dim(p) == c(nd, n)) d <- units::set_units(c(0, 20, 40), "ft") p <- predict(dfuncObs, distances = d, type = "dfunc") all(dim(p) == c(3, n)) p <- predict(dfuncObs , newdata = data.frame(x = 1:5) , distances = d , type = "dfunc") all(dim(p) == c(3, 5)) # Covariates data(sparrowDfuncObserver) # pre-estimated object ## Not run: # Command to generate 'sparrowDfuncObserver' sparrowDfuncObserver <- sparrowDf |> dfuncEstim(formula = dist ~ observer , likelihood = "hazrate") ## End(Not run) predict(sparrowDfuncObserver) # n X 2 Observers <- data.frame(observer = levels(sparrowDf$observer)) predict(sparrowDfuncObserver, newdata = Observers) # 5 X 2 predict(sparrowDfuncObserver, type = "dfunc") # nd X n predict(sparrowDfuncObserver, newdata = Observers, type = "dfunc") # nd X 5 d <- units::set_units(c(0, 150, 400), "ft") predict(sparrowDfuncObserver , newdata = Observers , distances = d , type = "dfunc") # 3 X 5 # Density and abundance by transect predict(sparrowDfuncObserver , type = "density")
data("sparrowDf") # For dimension checks: nd <- getOption("Rdistance_intEvalPts") # No covariates dfuncObs <- sparrowDf |> dfuncEstim(formula = dist ~ 1 , w.hi = units::as_units(100, "m")) n <- nrow(dfuncObs$mf) p <- predict(dfuncObs) # parameters all(dim(p) == c(n, 1)) # values in newdata ignored because no covariates p <- predict(dfuncObs, newdata = data.frame(x = 1:5)) all(dim(p) == c(5, 1)) # Distance functions in columns, one per observation p <- predict(dfuncObs, type = "dfunc") all(dim(p) == c(nd, n)) d <- units::set_units(c(0, 20, 40), "ft") p <- predict(dfuncObs, distances = d, type = "dfunc") all(dim(p) == c(3, n)) p <- predict(dfuncObs , newdata = data.frame(x = 1:5) , distances = d , type = "dfunc") all(dim(p) == c(3, 5)) # Covariates data(sparrowDfuncObserver) # pre-estimated object ## Not run: # Command to generate 'sparrowDfuncObserver' sparrowDfuncObserver <- sparrowDf |> dfuncEstim(formula = dist ~ observer , likelihood = "hazrate") ## End(Not run) predict(sparrowDfuncObserver) # n X 2 Observers <- data.frame(observer = levels(sparrowDf$observer)) predict(sparrowDfuncObserver, newdata = Observers) # 5 X 2 predict(sparrowDfuncObserver, type = "dfunc") # nd X n predict(sparrowDfuncObserver, newdata = Observers, type = "dfunc") # nd X 5 d <- units::set_units(c(0, 150, 400), "ft") predict(sparrowDfuncObserver , newdata = Observers , distances = d , type = "dfunc") # 3 X 5 # Density and abundance by transect predict(sparrowDfuncObserver , type = "density")
An internal prediction function to predict (compute)
the values of distance functions at a set of observed values.
Unlike predDfuncs
, which evaluates distance
functions at EVERY input distance, this routine evaluates
distance functions at only ONE distance. This is what's
appropriate for likelihood computation.
This version allows for matrix inputs and
uses matrix operations, and is thus faster than earlier
looping versions.
predLikelihood(object, params)
predLikelihood(object, params)
object |
An Rdistance model frame or fitted distance function,
normally produced by a call to |
params |
A matrix of distance function parameters. Rows are observations, columns contain the set of parameters (canonical and expansion) for each observation. |
Assuming L
is the vector returned by this function,
the negative log likelihood is -sum(log(L / I), na.rm=T)
,
where I
is the integration constant, or
area under the likelihood between
w.lo
and w.hi
.
Note that returned likelihood values for distances less
than w.lo
or greater than w.hi
are NA
;
hence, na.rm=TRUE
in the sum.
A vector of distance function values, of length
n = number of observed distances = length(distances(x)).
Elements in distances(x)
correspond, in order,
to values in the returned vector.
Print an object of class c("abund","dfunc")
produced by abundEstim
.
## S3 method for class 'abund' print(x, ...)
## S3 method for class 'abund' print(x, ...)
x |
An object output by |
... |
Included for compatibility to other print methods. Ignored here. |
0 is invisibly returned
dfuncEstim
, abundEstim
,
summary.dfunc
, print.dfunc
,
summary.abund
# Load example sparrow data (line transect survey type) data(sparrowDf) # Fit half-normal detection function dfunc <- sparrowDf |> dfuncEstim(formula=dist~groupsize(groupsize)) # Estimate abundance given a detection function fit <- abundEstim(object = dfunc , area = units::set_units(4105, "km^2") , ci = NULL) print(fit) summary(fit) ## Not run: # Bootstrap confidence intervals (500 iterations) # Requires ~4 min fit <- abundEstim(object = dfunc , area = units::set_units(4105, "km^2") , ci = 0.95 , plot.bs = TRUE , showProgress = TRUE) print(fit) summary(fit) ## End(Not run)
# Load example sparrow data (line transect survey type) data(sparrowDf) # Fit half-normal detection function dfunc <- sparrowDf |> dfuncEstim(formula=dist~groupsize(groupsize)) # Estimate abundance given a detection function fit <- abundEstim(object = dfunc , area = units::set_units(4105, "km^2") , ci = NULL) print(fit) summary(fit) ## Not run: # Bootstrap confidence intervals (500 iterations) # Requires ~4 min fit <- abundEstim(object = dfunc , area = units::set_units(4105, "km^2") , ci = 0.95 , plot.bs = TRUE , showProgress = TRUE) print(fit) summary(fit) ## End(Not run)
Print method for distance function objects produced
by dfuncEstim
.
## S3 method for class 'dfunc' print(x, ...)
## S3 method for class 'dfunc' print(x, ...)
x |
An estimated detection function object, normally
produced by calling |
... |
Included for compatibility with other print methods. Ignored here. |
The input distance function (x
) is returned invisibly.
dfuncEstim
, plot.dfunc
,
print.abund
, summary.dfunc
# Load example sparrow data (line transect survey type) data(sparrowSiteData) data(sparrowDetectionData) # Fit half-normal detection function sparrowDf <- RdistDf(sparrowSiteData, sparrowDetectionData) dfunc <- sparrowDf |> dfuncEstim(formula=dist~1) dfunc
# Load example sparrow data (line transect survey type) data(sparrowSiteData) data(sparrowDetectionData) # Fit half-normal detection function sparrowDf <- RdistDf(sparrowSiteData, sparrowDetectionData) dfunc <- sparrowDf |> dfuncEstim(formula=dist~1) dfunc
Optimization control parameters
are set by calls to options()
(see examples).
Optimization parameters used in
Rdistance
are the following:
Rdist_maxIters
: The maximum number of optimization
iterations allowed.
Rdist_evalMax
: The maximum number of objective function
evaluations allowed.
Rdist_likeTol
: Minimum change in the likelihood
between iterations required optimization to continue.
If the likelihood changes by less than this amount,
optimization stops and a solution is declared. Iteration
continues when likelihood changes exceed this value.
Rdist_coefTol
: Minimum change in model coefficients
between iterations for optimization to continue.
If the sum of squared coefficient differences changes
by less than this amount between iterations,
optimization stops and a solution is declared.
Rdist_optimizer
: A string specifying the optimizer
to use. Results can vary between optimizers, so
switching algorithms sometimes makes a poorly
behaved distance function converge. Valid
values are "optim" which uses optim::optim
,
and "nlminb" which uses stats:nlminb
. The authors
have had better luck with "nlminb" than "optim" and "nlminb"
runs noticeably faster. Problems with solutions near, but not on,
parameter boundaries may require use of "optim".
Rdist_hessEps
: A vector of parameter distances used during
computation of numeric second derivatives. These distances control
and determine variance estimates, and they may need revision when
the maximum likelihood solution is near a parameter boundary.
Should have length
1 or the number of parameters in the model. See function
secondDeriv
for further details.
Rdist_requireUnits
: A logical specifying whether measurement
units are required on distances and areas. If TRUE,
measurement units are required on off-transect and radial
distances in the input data frame. Likewise, measurement
units are required on truncation distances, scale location,
transect lengths, and study area size. If FALSE, no units are
required and input values are used as is. The FALSE options is
provided for rare cases when Rdistance
functions are called
from other functions and the calling functions do not accommodate
units.
Assign units with statement like units(detectionDf$dist) <- "m"
or units::set_units(w.hi, "km")
. Measurement units of
the various physical quantities need not
be equal because appropriate conversions occur internally.
An error is thrown if differing units are not compatible.
For example, "m" (meters) cannot be converted into "ha" (hectares),
but "acres" can be converted into "ha".
Rdistance
recognizes units listed in units::valid_udunits()
.
Rdist_maxBSFailPropForWarning
: The proportion of bootstrap
iterations that can fail without a warning. If the proportion
of non-convergent bootstrap iterations exceeds this
parameter, a warning about the validity of CI's is issued in
the abundance print method.
# increase number of iterations options(Rdist_maxIters=2000) # change optimizer and decrease tolerance options(list(Rdist_optimizer="optim", Rdist_likeTol=1e-6))
# increase number of iterations options(Rdist_maxIters=2000) # change optimizer and decrease tolerance options(list(Rdist_optimizer="optim", Rdist_likeTol=1e-6))
Makes an Rdistance
data frame from
separate transect and detection
data frames. Rdistance
data frames are nested
data frames with one row per transect. Detection
information for each transect appears in a list-based
column that itself contains a
data frame. See Rdistance Data Frames.
Rdistance
data frames can be constructed using calls to
dplyr::nest_by
and dplyr::right_jion
, with subsequent
attribute assignment (see Examples). This routine is
a convenient wrapper for those calls.
RdistDf( transectDf, detectionDf, by = NULL, pointSurvey = FALSE, observer = "single", .detectionCol = "detections", .effortCol = NULL )
RdistDf( transectDf, detectionDf, by = NULL, pointSurvey = FALSE, observer = "single", .detectionCol = "detections", .effortCol = NULL )
transectDf |
A data frame with one row per transect and
columns containing information about the entire transect.
At a minimum, this data frame must contain the transect's ID so
it can be merged with |
detectionDf |
A data frame containing detection information associated with each transect. At a minimum, each row of this data frame must contain the following:
Optional columns in 'detectionDf':
|
by |
A character vector of variables to use in the join. The right-hand
side of this join identifies unique transects and will specify unique
rows in both |
pointSurvey |
If TRUE, observations were made from discrete points (i.e., during a point-transect survey) and distances are radial from observation point to target. If FALSE, observations were made along a continuous transect (i.e., during a line-transect survey) and distances are from target to nearest point on the transect (i.e., perpendicular to transect). |
observer |
Type of observer system. Legal values are "single" for single observer systems, "1given2" for a double observer system wherein observations made by observer 1 are tested against observations made by observer 2, "2given1" for a double observer system wherein observations made by observer 2 are tested against observations made by observer 1, and "both" for a double observer system wherein observations made by both observers are tested against the other and combined. |
.detectionCol |
Name of the list column that will contain detection data frames. Default name is "detections". Detection distances (LHS of 'dfuncEstim' formula) and group sizes are normally columns in the nested detection data frames embedded in '.detectionCol'. |
.effortCol |
For continuous line transects,
this specifies the name of a column in |
For valid bootstrap estimates of confidence intervals (computed in abundEstim
),
each row of the nested data frame must represent one transect (more generally,
one sampling unit), and none should
be duplicated. The combination of transect columns
in by
(i.e., the LHS of the merge, or "a" and "b" of
by = c("a" = "d", "b" = "c")
for example)
should specify unique transects and unique rows of
transectDf
. Warning: If by
does not specify unique rows of transectDf
, dplyr::left_join
,
which is called internally, will perform a many-to-many merge without
warning, and this normally duplicates both
transects and detections.
A nested tibble (a generalization of base data frames)
with one row per transect, and detection
information in a list column. Technically, the return is
a grouped tibble
from
the tibble
package with one row per group, and a list
column containing detection information.
Survey type, observer system, and name of the effort column
are recorded
as attributes (transType
, obsType
, and effortColumn
, respectfully).
The return prints nicely using methods
in package tibble
. If returned objects print strangely,
attach library tibble
. A summary method tailored to distance sampling
is available (i.e., summary(return)
).
RdistDf
data frames contain the following information:
Transect Information: Each row of the data frame contains transect id and effort. Effort is transect length for line-transects, and number of points for point-transects. Optionally, transect-level covariates (such as habitat or observer id) appear on each row.
Detection Information: Observation distances (either perpendicular off-transect or radial off-point) appear in a data frame stored in a list column. If detected groups occasionally included more than one target, a group size column must be present in the list-column data frame. Optionally, detection-level covariates (such as sex or size) can appear in the data frame of the list column.
Distance Type: The type of observation distances, either
perpendicular off-transect (for line-transects studies) or
radial off-point (for point-transect studies) must appear as an
attribute of RdistDf
's.
Observer Type: The type of observation system used, either
single observer or one of three types of multiple observer systems, must
appear as an attribute of RdistDf
's.
Line-transects are continuous paths with targets detectable at any point. Point transects consist of one or more discrete points along a path from which observers search for targets. The length of a line-transect is it's physical length (e.g., km or miles). The 'length' of a point transect is the number of points along the transect. Single points are considered transects of length one. The length of line-transects must have a physical measurement unit (e.g., 'm' or 'ft'). The length of point-transects must be a unit-less integers (i.e., number of points).
As of Rdistance
version 3.0.0, measurement units are
require on all physical distances.
Requiring units ensures that internal calculations and results
(e.g., ESW and abundance) are correct
and that output units are clear.
Physical distances are required on
off-transect distances, radial distances, truncation distances
(w.lo
, unless it is zero; and w.hi
, unless it is NULL),
scale locations (x.scl
, unless it is zero),
line-transect lengths, and study area size. All units are
1-dimensional except those on study area, which are 2-dimensional.
Physical measurement units can vary. For example,
off-transect distances can be meters ("m"), w.hi
can be inches ("in"),
and w.lo
can be kilometers ("km"). Internally, all distances are
converted to the units specified by outputUnits
(or the units of input distances if
outputUnits
is NULL), and
all output is reported
in units of outputUnits
. Valid conversions must exist between
units or an error is thrown. For example, meters cannot be converted
into hectares.
Measurement units can be assigned using
units()<-
after attaching the units
package or with x <- units::set_units(x, "<units>")
.
See units::valid_udunits()
for a list of valid symbolic units.
If measurements are truly unit-less, or measurement units are unknown,
set options(Rdist_requireUnits = FALSE)
. This suppresses
all unit checks and conversions. Users are on their own
to make sure inputs are scaled correctly and that output units are known.
is.RdistDf
: check validity of RdistDf data frames;
dfuncEstim
: estimate distance function.
data(sparrowSiteData) data(sparrowDetectionData) sparrowDf <- RdistDf( sparrowSiteData, sparrowDetectionData ) is.RdistDf(sparrowDf, verbose = T) summary(sparrowDf) summary(sparrowDf , formula = dist ~ groupsize(groupsize) , w.hi = units::set_units(100, "m")) # Equivalent to above: sparrowDf <- sparrowDetectionData |> dplyr::nest_by( siteID , .key = "detections") |> dplyr::right_join(sparrowSiteData, by = "siteID") attr(sparrowDf, "detectionColumn") <- "detections" attr(sparrowDf, "effortColumn") <- "length" attr(sparrowDf, "obsType") <- "single" attr(sparrowDf, "transType") <- "line" is.RdistDf(sparrowDf, verbose = T) summary(sparrowDf, formula = dist ~ groupsize(groupsize)) # Condensed view: 1 row per transect (make sure tibble is attached) sparrowDf # Expansion methods: # (1) use Rdistance::unnest (includes zero transects) df1 <- unnest(sparrowDf) any( df1$siteID == "B2" ) # TRUE # Use tidyr::unnest(); but, no zero transects df2 <- tidyr::unnest(sparrowDf, cols = "detections") any( df2$siteID == "B2" ) # FALSE # Use dplyr::reframe for specific transects (e.g., for transect "B3") sparrowDf |> dplyr::filter(siteID == "B3") |> dplyr::reframe(detections) # Count detections per transect (can't use dplyr::if_else) df3 <- sparrowDf |> dplyr::reframe(nDetections = ifelse(is.null(detections), 0, nrow(detections))) sum(df3$nDetections) # Number of detections sum(df3$nDetections == 0) # Number of zero transects # Point transects data(thrasherDetectionData) data(thrasherSiteData) thrasherDf <- RdistDf( thrasherSiteData , thrasherDetectionData , pointSurvey = TRUE , by = "siteID" , .detectionCol = "detections") summary(thrasherDf, formula = dist ~ groupsize(groupsize))
data(sparrowSiteData) data(sparrowDetectionData) sparrowDf <- RdistDf( sparrowSiteData, sparrowDetectionData ) is.RdistDf(sparrowDf, verbose = T) summary(sparrowDf) summary(sparrowDf , formula = dist ~ groupsize(groupsize) , w.hi = units::set_units(100, "m")) # Equivalent to above: sparrowDf <- sparrowDetectionData |> dplyr::nest_by( siteID , .key = "detections") |> dplyr::right_join(sparrowSiteData, by = "siteID") attr(sparrowDf, "detectionColumn") <- "detections" attr(sparrowDf, "effortColumn") <- "length" attr(sparrowDf, "obsType") <- "single" attr(sparrowDf, "transType") <- "line" is.RdistDf(sparrowDf, verbose = T) summary(sparrowDf, formula = dist ~ groupsize(groupsize)) # Condensed view: 1 row per transect (make sure tibble is attached) sparrowDf # Expansion methods: # (1) use Rdistance::unnest (includes zero transects) df1 <- unnest(sparrowDf) any( df1$siteID == "B2" ) # TRUE # Use tidyr::unnest(); but, no zero transects df2 <- tidyr::unnest(sparrowDf, cols = "detections") any( df2$siteID == "B2" ) # FALSE # Use dplyr::reframe for specific transects (e.g., for transect "B3") sparrowDf |> dplyr::filter(siteID == "B3") |> dplyr::reframe(detections) # Count detections per transect (can't use dplyr::if_else) df3 <- sparrowDf |> dplyr::reframe(nDetections = ifelse(is.null(detections), 0, nrow(detections))) sum(df3$nDetections) # Number of detections sum(df3$nDetections == 0) # Number of zero transects # Point transects data(thrasherDetectionData) data(thrasherSiteData) thrasherDf <- RdistDf( thrasherSiteData , thrasherDetectionData , pointSurvey = TRUE , by = "siteID" , .detectionCol = "detections") summary(thrasherDf, formula = dist ~ groupsize(groupsize))
Computes numeric second derivatives (hessian) of an arbitrary multidimensional function at a particular location.
secondDeriv(x, FUN, eps = 1e-08, ...)
secondDeriv(x, FUN, eps = 1e-08, ...)
x |
The location (a vector) where the second derivatives
of |
FUN |
An R function for which the second derivatives are
sought.
This must be a function of the form FUN <- function(x, ...){...}
where x is a vector of variable parameters to FUN at which
to evaluate the 2nd derivative,
and ... are additional parameters needed to evaluate the function.
FUN must return a single value (scalar), the height of the
surface above |
eps |
A vector of small relative
distances to add to One might want to change |
... |
Any arguments passed to |
This function uses the "5-point" numeric second derivative
method advocated in numerous numerical recipe texts. During computation
of the 2nd derivative, FUN must be
capable of being evaluated at numerous locations within a hyper-ellipsoid
with cardinal radii 2*x
*(eps
)^0.25 = 0.02*x
at the
default value of eps
.
A handy way to use this function is to call an optimization routine
like nlminb
with FUN, then call this function with the
optimized values (solution) and FUN. This will yield the hessian
at the solution and this is can produce a better
estimate of the variance-covariance
matrix than using the hessian returned by some optimization routines.
Some optimization routines return the hessian evaluated
at the next-to-last step of optimization.
An estimate of the variance-covariance matrix, which is used in
Rdistance
, is solve(hessian)
where hessian
is
secondDeriv(<parameter estimates>, <likelihood>)
.
func <- function(x){-x*x} # second derivative should be -2 secondDeriv(0,func) secondDeriv(3,func) func <- function(x){3 + 5*x^2 + 2*x^3} # second derivative should be 10+12x secondDeriv(0,func) secondDeriv(2,func) func <- function(x){x[1]^2 + 5*x[2]^2} # should be rbind(c(2,0),c(0,10)) secondDeriv(c(1,1),func)
func <- function(x){-x*x} # second derivative should be -2 secondDeriv(0,func) secondDeriv(3,func) func <- function(x){3 + 5*x^2 + 2*x^3} # second derivative should be 10+12x secondDeriv(0,func) secondDeriv(2,func) func <- function(x){x[1]^2 + 5*x[2]^2} # should be rbind(c(2,0),c(0,10)) secondDeriv(c(1,1),func)
Computes simple polynomial expansion terms used in the likelihood of a distance analysis. More generally, will compute polynomial expansions of any numeric vector.
simple.expansion(x, expansions)
simple.expansion(x, expansions)
x |
In a distance analysis, |
expansions |
A scalar specifying the number of expansion terms to compute. Must be one of the integers 1, 2, 3, or 4. |
The polynomials computed here are:
First term:
Second term:
Third term:
Fourth term:
The maximum number of expansion terms computed is 4.
A matrix of size length(x)
X expansions
. The columns of this matrix are
the Simple polynomial expansions of x
. Column 1 is the first expansion term of x
,
column 2 is the second x
, and so on up to expansions
.
dfuncEstim
, cosine.expansion
, hermite.expansion
x <- seq(0, 1, length = 200) simp.expn <- simple.expansion(x, 4) plot(range(x), range(simp.expn), type="n") matlines(x, simp.expn, col=rainbow(4), lty = 1)
x <- seq(0, 1, length = 200) simp.expn <- simple.expansion(x, 4) plot(range(x), range(simp.expn), type="n") matlines(x, simp.expn, col=rainbow(4), lty = 1)
Detection data from line transect surveys for Brewer's sparrow on 72 transects located on a 4105 km^2 study area in central Wyoming. Data were collected by Dr. Jason Carlisle of the Wyoming Cooperative Fish & Wildlife Research Unit in 2012. Each transect was 500 meters long.
A data.frame containing 356 rows and 5 columns. Each row represents a detected group of sparrows. Column descriptions:
siteID
: Factor (72 levels), the site or transect where the detection
was made.
groupsize
: Number, the number of individuals within
the detected group.
sightdist
: Number, distance (m) from the observer to the detected group.
sightangle
: Number, the angle (degrees) from the transect
line to the detected group.
dist
: Number, the perpendicular, off-transect distance (m) from the
transect to the detected group. This is the distance used in analysis.
Calculated using perpDists
.
The Brewer's sparrow data are a subset of the data collected by Jason Carlisle and various field technicians for his Ph.D. from the Department of Ecology, University of Wyoming, in 2017. This portion of Jason's work was funded by the Wyoming Game and Fish Department through agreements with the University of Wyoming's Cooperative Fish & Wildlife Research Unit (2012).
Carlisle, J.D. 2017. The effect of sage-grouse conservation on wildlife species of concern: implications for the umbrella species concept. Dissertation. University of Wyoming, Laramie, Wyoming, USA.
Carlisle, J. D., and A. D. Chalfoun. 2020. The abundance of Greater Sage-Grouse as a proxy for the abundance of sagebrush-associated songbirds in Wyoming, USA. Avian Conservation and Ecology 15(2):16. doi:10.5751/ACE-01702-150216
Detection data from line transect surveys for Brewer's sparrow on 72 transects located on a 4105 km^2 study area in central Wyoming collected by Dr. Jason Carlisle as part of his graduate work in the Wyoming Cooperative Fish & Wildlife Research Unit in 2012. Each transect was 500 meters long.
A rowwise tibble containing 72 rows and 9 columns, one of which
is nested data frame of detections. Each row represents
one transect. The embedded data frame in column detections
contains the detections made on the transect represented on that row.
Column descriptions:
siteID
: Factor (72 levels), the transect identifier
for that row of the data frame.
length
: The length, in meters [m], of each transect.
observer
: Identity of the observer who surveyed the transect.
bare
: The mean bare ground cover (%) within 100 [m] of the transect.
herb
: The mean herbaceous cover (%) within 100 [m] of the transect.
shrub
: The mean shrub cover (%) within 100 [m] of the transect.
height
: The mean shrub height [cm] within 100 [m] of the transect.
shrubclass
: Shrub class factor. Either "Low"" when
shrub cover is < 10%, or "High" if cover >= 10%.
The embedded data frame in column detections
contains the following
variables:
groupsize
: The number of individuals in the detected group.
sightdist
: Distance [m] from observer to the detected group.
sightangle
: Angle [degrees] from the transect
line to the detected group. Not bearing. Range 0 [degrees] to 90 [degrees].
dist
: Perpendicular, off-transect distance [m], from the
transect to the detected group. This is the distance used in analysis.
Calculated using perpDists
.
The Brewer's sparrow data are a subset of data collected by Jason Carlisle and various field technicians for his Ph.D. from the Department of Ecology, University of Wyoming, in 2017. This portion of Jason's work was funded by the Wyoming Game and Fish Department through agreements with the University of Wyoming's Cooperative Fish & Wildlife Research Unit (2012).
Carlisle, J.D. 2017. The effect of sage-grouse conservation on wildlife species of concern: implications for the umbrella species concept. Dissertation. University of Wyoming, Laramie, Wyoming, USA.
Carlisle, J. D., and A. D. Chalfoun. 2020. The abundance of Greater Sage-Grouse as a proxy for the abundance of sagebrush-associated songbirds in Wyoming, USA. Avian Conservation and Ecology 15(2):16. doi:10.5751/ACE-01702-150216
sparrowSiteData
, sparrowDetectionData
,
RdistDf
## Not run: # The following code generated 'sparrowDf' data(sparrowDetectionData) data(sparrowSiteData) sparrowDf <- RdistDf(transectDf = sparrowSiteData , detectionDf = sparrowDetectionData , by = "siteID" , pointSurvey = FALSE , .effortCol = "length" ) ## End(Not run) data(sparrowDf) tidyr::unnest(sparrowDf, detections) # only non-zero transects Rdistance::unnest(sparrowDf) # with zero transects at the bottom summary(sparrowDf, formula = dist ~ groupsize(groupsize) )
## Not run: # The following code generated 'sparrowDf' data(sparrowDetectionData) data(sparrowSiteData) sparrowDf <- RdistDf(transectDf = sparrowSiteData , detectionDf = sparrowDetectionData , by = "siteID" , pointSurvey = FALSE , .effortCol = "length" ) ## End(Not run) data(sparrowDf) tidyr::unnest(sparrowDf, detections) # only non-zero transects Rdistance::unnest(sparrowDf) # with zero transects at the bottom summary(sparrowDf, formula = dist ~ groupsize(groupsize) )
Pre-estimated Brewer's sparrow detection function that included and 'observer' effect. Included to speed up example execution times. See 'Examples'.
An estimated distance function object with
class 'dfunc'. See 'Value' section of
dfuncEstim
for
description of components.
sparrowSiteData
and
sparrowDetectionData
for description of the data
## Not run: # the following code was used to generate 'sparrowDfuncObserver' data(sparrowDf) sparrowDfuncObserver <- sparrowDf |> dfuncEstim(formula = dist ~ observer , likelihood = "hazrate") ## End(Not run)
## Not run: # the following code was used to generate 'sparrowDfuncObserver' data(sparrowDf) sparrowDfuncObserver <- sparrowDf |> dfuncEstim(formula = dist ~ observer , likelihood = "hazrate") ## End(Not run)
Site data from line transect surveys for Brewer's sparrow on 72 transects located on a 4105 km^2 study area in central Wyoming. Data were collected by Dr. Jason Carlisle of the Wyoming Cooperative Fish & Wildlife Research Unit in 2012. Each transect was 500 meters long.
A data.frame containing 72 rows and 8 columns. Each row represents a site (transect) surveyed. Column descriptions:
siteID
: Factor (72 levels), the site or transect surveyed.
length
: Number, the length (m) of each transect.
observer
: Factor (five levels), identity of the observer
who surveyed the transect.
bare
: Number, the mean bare ground cover (%) within 100 m of each transect.
herb
: Number, the mean herbaceous cover (%) within 100 m of
each transect.
shrub
: Number, the mean shrub cover (%) within
100 m of each transect.
height
: Number, the mean shrub height
(cm) within 100 m of each transect.
shrubclass
: Factor (two levels), shrub class is "Low"" when
shrub cover is < 10%, "High" otherwise.
The Brewer's sparrow data are a subset of the data collected by Jason Carlisle and various field technicians for his Ph.D. from the Department of Ecology, University of Wyoming, in 2017. This portion of Jason's work was funded by the Wyoming Game and Fish Department through agreements with the University of Wyoming's Cooperative Fish & Wildlife Research Unit (2012).
Carlisle, J.D. 2017. The effect of sage-grouse conservation on wildlife species of concern: Implications for the umbrella species concept. Dissertation. University of Wyoming, Laramie, Wyoming, USA.
Carlisle, J. D., and A. D. Chalfoun. 2020. The abundance of Greater Sage-Grouse as a proxy for the abundance of sagebrush-associated songbirds in Wyoming, USA. Avian Conservation and Ecology 15(2):16. doi:10.5751/ACE-01702-150216
Returns starting values and limits (boundaries) for the parameters of
distance functions. This function is called by
other routines in Rdistance
, and is not intended to
be called by the user. Replace this function in the global
environment to change boundaries and starting values.
startLimits(ml)
startLimits(ml)
ml |
Either a Rdistance 'model frame' or an Rdistance
'fitted object'. Both are of class "dfunc".
Rdistance 'model frames' are lists containing components
necessary to estimate a distance function, but no estimates.
Rdistance 'model frames' are typically
produced by calls to |
A list containing the following components
start |
Vector of starting values for parameters of the likelihood and expansion terms. |
lowlimit |
Vector of lower limits for the likelihood parameters and expansion terms. |
uplimit |
Vector of upper limits for the likelihood parameters and expansion terms. |
names |
Vector of names for the likelihood parameters and expansion terms. |
The length of each vector in the return is:
(Num expansions) + 1 + 1*(like %in% c("hazrate")) + (Num Covars)
.
data(sparrowDf) # Half-normal start limits modList <- parseModel( data = sparrowDf , formula = dist ~ 1 , likelihood = "halfnorm" ) startLimits(modList) # Half-normal with expansions modList <- parseModel( data = sparrowDf , formula = dist ~ 1 , likelihood = "halfnorm" , expansions = 3 ) startLimits(modList) # Hazard rate start limits modList$likelihood <- "hazrate" startLimits(modList) # Neg exp start limits modList$likelihood <- "negexp" startLimits(modList)
data(sparrowDf) # Half-normal start limits modList <- parseModel( data = sparrowDf , formula = dist ~ 1 , likelihood = "halfnorm" ) startLimits(modList) # Half-normal with expansions modList <- parseModel( data = sparrowDf , formula = dist ~ 1 , likelihood = "halfnorm" , expansions = 3 ) startLimits(modList) # Hazard rate start limits modList$likelihood <- "hazrate" startLimits(modList) # Neg exp start limits modList$likelihood <- "negexp" startLimits(modList)
Summarize an object of class c("abund","dfunc")
that is output by abundEstim
.
## S3 method for class 'abund' summary(object, criterion = "AICc", ...)
## S3 method for class 'abund' summary(object, criterion = "AICc", ...)
object |
An Rdistance model frame or fitted distance function,
normally produced by a call to |
criterion |
A string specifying the model fit criterion to print.
Must be one of "AICc" (the default),
"AIC", or "BIC". See |
... |
Included for compatibility to other print methods. Ignored here. |
If the proportion of bootstrap iterations that failed is
greater than getOption("Rdistance_maxBSFailPropForWarning")
,
a warning about the validity of CI's is issued and
a diagnostic message printed. Increasing this option to a number greater
than 1 will kill the warning (e.g., options(Rdistance_maxBSFailPropForWarning = 1.3)
),
but ignoring a large number of non-convergent
bootstrap iterations may be a bad idea (i.e., validity of the CI is
questionable). The default value for Rdistance_maxBSFailPropForWarning
is 0.2.
0 is invisibly returned.
dfuncEstim
, abundEstim
,
summary.dfunc
, print.dfunc
,
print.abund
# Load example sparrow data (line transect survey type) data(sparrowDf) # Fit half-normal detection function dfunc <- sparrowDf |> dfuncEstim(formula=dist ~ 1 + offset(groupsize)) # Estimate abundance given the detection function fit <- abundEstim(dfunc , area = units::set_units(4105, "km^2") , ci=NULL) summary(fit) # No confidence intervals ## Not run: # With bootstrap confidence intervals # Requires ~3 min to complete fit <- abundEstim(dfunc , area = units::set_units(4105, "km^2") , ci=0.95) summary(fit) ## End(Not run)
# Load example sparrow data (line transect survey type) data(sparrowDf) # Fit half-normal detection function dfunc <- sparrowDf |> dfuncEstim(formula=dist ~ 1 + offset(groupsize)) # Estimate abundance given the detection function fit <- abundEstim(dfunc , area = units::set_units(4105, "km^2") , ci=NULL) summary(fit) # No confidence intervals ## Not run: # With bootstrap confidence intervals # Requires ~3 min to complete fit <- abundEstim(dfunc , area = units::set_units(4105, "km^2") , ci=0.95) summary(fit) ## End(Not run)
A summary method for distance functions.
Distance functions are produced by
dfuncEstim
(class dfunc
).
## S3 method for class 'dfunc' summary(object, criterion = "AICc", ...)
## S3 method for class 'dfunc' summary(object, criterion = "AICc", ...)
object |
An Rdistance model frame or fitted distance function,
normally produced by a call to |
criterion |
A string specifying the model fit criterion to print.
Must be one of "AICc" (the default),
"AIC", or "BIC". See |
... |
Included for compatibility with other print methods. Ignored here. |
This function prints the following quantities:
‘Call’ : The original function call.
‘Coefficients’ : A matrix of estimated coefficients, their standard errors, and Wald Z tests.
‘Strip’ : The left (w.lo
) and right (w.hi
) truncation values.
‘Effective strip width or detection radius’ : ESW or EDR as computed by effectiveDistance
.
‘Probability of Detection’ : Probability of detecting a single target in the strip.
‘Scaling’ : The horizontal and vertical coordinates used to scale the distance function. Usually, the horizontal coordinate is 0 and the vertical coordinate is 1 (i.e., g(0) = 1).
‘Log likelihood’ : Value of the maximized log likelihood.
‘Criterion’ : Value of the specified fit criterion (AIC, AICc, or BIC).
The number of digits used in the printout is
controlled by options()$digits
.
The input distance function object (object
), invisibly,
with the following additional components:
convMessage
: The convergence message. If the distance function
is smoothed, the convergence message is NULL.
effDistance
: The ESW or EDR.
pDetect
: Probability of detection in the strip.
AIC
: AICc, AIC, or BIC of the fit, whichever was requested.
coefficients
: If the distance function has coefficients, this
is the coefficient matrix with standard errors, Wald Z values, and p values.
If the distance function is smoothed, it has no coefficients and this component
is NULL.
dfuncEstim
, plot.dfunc
,
print.abund
, print.abund
# Load example sparrow data (line transect survey type) data(sparrowDf) # Fit half-normal detection function dfunc <- sparrowDf |> dfuncEstim(formula=dist~1) # Print results summary(dfunc) summary(dfunc, criterion="BIC")
# Load example sparrow data (line transect survey type) data(sparrowDf) # Fit half-normal detection function dfunc <- sparrowDf |> dfuncEstim(formula=dist~1) # Print results summary(dfunc) summary(dfunc, criterion="BIC")
Summary method for distance sampling data frames.
Rdistance
data frames are rowwise tibbles. This routine is a
replacement summary method for rowwise_df's that
provides useful distance sampling descriptive statistics.
## S3 method for class 'rowwise_df' summary(object, formula = NULL, w.lo = 0, w.hi = NULL, ...)
## S3 method for class 'rowwise_df' summary(object, formula = NULL, w.lo = 0, w.hi = NULL, ...)
object |
An |
formula |
A standard formula object. For example, |
w.lo |
Lower or left-truncation limit of the distances in distance data.
This is the minimum possible off-transect distance. Default is 0. If
|
w.hi |
Upper or right-truncation limit of the distances
in |
... |
Other arguments for summary methods. |
If object
is an RdistDf
, a data frame
containing summary statistics relevant to distance sampling is returned
invisibly.
If formula
is not specified, the number of distance observations
and target detections is not returned because the distances, group sizes,
and covariates are not known.
If object
is not an Rdistance data frame, return is the result of
the next summary method.
data(thrasherDf) summary(thrasherDf) summary(thrasherDf , formula = dist ~ groupsize(groupsize) , w.hi = units::set_units(100,"m") )
data(thrasherDf) summary(thrasherDf) summary(thrasherDf , formula = dist ~ groupsize(groupsize) , w.hi = units::set_units(100,"m") )
Point transect data collected in central Wyoming from 120 points surveyed for Sage Thrashers by the Wyoming Cooperative Fish & Wildlife Research Unit in 2013.
A data.frame containing 193 rows and 3 columns. Each row represents a detected group of thrashers. Column descriptions:
siteID
: Factor (120 levels), the site or point where the detection
was made.
groupsize
: Number, the number of individuals within
the detected group.
dist
: Number, the radial distance (m) from
the transect to the detected group. This is the distance used in analysis.
The Sage Thrasher data are a subset of the data collected by Jason Carlisle and various field technicians for his Ph.D. from the Department of Ecology, University of Wyoming, in 2017. This portion of Jason's work was funded by the Wyoming Game and Fish Department through agreements with the University of Wyoming's Cooperative Fish & Wildlife Research Unit (2012).
Carlisle, J.D. 2017. The effect of sage-grouse conservation on wildlife species of concern: implications for the umbrella species concept. Dissertation. University of Wyoming, Laramie, Wyoming, USA.
Carlisle, J. D., A. D. Chalfoun, K. T. Smith, and J. L. Beck. 2018. Nontarget effects on songbirds from habitat manipulation for Greater Sage-Grouse: Implications for the umbrella species concept. The Condor: Ornithological Applications 120:439–455. doi:10.1650/CONDOR-17-200.1
Point transect data collected in central Wyoming on 120 points surveyed for Sage Thrashers by the Wyoming Cooperative Fish & Wildlife Research Unit in 2013.
A rowwise tibble containing 120 rows and 8 columns, one of which (i.e., 'detections') contains nested data frames of detections. Each row represents one transect of one point.
A data.frame containing 120 rows and 6 columns. Each row represents a surveyed site. Each surveyed site is considered one transect of one point. Column descriptions:
siteID
: Factor (120 levels), the site or point surveyed.
detections
: An embedded (nested) data frame containing
detections made at that point. Columns in the embedded data frame contain:
groupsize
: The number of individuals in
the detected group.
dist
: The radial distance (m) from
the transect to the detected group.
observer
: Factor (six levels), identity of the observer who surveyed
the point.
bare
: Number, the mean bare ground cover (%)
within 100 m of each point.
herb
: Number, the mean herbaceous
cover (%) within 100 m of each point.
shrub
: Number, the mean
shrub cover (%) within 100 m of each point.
height
: Number,
the mean shrub height (cm) within 100 m of each point.
npoints
: The number of point counts on the transect.
The sage thrasher data are a subset of data collected by Jason Carlisle and various field technicians for his Ph.D. from the Department of Ecology, University of Wyoming, in 2017. This portion of Jason's work was funded by the Wyoming Game and Fish Department through agreements with the University of Wyoming's Cooperative Fish & Wildlife Research Unit (2012).
Carlisle, J.D. 2017. The effect of sage-grouse conservation on wildlife species of concern: implications for the umbrella species concept. Dissertation. University of Wyoming, Laramie, Wyoming, USA.
Carlisle, J. D., A. D. Chalfoun, K. T. Smith, and J. L. Beck. 2018. Nontarget effects on songbirds from habitat manipulation for Greater Sage-Grouse: Implications for the umbrella species concept. The Condor: Ornithological Applications 120:439–455. doi:10.1650/CONDOR-17-200.1
thrasherSiteData
, thrasherDetectionData
,
RdistDf
data(thrasherDf) is.RdistDf(thrasherDf) summary(thrasherDf, formula = dist ~ groupsize(groupsize) )
data(thrasherDf) is.RdistDf(thrasherDf) summary(thrasherDf, formula = dist ~ groupsize(groupsize) )
Point transect data collected in central Wyoming from 120 points surveyed for Sage Thrashers by the Wyoming Cooperative Fish & Wildlife Research Unit in 2013.
A data.frame containing 120 rows and 6 columns. Each row represents a surveyed site (point). Column descriptions:
siteID
: Factor (120 levels), the site or point surveyed.
observer
: Factor (six levels), identity of the observer who surveyed
the point.
bare
: Number, the mean bare ground cover (%)
within 100 m of each point.
herb
: Number, the mean herbaceous
cover (%) within 100 m of each point.
shrub
: Number, the mean
shrub cover (%) within 100 m of each point.
height
: Number,
the mean shrub height (cm) within 100 m of each point.
The Sage Thrasher data are a subset of data collected by Jason Carlisle and field technicians for his Ph.D. from the Department of Ecology, University of Wyoming, in 2017. This portion of Jason's work was funded by the Wyoming Game and Fish Department through agreements with the University of Wyoming's Cooperative Fish & Wildlife Research Unit (2012).
Carlisle, J.D. 2017. The effect of sage-grouse conservation on wildlife species of concern: implications for the umbrella species concept. Dissertation. University of Wyoming, Laramie, Wyoming, USA.
Carlisle, J. D., A. D. Chalfoun, K. T. Smith, and J. L. Beck. 2018. Nontarget effects on songbirds from habitat manipulation for Greater Sage-Grouse: Implications for the umbrella species concept. The Condor: Ornithological Applications 120:439–455. doi:10.1650/CONDOR-17-200.1
Return the type of transects represented in either a fitted distance function or Rdistance data frame.
transectType(x)
transectType(x)
x |
Either an estimated distance function, output by
|
This function is a simple helper function. If x
is an
estimated distance object, it polls the transType
attribute
of x
's Rdistance nested data frame.
If x
is an Rdistance nested data frame, it
polls the transType
attribute.
A scalar. Either 'line' if x
contains
continuous line-transect detections, or 'point' if x
contains
point-transects detections. If transect type has not been assigned,
return is NULL.
Unnest an RdistDf data frame by expanding the embedded 'detections' column. This unnest includes the so-called zero transects (transects without detections).
unnest(data, ...)
unnest(data, ...)
data |
An |
... |
Additional arguments passed to |
An expanded data frame, without embedded data frames. Rows in the return represent with one detection or one transect. If multiple detections were made on one transect, the transect will appear on multiple rows. If no detections were made on a transect, it will appear on one row with NA detection distance.
data('sparrowDf') # tidyr::unnest() does not include zero transects detectionDf <- tidyr::unnest(sparrowDf, detections) nrow(detectionDf) any(detectionDf$siteID == "B2") # Rdistance::unnest() includes zero transects fullDf <- unnest(sparrowDf) nrow(fullDf) any(fullDf$siteID == "B2")
data('sparrowDf') # tidyr::unnest() does not include zero transects detectionDf <- tidyr::unnest(sparrowDf, detections) nrow(detectionDf) any(detectionDf$siteID == "B2") # Rdistance::unnest() includes zero transects fullDf <- unnest(sparrowDf) nrow(fullDf) any(fullDf$siteID == "B2")