| Title: | Density and Abundance from Distance-Sampling Surveys |
|---|---|
| Description: | Distance-sampling (<doi:10.1007/978-3-319-19219-2>) is a field survey and analytical method that estimates density and abundance of survey targets (e.g., animals) when detection probability declines with observation distance. Distance-sampling is popular in ecology, especially when survey targets are observed from aerial platforms (e.g., airplane or drone), surface vessels (e.g., boat or truck), or along walking transects. Analysis involves fitting smooth (parametric) curves to histograms of observation distances and using those functions to adjust density estimates for missed targets. Routines included here fit curves to observation distance histograms, estimate effective sampling area, density of targets in surveyed areas, and the abundance of targets in a surrounding study area. Confidence interval estimation uses built-in bootstrap resampling. Help files are extensive and have been vetted by multiple authors. Many tutorials are available on the package's website (URL below). |
| Authors: | Trent McDonald [cre, aut], Jason Carlisle [aut], Aidan McDonald [aut] (point transect methods), Ryan Nielson [ctb] (smoothed likelihood), Ben Augustine [ctb] (maximization method), James Griswald [ctb] (maximization method), Patrick McKann [ctb] (maximization method), Lacey Jeroue [ctb] (vignettes), Hoffman Abigail [ctb] (vignettes), Kleinsausser Michael [ctb] (vignettes), Joel Reynolds [ctb] (Gamma likelihood), Pham Quang [ctb] (Gamma likelihood), Earl Becker [ctb] (Gamma likelihood), Aaron Christ [ctb] (Gamma likelihood), Brook Russelland [ctb] (Gamma likelihood), Stefan Emmons [ctb] (Automated tests), Will McDonald [ctb] (Automated tests), Reid Olson [ctb] (Automated tests and bug fixes) |
| Maintainer: | Trent McDonald <[email protected]> |
| License: | GNU General Public License |
| Version: | 4.4.2 |
| Built: | 2026-05-09 21:41:43 UTC |
| Source: | https://github.com/tmcd82070/rdistance |
Rdistance contains functions and associated routines to analyze
distance-sampling data collected on point or line transects.
Some of Rdistance's features include:
Accommodation of both point and line transect analyses in one routine (dfuncEstim()).
Regression-like formula for inclusion of distance function covariates (dfuncEstim()).
Automatic bootstrap confidence intervals (abundEstim()).
Parallel processing of bootstrap iterations (parallel argument of abundEstim()).
Availability of both study-area and site-level abundance estimates (help("predict.dfunc")).
Rigorous physical measurement requirements and automated conversion when necessary (%#%, setUnits()).
Classic parametric distance functions (halfnorm.like(), hazrate.like(), negexp.like()), and
expansion functions (cosine.expansion(), hermite.expansion(), simple.expansion()).
Mixture distance functions for non-standard shapes and thresholds (oneStep.like(), triangle.like(), and huber.like()).
Automated distance function fitting and selection autoDistSamp().
print, plot, predict, coef, and summary methods for distance function objects and
abundance classes.
Distance-sampling is a popular method for abundance estimation in ecology. Line transect surveys are conducted by traversing randomly placed transects in a study area with the objective of sighting animals and estimating density or abundance. Data collected during line transect surveys consists of target sightings, either of individuals or groups, and off-transect distances to the original location of the target. When targets are sighted in groups, data include the number of individuals in the group.
Point transect surveys are similar except that observers stop one or more times along the transect to observe targets. Point transects are popular avian survey methods where detections are often auditory cues. Point transects are also for studies using automated auditory detectors or trail cameras. Point transect data consists of radial distances from the observer to the target.
The defining feature of distance sampling is the tendency for probability of detection to decline as off-transect or radial distances increase. Targets far from the observer are generally harder to detect than those closer. In most line transect studies, targets on the transect (off-transect distance = 0) are assumed to be sighted with 100% probability. This assumption allows researchers to estimate the proportion of missed targets and in turn adjust the number of sighted targets for missed detections. Some studies utilize two observers searching the same areas and are able to estimate the proportion of individuals missed on the transect line and thereby eliminate the assumption that all individuals on the line have been observed.
The author's aims are to provide an easy-to-use, rigorous, and flexible analysis option in R for distance-sampling data. The authors believe that beginning users need easy-to-use and easy-to-understand software, while advanced users require greater flexibility and customization, and their aim is to meet the demands of both groups.
Rdistance contains the following example data sets:
Line-transect sampling of Brewers sparrows in central Wyoming
(sparrowDf()).
Point-transect sampling of Sage Thrashers in central Wyoming
(thrasherDf()).
Buckland, S.T., Anderson, D.R., Burnham, K.P. and Laake, J.L. 1993. Distance Sampling: Estimating Abundance of Biological Populations. Chapman and Hall, London.
Main author and maintainer: Trent McDonald [email protected]
Coauthors: Ryan Nielson, Jason Carlisle, and Aidan McDonald
Contributors: Ben Augustine, James Griswald, Joel Reynolds, Pham Quang, Earl Becker, Aaron Christ, Brook Russelland, Patrick McKann, Lacey Jeroue, Abigail Hoffman, Michael Kleinsasser, and Ried Olson
Useful links:
Report bugs at https://github.com/tmcd82070/Rdistance/issues
Helper functions for assigning physical measurement units in Rdistance.
All are convenience wrappers for units::units::set_units().
x %#% u x %acre% . x %cm% . x %ft% . x %ha% . x %inches% . x %km% . x %km^2% . x %m% . x %m^2% . x %mi% . x %mi^2% . x %yd% . dropUnits(x) setUnits(x, u)x %#% u x %acre% . x %cm% . x %ft% . x %ha% . x %inches% . x %km% . x %km^2% . x %m% . x %m^2% . x %mi% . x %mi^2% . x %yd% . dropUnits(x) setUnits(x, u)
x |
A numeric vector or matrix. |
u |
A string representing physical measurement units to
assign to |
. |
Placeholder for the fixed unit assignment operators. Ignored. See Details. |
The fixed unit assignment operators are designed to behave like
unary operators (i.e., 1 argument);
but, R does not allow
user defined unary operators.
Technically, these fixed unit assignment operators are instances of
R's user-defined infix
operator, and as such they require two arguments.
Their syntax must be
x %<units>% <something>; but, the second argument is ignored
and '.' is suggested. See Examples.
For %#% and setUnits, argument x with
units u attached.
For all the fixed unit assignment operators (i.e., %units%), argument x with the respective units assigned.
For dropUnits, argument x with no units. If
input x has no units, x is returned unchanged.
2 %#% "m" setUnits(2,"km") x <- 2 %#% "km^2" 10 %#% units(x) 2 %#% "km^2" %#% "acres" # Convert km^2 to acres x %#% "acres" # Same x %#% NULL # Drop units dropUnits(x) # Same # %#%'s precedence is below "^" but above "+" and "*" # The following fails: # 2 %#% "m" / (2 %#% "ha") %#% "in/acre" # The following succeeds: (2 %#% "m" / (2 %#% "ha")) %#% "in/acre" 1 %m%. ^ 2 # [m] (1 %m%.) ^ 2 # [m^2] # For fixed unit assignment, 2nd argument does not matter # All of the following are equivalent 2 %m%. 2 %m% x 2 %m% 3 2 %m% NULL 2 %m% NA # Conversion: x <- 10 %#% "ft" x %m%.2 %#% "m" setUnits(2,"km") x <- 2 %#% "km^2" 10 %#% units(x) 2 %#% "km^2" %#% "acres" # Convert km^2 to acres x %#% "acres" # Same x %#% NULL # Drop units dropUnits(x) # Same # %#%'s precedence is below "^" but above "+" and "*" # The following fails: # 2 %#% "m" / (2 %#% "ha") %#% "in/acre" # The following succeeds: (2 %#% "m" / (2 %#% "ha")) %#% "in/acre" 1 %m%. ^ 2 # [m] (1 %m%.) ^ 2 # [m^2] # For fixed unit assignment, 2nd argument does not matter # All of the following are equivalent 2 %m%. 2 %m% x 2 %m% 3 2 %m% NULL 2 %m% NA # Conversion: x <- 10 %#% "ft" x %m%.
Estimate abundance (or density) from an estimated detection function and supplemental information on observed group sizes, transect lengths, area surveyed, etc. Computes confidence intervals on abundance (or density) using a the bias corrected bootstrap method.
abundEstim( object, area = NULL, propUnitSurveyed = 1, ci = 0.95, R = 500, plot.bs = FALSE, showProgress = TRUE, parallel = TRUE )abundEstim( object, area = NULL, propUnitSurveyed = 1, ci = 0.95, R = 500, plot.bs = FALSE, showProgress = TRUE, parallel = TRUE )
object |
A fitted Rdistance distance function (class |
area |
A scalar containing the total area of inference. Usually, this is
study area size. If |
propUnitSurveyed |
A scalar or vector of real numbers between 0 and 1.
The proportion of the default sampling unit that
was surveyed. If both sides of line transects were observed,
|
ci |
A scalar indicating the confidence level of confidence intervals.
Confidence intervals are computed using a bias corrected bootstrap
method. If |
R |
The number of bootstrap iterations to conduct when |
plot.bs |
A logical scalar indicating whether to plot individual
bootstrap iterations.
Ignored unless |
showProgress |
A logical indicating whether to show a text-based
progress bar during bootstrapping. Default is |
parallel |
A logical scalar, or a positive integer; ignored unless
confidence intervals are requested (i.e., |
The abundance estimate for line-transect surveys (if no covariates are included in the detection function and both sides of the transect are observed) is
where n is total number of sighted individuals
(i.e., sum(groupSizes(dfunc))), L is the total length of
surveyed transect (i.e., sum(effort(dfunc))),
and ESW is effective strip width
computed from the estimated distance function (i.e., ESW(dfunc)).
If only one side of transects were observed, the "2" in the denominator
is not present (or, replaced with a "1").
The abundance estimate for point transect surveys (if no covariates are included) is
where n is total number of sighted individuals (i.e., sum(groupSizes(dfunc))),
P is the total number of surveyed points (i.e., sum(effort(dfunc))),
and ESR is effective search radius
computed from the estimated distance function (i.e., ESR(dfunc)).
This routine, abundEstim, estimates abundance on the
entire study area. Site-specific density estimates
are computed by
predict(x, type = "density"), which returns a
tibble containing density and abundance on the area surveyed by every
transect.
An Rdistance 'abundance estimate' object, which is a list of
class c("abund", "dfunc"), containing all the components of a "dfunc"
object (see dfuncEstim()), plus the following:
estimates |
A tibble containing fitted coefficients in the distance function, density in the area(s) surveyed, abundance on the study area, the number of groups seen between w.lo and w.hi, the number of individuals seen between w.lo and w.hi, study area size, surveyed area, average group size, and average effective detection distance. |
B |
If confidence intervals were requested, a tibble
containing all bootstrap values of coefficients,
density, abundance, groups seen, individuals seen,
study area size, surveyed area size, average group size,
and average effective detection distance. The number of rows is always
|
ci |
Confidence level of the confidence intervals |
Rdistance's nested data frames (produced by RdistDf())
contain all information required to estimate bootstrap CIs.
To compute bootstrap CIs, Rdistance resamples, with replacement,
the rows of the $data component contained in Rdistance
fitted models. Rdistance assumes each row of $data
contains information on one transect.
The $data component also contains
information on which observations inform the
detection function, which observations should be counted as
detected targets,
and which transects count toward transect length.
After resampling rows of $data, Rdistance
refits the distance function using non-missing distances,
recomputes the detected number of targets using non-missing
group sizes on transects with non-missing length,
and re-computes total transect length from transects
with non-missing lengths.
By default, R = 500 bootstrap iterations are
performed, after which bias
corrected confidence intervals are computed (Manly, 1997, section 3.4).
The distance function is not re-selected during bootstrap resampling. The model of the input object is re-fitted every iteration.
During bootstrap iterations, the distance function can fail.
An iteration can fail for a two reasons:
(1) no detections on the iteration, and (2) a bad configuration
of distances that push the distance function's parameters to their
limits. When an iteration fails, Rdistance
skips the iteration and effectively ignores the
failed iterations.
If the proportion of failed iterations is small
(less than 20% by default), the resulting abundance confidence interval
is probably valid and no warning is issued. If the proportion of
non-convergent iterations
is not small (exceeds 20% by default), a warning is issued.
The warning can be modified
by re-setting option "Rdistance_maxBSFailPropForWarning" to
the acceptable proportion of failures..
Setting options(Rdistance_masBSFailPropForWarning = 1.0) will turn
suppress the warning.
Setting options(Rdistance_masBSFailPropForWarning = 0.0) will
warn if any iteration failed. Results (density and effective
sampling distance)
from all successful iterations are contained in the
non-NA rows of data frame 'B' in the output object.
Transect lengths can be missing in the RdistDf object. Missing length transects are equivalent to 0 [m] transects and do not count toward total surveyed units, nor do group sizes on these transects count toward total detected individuals. Use NA-length transects to include their associated distances during distance function estimation, but not when estimating abundance. For example, this allows estimation of abundance on one study area using off-transect distances from another. This allows sightability to be estimated using two or more similar targets (e.g., two similar species), but abundance to be estimated separate for each target type. Include NA-length transects by including the "extra" distance observations in the detection data frame, with valid site IDs, but set the length of those site IDs to NA in the site data frame.
Point transects do not have a physical measurement for length. The "length" of point transects is the number of points on the transect. Point transects can contain only one point. Rdistance treats transects of points as independent and bootstrap resamples them to estimate variance. The number of points on each point transect must exist in the RdistDf and cannot have physical measurement units (it is a count, not a distance).
Manly, B.F.J. (1997) Randomization, bootstrap, and Monte-Carlo methods in biology, London: Chapman and Hall.
Buckland, S.T., D.R. Anderson, K.P. Burnham, J.L. Laake, D.L. Borchers, and L. Thomas. (2001) Introduction to distance sampling: estimating abundance of biological populations. Oxford University Press, Oxford, UK.
dfuncEstim(), autoDistSamp(),
predict.dfunc() with 'type = "density"'.
# Load example sparrow data (line transect survey type) # sparrowDf <- RdistDf(sparrowSiteData, sparrowDetectionData) data(sparrowDf) # Fit half-normal detection function dfunc <- sparrowDf |> dfuncEstim(formula=dist ~ groupsize(groupsize) , likelihood="halfnorm" , w.hi=150 %m%. ) # Estimate abundance - Convenient for programming abundDf <- estimateN(dfunc , area = 4105 %km^2%. ) # Same - Nicer output # Set ci=0.95 (or another value) to estimate bootstrap CI's fit <- abundEstim(dfunc , area = 4105 %km^2%. , ci = NULL ) summary(fit)# Load example sparrow data (line transect survey type) # sparrowDf <- RdistDf(sparrowSiteData, sparrowDetectionData) data(sparrowDf) # Fit half-normal detection function dfunc <- sparrowDf |> dfuncEstim(formula=dist ~ groupsize(groupsize) , likelihood="halfnorm" , w.hi=150 %m%. ) # Estimate abundance - Convenient for programming abundDf <- estimateN(dfunc , area = 4105 %km^2%. ) # Same - Nicer output # Set ci=0.95 (or another value) to estimate bootstrap CI's fit <- abundEstim(dfunc , area = 4105 %km^2%. , ci = NULL ) summary(fit)
Computes AICc, AIC, or BIC for estimated distance functions.
## S3 method for class 'dfunc' AIC(object, ..., criterion = "AICc")## S3 method for class 'dfunc' AIC(object, ..., criterion = "AICc")
object |
An Rdistance model frame or fitted distance function,
normally produced by a call to |
... |
Included for compatibility with generic |
criterion |
String specifying the criterion to compute. Either "AICc", "AIC", or "BIC". |
Regular Akaike's information criterion
(https://en.wikipedia.org/wiki/Akaike_information_criterion) () is
where is the maximized value of the log likelihood
(the minimized value of the negative log
likelihood) and is the
number of coefficients estimated in the detection function. For
dfunc objects, = obj$loglik + 2*length(coef(obj)).
A correction
for small sample size, , is
where is sample
size or number of detected groups for distance analyses. By default, this function
computes . converges quickly to
as increases.
The Bayesian Information Criterion (BIC) is
.
A scalar, the requested fit statistic for object.
Burnham, K. P., and D. R. Anderson, 2002. Model selection and multi-model inference: A practical information-theoretic approach, Second ed. Springer-Verlag. ISBN 0-387-95364-7.
McQuarrie, A. D. R., and Tsai, C.-L., 1998. Regression and time series model selection. World Scientific. ISBN 981023242X
data(sparrowDf) dfunc <- sparrowDf |> dfuncEstim(dist~1) # Fit statistics AIC(dfunc) # AICc AIC(dfunc, criterion="AIC") # AIC AIC(dfunc, criterion="BIC") # BICdata(sparrowDf) dfunc <- sparrowDf |> dfuncEstim(dist~1) # Fit statistics AIC(dfunc) # AICc AIC(dfunc, criterion="AIC") # AIC AIC(dfunc, criterion="BIC") # BIC
Perform automated likelihood, expansion, and series selection for a classic distance sampling analysis. Estimate abundance using the best fitting likelihood, expansion, and series.
autoDistSamp( data, formula, likelihoods = c("halfnorm", "hazrate", "negexp"), w.lo = setUnits(0, "m"), w.hi = NULL, expansions = 0:3, series = c("cosine"), x.scl = w.lo, g.x.scl = 1, warn = TRUE, outputUnits = NULL, area = NULL, propUnitSurveyed = 1, ci = 0.95, R = 500, plot.bs = FALSE, showProgress = TRUE, parallel = TRUE, plot = TRUE, criterion = "AICc" )autoDistSamp( data, formula, likelihoods = c("halfnorm", "hazrate", "negexp"), w.lo = setUnits(0, "m"), w.hi = NULL, expansions = 0:3, series = c("cosine"), x.scl = w.lo, g.x.scl = 1, warn = TRUE, outputUnits = NULL, area = NULL, propUnitSurveyed = 1, ci = 0.95, R = 500, plot.bs = FALSE, showProgress = TRUE, parallel = TRUE, plot = TRUE, criterion = "AICc" )
data |
An |
formula |
A standard formula object. For example, |
likelihoods |
String vector specifying the likelihoods to fit.
See 'likelihood' parameter of |
w.lo |
Lower or left-truncation limit of the distances in distance data.
This is the minimum possible off-transect distance. Default is 0. If
|
w.hi |
Upper or right-truncation limit of the distances
in |
expansions |
A scalar specifying the number of terms
in |
series |
If |
x.scl |
The x coordinate (a distance) at which the
detection function will be scaled. |
g.x.scl |
Height of the distance function at coordinate |
warn |
A logical scalar specifying whether to issue
an R warning if the estimation did not converge or if one
or more parameter estimates are at their boundaries.
For estimation, |
outputUnits |
A string specifying the symbolic measurement
units for results. Valid units are listed in |
area |
A scalar containing the total area of inference. Usually, this is
study area size. If |
propUnitSurveyed |
A scalar or vector of real numbers between 0 and 1.
The proportion of the default sampling unit that
was surveyed. If both sides of line transects were observed,
|
ci |
A scalar indicating the confidence level of confidence intervals.
Confidence intervals are computed using a bias corrected bootstrap
method. If |
R |
The number of bootstrap iterations to conduct when |
plot.bs |
A logical scalar indicating whether to plot individual
bootstrap iterations.
Ignored unless |
showProgress |
A logical indicating whether to show a text-based
progress bar during bootstrapping. Default is |
parallel |
A logical scalar, or a positive integer; ignored unless
confidence intervals are requested (i.e., |
plot |
Logical scalar specifying whether to plot models during model selection.
If |
criterion |
A string specifying the criterion to use when assessing model fit.
The best fitting model, as defined by this routine, has the lowest value
of this criterion. This must be one of "AICc" (the default),
"AIC", or "BIC". See |
During distance function selection, all combinations of likelihoods, series, and
number of expansions is fitted. For example, if likelihoods has 3 elements,
series has 2 elements, and expansions has 4 elements,
this routine fits a total of 3 (likelihoods) * 2 (series) * 4 (expansions)
= 24 models. Default parameters fit 9 detection functions, i.e.,
all combinations of "halfnorm", "hazrate", and "negexp" likelihoods
and 0 through 3 expansions. Other combinations are specified through
values of likelihoods, series, and expansions.
Suppress all intermediate output using plot.bs=FALSE,
showProgress=FALSE, and plot=FALSE.
The returned abundance estimate object contains
an additional component, the fitting table (a list of models fitted and
criterion values) in component $fitTable.
An Rdistance 'abundance estimate' object, which is a list of
class c("abund", "dfunc"), containing all the components of a "dfunc"
object (see dfuncEstim()), plus the following:
estimates |
A tibble containing fitted coefficients in the distance function, density in the area(s) surveyed, abundance on the study area, the number of groups seen between w.lo and w.hi, the number of individuals seen between w.lo and w.hi, study area size, surveyed area, average group size, and average effective detection distance. |
B |
If confidence intervals were requested, a tibble
containing all bootstrap values of coefficients,
density, abundance, groups seen, individuals seen,
study area size, surveyed area size, average group size,
and average effective detection distance. The number of rows is always
|
ci |
Confidence level of the confidence intervals |
# Load example sparrow data (line transect survey type) data(sparrowDf) autoDistSamp(data = sparrowDf , formula = dist ~ groupsize(groupsize) , likelihoods = c("halfnorm","negexp") , expansions = 0 , plot = FALSE , ci = NULL , area = 1 %ha%. ) ## Not run: autoDistSamp(data = sparrowDf , formula = dist ~ 1 + groupsize(groupsize) , ci = 0.95 , area = 1 %ha%. ) ## End(Not run)# Load example sparrow data (line transect survey type) data(sparrowDf) autoDistSamp(data = sparrowDf , formula = dist ~ groupsize(groupsize) , likelihoods = c("halfnorm","negexp") , expansions = 0 , plot = FALSE , ci = NULL , area = 1 %ha%. ) ## Not run: autoDistSamp(data = sparrowDf , formula = dist ~ 1 + groupsize(groupsize) , ci = 0.95 , area = 1 %ha%. ) ## End(Not run)
Calculate bias-corrected confidence intervals for bootstrap data using methods in Manly textbook.
bcCI(x.bs, x, ci = 0.95)bcCI(x.bs, x, ci = 0.95)
x.bs |
A vector of bootstrap estimates of some quantity. |
x |
A scalar of the original estimate of the quantity. |
ci |
A scalar of the desired confidence interval coverage. |
A named vector containing the lower and upper endpoints of the bias-corrected bootstrap confidence interval.
Performs bootstrap resampling iterations, either in parallel across CPU cores or in serial on a single core.
bootstrap( object, area, propUnitSurveyed, R, plot.bs, plotCovValues, showProgress, parallel, cores )bootstrap( object, area, propUnitSurveyed, R, plot.bs, plotCovValues, showProgress, parallel, cores )
object |
A fitted Rdistance distance function (class |
area |
A scalar containing the total area of inference. Usually, this is
study area size. If |
propUnitSurveyed |
A scalar or vector of real numbers between 0 and 1.
The proportion of the default sampling unit that
was surveyed. If both sides of line transects were observed,
|
R |
The number of bootstrap iterations to conduct when |
plot.bs |
A logical scalar indicating whether to plot individual
bootstrap iterations.
Ignored unless |
plotCovValues |
Data frame containing values of covariates to plot.
Ignored if |
showProgress |
A logical indicating whether to show a text-based
progress bar during bootstrapping. Default is |
parallel |
Logical scalar. TRUE if we are running iterations
in parallel across CPU cores. Number of cores specified in |
cores |
Integer scalar. The number of CPU cores to use during
parallel operations, if requested. Ignored if |
A data frame containing density, abundance,
and other relevant statistics for every bootstrap iteration.
Number of rows is R. If the model from one iteration failed
for any reason (e.g., non-convergence), the entire row except the ID column
is missing.
Computes spline, specifically b-spline, expansion terms that modify the shape of distance likelihood functions.
bspline.expansion(x, expansions)bspline.expansion(x, expansions)
x |
A numeric matrix of distances at which to evaluate
the expansion series. For distance analysis, |
expansions |
A scalar specifying the number of expansion terms to compute. Must be one of the integers 1, 2, 3, 4, or 5. |
A 3D array of size nrow(x) X ncol(x) X expansions.
The 'pages' (3rd dimension) of this array are the cosine expansions of
x. i.e., page 1 is the first expansion term of x,
page 2 is the second expansion term of x, etc.
dfuncEstim(),
cosine.expansion(),
hermite.expansion(),
simple.expansion().
x <- matrix(seq(0, 1, length = 200), ncol = 1) spl.expn <- bspline.expansion(x, 5) plot(range(x), range(spl.expn), type="n") matlines(x, spl.expn[,1,1:5], col=rainbow(5), lty = 1)x <- matrix(seq(0, 1, length = 200), ncol = 1) spl.expn <- bspline.expansion(x, 5) plot(range(x), range(spl.expn), type="n") matlines(x, spl.expn[,1,1:5], col=rainbow(5), lty = 1)
Check that number of integration intervals is odd and sufficiently large. Plus, compute and store the Simpson's Composite rule coefficients.
checkNEvalPts(nEvalPts)checkNEvalPts(nEvalPts)
nEvalPts |
An integer to check. |
The first element of nEvalPts is returned if it is acceptable. If nEvalPts is not acceptable, an error is thrown. Simpson's composite coefficients are store in options()
Check for the presence of physical measurement units on key
columns of
an RdistDf data frame.
checkUnits(ml)checkUnits(ml)
ml |
An |
The input ml list, with units of various
quantities converted to common units. If a check fails, for example,
a quantity does not have units, an error is thrown.
Extract distance model coefficients from an estimated detection function object.
## S3 method for class 'dfunc' coef(object, ...)## S3 method for class 'dfunc' coef(object, ...)
object |
An Rdistance model frame or fitted distance function,
normally produced by a call to |
... |
Ignored |
The estimated coefficient vector for the detection function. Length and interpretation of values vary depending on the form of the detection function and expansion terms.
data(sparrowDfuncObserver) # pre-estimated dfunc # Same as sparrowDfuncObserver$par coef(sparrowDfuncObserver) ## Not run: data(sparrowDf) dfunc <- sparrowDf |> dfuncEstim(dist~bare + observer, w.hi = 150 %m%.) coef(dfunc) ## End(Not run)data(sparrowDfuncObserver) # pre-estimated dfunc # Same as sparrowDfuncObserver$par coef(sparrowDfuncObserver) ## Not run: data(sparrowDf) dfunc <- sparrowDf |> dfuncEstim(dist~bare + observer, w.hi = 150 %m%.) coef(dfunc) ## End(Not run)
Add ANSI color to a string using the
crayon package, if the R environment accepts color.
This function is needed because some
output cannot be colorized. Color determination
is made by crayon::has_color().
Rdistance results often include units, e.g., "25 [m]".
Only colorize the numbers, not the units. Everything between the
first set of square brackets ([...]) is NOT colorized.
Subsequent sets of brackets (e.g., "25 [m] + 30 [ft]") ARE colorized
(i.e., "[ft]" is color).
colorize(STR, col = NULL, bg = NULL)colorize(STR, col = NULL, bg = NULL)
STR |
A vector of strings to colorize. |
col |
A string specifying the desired foreground color.
This is passed straight to |
bg |
A string specifying the desired background color. Must be one of "bgBlack", "bgRed", "bgGreen", "bgYellow", "bgBlue" "bgMagenta", "bgCyan", or "bgWhite". By default, no background is applied. |
If color is not allowed in the terminal, the input
string is returned unperturbed. If color is allowed, the input
string is returned with color and background ANSI code surrounding
the initial part of the string from character 1 to the character
before the [ in the first pair of [].
crayon::style
Computes the cosine expansion terms that modify the shape of distance likelihood functions.
cosine.expansion(x, expansions)cosine.expansion(x, expansions)
x |
A numeric matrix of distances at which to evaluate
the expansion series. For distance analysis, |
expansions |
A scalar specifying the number of expansion terms to compute. Must be one of the integers 1, 2, 3, 4, or 5. |
The cosine expansion used here is:
First term:
Second term:
Third term:
Fourth term:
Fifth term:
The maximum number of expansion terms is 5.
The cosine expansion frequency in Rdistance is 2*pi. Each term is two pi more than the previous. The sine expansion frequency in Rdistance is pi. Consequently, the sine and cosine expansions fit different models.
A 3D array of size nrow(x) X ncol(x) X expansions.
The 'pages' (3rd dimension) of this array are the cosine expansions of
x. i.e., page 1 is the first expansion term of x,
page 2 is the second expansion term of x, etc.
dfuncEstim(),
hermite.expansion(),
simple.expansion()
x <- matrix(seq(0, 1, length = 200), ncol = 1) cos.expn <- cosine.expansion(x, 5) plot(range(x), range(cos.expn), type="n") matlines(x, cos.expn[,1,1:5], col=rainbow(5), lty = 1)x <- matrix(seq(0, 1, length = 200), ncol = 1) cos.expn <- cosine.expansion(x, 5) plot(range(x), range(cos.expn), type="n") matlines(x, cos.expn[,1,1:5], col=rainbow(5), lty = 1)
Fits a detection function to off-transect distances collected by multiple observers.
dE.multi( data, formula, likelihood = "halfnorm", w.lo = setUnits(0, "m"), w.hi = NULL, expansions = 0, series = "cosine", x.scl = setUnits(0, "m"), g.x.scl = 1, warn = TRUE, outputUnits = NULL )dE.multi( data, formula, likelihood = "halfnorm", w.lo = setUnits(0, "m"), w.hi = NULL, expansions = 0, series = "cosine", x.scl = setUnits(0, "m"), g.x.scl = 1, warn = TRUE, outputUnits = NULL )
data |
An |
formula |
A standard formula object. For example, |
likelihood |
String specifying the likelihood to fit. Built-in likelihoods at present are "halfnorm", "hazrate", and "negexp". |
w.lo |
Lower or left-truncation limit of the distances in distance data.
This is the minimum possible off-transect distance. Default is 0. If
|
w.hi |
Upper or right-truncation limit of the distances
in |
expansions |
A scalar specifying the number of terms
in |
series |
If |
x.scl |
The x coordinate (a distance) at which the
detection function will be scaled. |
g.x.scl |
Height of the distance function at coordinate |
warn |
A logical scalar specifying whether to issue
an R warning if the estimation did not converge or if one
or more parameter estimates are at their boundaries.
For estimation, |
outputUnits |
A string specifying the symbolic measurement
units for results. Valid units are listed in |
An object of class 'dfunc' with the following components:
par: The vector of estimated parameter values.
Length of this vector is the sum of the following:
The number of columns of the design matrix. This equals the number of covariates in the distance function plus one for the intercept, assuming an intercept is included.
The number of constant parameters in the distance function. Constant parameters are those not related to covariates. For example, the exponent 'k' parameter for hazard rate likelihood, or the mixing fraction 'p' for the oneStep likelihood. This can be zero.
The number of expansion functions called for. This equals
the input expansions.
loglik: The maximized value of the log likelihood.
convergence: The convergence code. This code
is returned by the optimizing routine (e.g., optim or nlminb).
Values other than 0 indicate suspect convergence.
message: If maximization did not converge (convergence != 0),
this is the reason given by the optimizing routine.
varcovar: The variance-covariance matrix for coefficients
of the distance function, either estimated by the inverse of
the fit's Hessian or by bootstrapping.
If the likelihood is smooth (i.e., those listed by
Rdistance:::differentiableLikelihoods()),
Rdistance initially estimates the variance-covariance matrix using the
second derivative of the log likelihood surface
at the final estimates, where second derivatives are estimated by
numeric differentiation (in routine secondDeriv().
The variance-covariance matrix is re-set to NULL
if the Hessian is not positive-definite. If bootstrap resampling
has been performed (using abundEstim()), the variance-covariance
matrix is re-estimated using the bootstrap values of parameters
and automatically reset.
Error estimates derived from bootstrapping are generally
preferable to the asymptotic estimates, hence the automatic
re-set.
limits: A list containing the lower and upper limits of parameters.
evaluations: The number of likelihood evaluations performed by the
optimizer.
mf: An R 'model frame' containing the detections (within the strip
or circle) used in the fit, covariates specified in the formula,
and groupsizes. Column 'dist' contains the
observed distances. The intercept, if included in the model, is not
included as a column in this model frame. (Test whether an intercept
is included using attr(terms(return$mf), "intercept")).
Column offset(...) contains group sizes associated with
the values of dist. Name of the group size column is "offset(...)",
not "groupsize(...)", so that group sizes can be treated offsets in
other R routines. The mf component is a proper model.frame and contains
both terms and contrasts attributes. This model frame
contains only non-missing distances between w.lo and w.hi.
data: The original nested data frame subset to information required
to complete distance estimation. This data frame contains information
on replication (i.e., rows are sites and are re-sampled during bootstrapping),
missing distances, missing transect lengths, and distances outside the observation
strip (below w.lo or above w.hi).
formula: The distance function's formula.
dataName: Name of the original nested data frame.
likelihood: The name of the likelihood fitted to observation
distances.
w.lo: Left-truncation value used during the fit.
w.hi: Right-truncation value used during the fit.
expansions: The number of expansion terms.
series: The type of expansion used during estimation. This is
only relevant if expansions > 0.
x.scl: The distance at which the function has been scaled to some value.
This is the x at which g(x) = g.x.scl.
g.x.scl: The height of the distance function at distance x.scl.
outputUnits: A list of type symbolic_units containing the
physical measurement units used during estimation.
asymptoticSE: A logical scalar indication whether the
variance-covariance matrix in component varcovar is
asymptotic (TRUE; estimated from the Hessian) or bootstrap (FALSE;
estimated by bootstrap resampling).
optimizer: The optimizing routine used.
call: The original function call.
nCovars: The number of exogenous covariates fitted in the
distance function. Does not include the intercept.
LhoodType: The type of likelihood fitted. Currently, only 'parametric'
types are fitted.
Fits a detection function to off-transect distances collected by a single observer.
dE.single( data, formula, likelihood = "halfnorm", w.lo = setUnits(0, "m"), w.hi = NULL, expansions = 0, series = "cosine", x.scl = w.lo, g.x.scl = 1, warn = TRUE, outputUnits = NULL, asymptoticSE = TRUE )dE.single( data, formula, likelihood = "halfnorm", w.lo = setUnits(0, "m"), w.hi = NULL, expansions = 0, series = "cosine", x.scl = w.lo, g.x.scl = 1, warn = TRUE, outputUnits = NULL, asymptoticSE = TRUE )
data |
An |
formula |
A standard formula object. For example, |
likelihood |
String specifying the likelihood to fit. Built-in likelihoods at present are "halfnorm", "hazrate", and "negexp". |
w.lo |
Lower or left-truncation limit of the distances in distance data.
This is the minimum possible off-transect distance. Default is 0. If
|
w.hi |
Upper or right-truncation limit of the distances
in |
expansions |
A scalar specifying the number of terms
in |
series |
If |
x.scl |
The x coordinate (a distance) at which the
detection function will be scaled. |
g.x.scl |
Height of the distance function at coordinate |
warn |
A logical scalar specifying whether to issue
an R warning if the estimation did not converge or if one
or more parameter estimates are at their boundaries.
For estimation, |
outputUnits |
A string specifying the symbolic measurement
units for results. Valid units are listed in |
asymptoticSE |
Logical variable for whether to calculate
asymptotic standard errors. The default (TRUE) estimates an
asymptotic variance-covariance matrix for parameters based on the
likelihood's Hessian (2nd derivative). If maximization
has been performed by Nlminb or HookesJeeves, the asymptotic
Hessian is estimated using numeric second derivatives
of the likelihood at the maximum likelihood solution. If
maximization was performed by Optim, the last Hessian of
the maximization is returned
by Optim and used
(see |
Optimization and estimation controls can be modified using options().
See RdistanceControls().
An object of class 'dfunc' with the following components:
par: The vector of estimated parameter values.
Length of this vector is the sum of the following:
The number of columns of the design matrix. This equals the number of covariates in the distance function plus one for the intercept, assuming an intercept is included.
The number of constant parameters in the distance function. Constant parameters are those not related to covariates. For example, the exponent 'k' parameter for hazard rate likelihood, or the mixing fraction 'p' for the oneStep likelihood. This can be zero.
The number of expansion functions called for. This equals
the input expansions.
loglik: The maximized value of the log likelihood.
convergence: The convergence code. This code
is returned by the optimizing routine (e.g., optim or nlminb).
Values other than 0 indicate suspect convergence.
message: If maximization did not converge (convergence != 0),
this is the reason given by the optimizing routine.
varcovar: The variance-covariance matrix for coefficients
of the distance function, either estimated by the inverse of
the fit's Hessian or by bootstrapping.
If the likelihood is smooth (i.e., those listed by
Rdistance:::differentiableLikelihoods()),
Rdistance initially estimates the variance-covariance matrix using the
second derivative of the log likelihood surface
at the final estimates, where second derivatives are estimated by
numeric differentiation (in routine secondDeriv().
The variance-covariance matrix is re-set to NULL
if the Hessian is not positive-definite. If bootstrap resampling
has been performed (using abundEstim()), the variance-covariance
matrix is re-estimated using the bootstrap values of parameters
and automatically reset.
Error estimates derived from bootstrapping are generally
preferable to the asymptotic estimates, hence the automatic
re-set.
limits: A list containing the lower and upper limits of parameters.
evaluations: The number of likelihood evaluations performed by the
optimizer.
mf: An R 'model frame' containing the detections (within the strip
or circle) used in the fit, covariates specified in the formula,
and groupsizes. Column 'dist' contains the
observed distances. The intercept, if included in the model, is not
included as a column in this model frame. (Test whether an intercept
is included using attr(terms(return$mf), "intercept")).
Column offset(...) contains group sizes associated with
the values of dist. Name of the group size column is "offset(...)",
not "groupsize(...)", so that group sizes can be treated offsets in
other R routines. The mf component is a proper model.frame and contains
both terms and contrasts attributes. This model frame
contains only non-missing distances between w.lo and w.hi.
data: The original nested data frame subset to information required
to complete distance estimation. This data frame contains information
on replication (i.e., rows are sites and are re-sampled during bootstrapping),
missing distances, missing transect lengths, and distances outside the observation
strip (below w.lo or above w.hi).
formula: The distance function's formula.
dataName: Name of the original nested data frame.
likelihood: The name of the likelihood fitted to observation
distances.
w.lo: Left-truncation value used during the fit.
w.hi: Right-truncation value used during the fit.
expansions: The number of expansion terms.
series: The type of expansion used during estimation. This is
only relevant if expansions > 0.
x.scl: The distance at which the function has been scaled to some value.
This is the x at which g(x) = g.x.scl.
g.x.scl: The height of the distance function at distance x.scl.
outputUnits: A list of type symbolic_units containing the
physical measurement units used during estimation.
asymptoticSE: A logical scalar indication whether the
variance-covariance matrix in component varcovar is
asymptotic (TRUE; estimated from the Hessian) or bootstrap (FALSE;
estimated by bootstrap resampling).
optimizer: The optimizing routine used.
call: The original function call.
nCovars: The number of exogenous covariates fitted in the
distance function. Does not include the intercept.
LhoodType: The type of likelihood fitted. Currently, only 'parametric'
types are fitted.
To specify non-unity group sizes, use groupsize()
on the RHS of formula. When group sizes are not all 1, they must appear in a column
of the 'detections' list-column of data.
For example, d ~ habitat + groupsize(number) specifies
distances in column d, one covariate
named habitat, and that column number
contains the number of individuals
associated with each detection. If group sizes are not specified,
all group sizes are assumed to be 1.
Factor contrasts in Rdistance are specified
the same way as in lm or glm.
By default, Rdistance uses
contrasts in getOption("contrasts"). To change contrasts, use a statement
like options(contrasts = c(unordered = "contr.SAS", ordered = "contr.poly")). Or, to set contrasts for a
specific factor in the input data frame, use
contrasts(df$A) <- "contr.sum" or similar.
See contrasts() or the contrasts.arg
of model.matrix().
Rdistance accommodates two kinds of transects: continuous and point.
Detections can occur at any point on continuous transects.
Rdistance calls these 'line-transects' even though routes are not
necessarily a straight line.
On point transects, detections occur at a series of stops
(points). Rdisance calls these point-transects. Transects are the basic
sampling unit in both cases. Rdistance assumes each row of data
contains information from one transect. See RdistDf() for
more details.
As of Rdistance version 3.0.0, measurement units are
require on all physical distances.
Requiring units ensures that internal calculations and results
(e.g., ESW and abundance) are correct
and that output units are clear.
Physical distances are required on
off-transect distances, radial distances, truncation distances
(w.lo, unless it is zero; and w.hi, unless it is NULL),
scale locations (x.scl, unless it is zero),
line-transect lengths, and study area size. All units are
1-dimensional except those on study area, which are 2-dimensional.
Physical measurement units can vary. For example,
off-transect distances can be meters ("m"), w.hi can be inches ("in"),
and w.lo can be kilometers ("km"). Internally, all distances are
converted to the units specified by outputUnits
(or the units of input distances if
outputUnits is NULL), and
all output is reported
in units of outputUnits. Valid conversions must exist between
units or an error is thrown (e.g., meters cannot convert
into hectares).
Measurement units can be assigned using one of Rdistance's
unit helper routines (see help(unitHelpers)), Rdistance's
setUnits() function, or units::set_units()
See units::valid_udunits()
for a list of valid symbolic units.
If measurements are truly unit-less, or measurement units are unknown,
set options(Rdist_requireUnits = FALSE). This suppresses
all unit checks and conversions. Users are on their own here
and must make sure all inputs are scaled correctly so that internal
computations are correct and output units are known.
Buckland, S.T., D.R. Anderson, K.P. Burnham, J.L. Laake, D.L. Borchers, and L. Thomas. (2001) Introduction to distance sampling: estimating abundance of biological populations. Oxford University Press, Oxford, UK.
abundEstim(), autoDistSamp().
Likelihood-specific help files (e.g., halfnorm.like()).
# Load example sparrow data (line transect survey type) data(sparrowDf) dfunc <- dfuncEstim(data = sparrowDf , formula = dist ~ 1) dfunc plot(dfunc)# Load example sparrow data (line transect survey type) data(sparrowDf) dfunc <- dfuncEstim(data = sparrowDf , formula = dist ~ 1) dfunc plot(dfunc)
Fits a detection function using maximum likelihood.
dfuncEstim(data, ...)dfuncEstim(data, ...)
data |
An |
... |
Arguments passed on to
|
Optimization and estimation controls can be modified using options().
See RdistanceControls().
An object of class 'dfunc' with the following components:
par: The vector of estimated parameter values.
Length of this vector is the sum of the following:
The number of columns of the design matrix. This equals the number of covariates in the distance function plus one for the intercept, assuming an intercept is included.
The number of constant parameters in the distance function. Constant parameters are those not related to covariates. For example, the exponent 'k' parameter for hazard rate likelihood, or the mixing fraction 'p' for the oneStep likelihood. This can be zero.
The number of expansion functions called for. This equals
the input expansions.
loglik: The maximized value of the log likelihood.
convergence: The convergence code. This code
is returned by the optimizing routine (e.g., optim or nlminb).
Values other than 0 indicate suspect convergence.
message: If maximization did not converge (convergence != 0),
this is the reason given by the optimizing routine.
varcovar: The variance-covariance matrix for coefficients
of the distance function, either estimated by the inverse of
the fit's Hessian or by bootstrapping.
If the likelihood is smooth (i.e., those listed by
Rdistance:::differentiableLikelihoods()),
Rdistance initially estimates the variance-covariance matrix using the
second derivative of the log likelihood surface
at the final estimates, where second derivatives are estimated by
numeric differentiation (in routine secondDeriv().
The variance-covariance matrix is re-set to NULL
if the Hessian is not positive-definite. If bootstrap resampling
has been performed (using abundEstim()), the variance-covariance
matrix is re-estimated using the bootstrap values of parameters
and automatically reset.
Error estimates derived from bootstrapping are generally
preferable to the asymptotic estimates, hence the automatic
re-set.
limits: A list containing the lower and upper limits of parameters.
evaluations: The number of likelihood evaluations performed by the
optimizer.
mf: An R 'model frame' containing the detections (within the strip
or circle) used in the fit, covariates specified in the formula,
and groupsizes. Column 'dist' contains the
observed distances. The intercept, if included in the model, is not
included as a column in this model frame. (Test whether an intercept
is included using attr(terms(return$mf), "intercept")).
Column offset(...) contains group sizes associated with
the values of dist. Name of the group size column is "offset(...)",
not "groupsize(...)", so that group sizes can be treated offsets in
other R routines. The mf component is a proper model.frame and contains
both terms and contrasts attributes. This model frame
contains only non-missing distances between w.lo and w.hi.
data: The original nested data frame subset to information required
to complete distance estimation. This data frame contains information
on replication (i.e., rows are sites and are re-sampled during bootstrapping),
missing distances, missing transect lengths, and distances outside the observation
strip (below w.lo or above w.hi).
formula: The distance function's formula.
dataName: Name of the original nested data frame.
likelihood: The name of the likelihood fitted to observation
distances.
w.lo: Left-truncation value used during the fit.
w.hi: Right-truncation value used during the fit.
expansions: The number of expansion terms.
series: The type of expansion used during estimation. This is
only relevant if expansions > 0.
x.scl: The distance at which the function has been scaled to some value.
This is the x at which g(x) = g.x.scl.
g.x.scl: The height of the distance function at distance x.scl.
outputUnits: A list of type symbolic_units containing the
physical measurement units used during estimation.
asymptoticSE: A logical scalar indication whether the
variance-covariance matrix in component varcovar is
asymptotic (TRUE; estimated from the Hessian) or bootstrap (FALSE;
estimated by bootstrap resampling).
optimizer: The optimizing routine used.
call: The original function call.
nCovars: The number of exogenous covariates fitted in the
distance function. Does not include the intercept.
LhoodType: The type of likelihood fitted. Currently, only 'parametric'
types are fitted.
To specify non-unity group sizes, use groupsize()
on the RHS of formula. When group sizes are not all 1, they must appear in a column
of the 'detections' list-column of data.
For example, d ~ habitat + groupsize(number) specifies
distances in column d, one covariate
named habitat, and that column number
contains the number of individuals
associated with each detection. If group sizes are not specified,
all group sizes are assumed to be 1.
Factor contrasts in Rdistance are specified
the same way as in lm or glm.
By default, Rdistance uses
contrasts in getOption("contrasts"). To change contrasts, use a statement
like options(contrasts = c(unordered = "contr.SAS", ordered = "contr.poly")). Or, to set contrasts for a
specific factor in the input data frame, use
contrasts(df$A) <- "contr.sum" or similar.
See contrasts() or the contrasts.arg
of model.matrix().
As of Rdistance version 3.0.0, measurement units are
require on all physical distances.
Requiring units ensures that internal calculations and results
(e.g., ESW and abundance) are correct
and that output units are clear.
Physical distances are required on
off-transect distances, radial distances, truncation distances
(w.lo, unless it is zero; and w.hi, unless it is NULL),
scale locations (x.scl, unless it is zero),
line-transect lengths, and study area size. All units are
1-dimensional except those on study area, which are 2-dimensional.
Physical measurement units can vary. For example,
off-transect distances can be meters ("m"), w.hi can be inches ("in"),
and w.lo can be kilometers ("km"). Internally, all distances are
converted to the units specified by outputUnits
(or the units of input distances if
outputUnits is NULL), and
all output is reported
in units of outputUnits. Valid conversions must exist between
units or an error is thrown (e.g., meters cannot convert
into hectares).
Measurement units can be assigned using one of Rdistance's
unit helper routines (see help(unitHelpers)), Rdistance's
setUnits() function, or units::set_units()
See units::valid_udunits()
for a list of valid symbolic units.
If measurements are truly unit-less, or measurement units are unknown,
set options(Rdist_requireUnits = FALSE). This suppresses
all unit checks and conversions. Users are on their own here
and must make sure all inputs are scaled correctly so that internal
computations are correct and output units are known.
Buckland, S.T., D.R. Anderson, K.P. Burnham, J.L. Laake, D.L. Borchers, and L. Thomas. (2001) Introduction to distance sampling: estimating abundance of biological populations. Oxford University Press, Oxford, UK.
abundEstim(), autoDistSamp().
Likelihood-specific help files (e.g., halfnorm.like()).
# Sparrow line transect example data(sparrowDetectionData) data(sparrowSiteData) sparrowDf <- RdistDf(sparrowSiteData, sparrowDetectionData) dfunc <- dfuncEstim(sparrowDf, formula = dist ~ 1 ) summary(dfunc) data(sparrowDfuncObserver) # pre-estimated object ## Not run: # Commands that produced 'sparrowDfuncObserver' sparrowDfuncObserver <- sparrowDf |> dfuncEstim( formula = dist ~ observer ) ## End(Not run) sparrowDfuncObserver summary(sparrowDfuncObserver) plot(sparrowDfuncObserver) plot(sparrowDfuncObserver , newdata = data.frame(observer = c("obs1", "obs2", "obs3" , "obs4", "obs5")))# Sparrow line transect example data(sparrowDetectionData) data(sparrowSiteData) sparrowDf <- RdistDf(sparrowSiteData, sparrowDetectionData) dfunc <- dfuncEstim(sparrowDf, formula = dist ~ 1 ) summary(dfunc) data(sparrowDfuncObserver) # pre-estimated object ## Not run: # Commands that produced 'sparrowDfuncObserver' sparrowDfuncObserver <- sparrowDf |> dfuncEstim( formula = dist ~ observer ) ## End(Not run) sparrowDfuncObserver summary(sparrowDfuncObserver) plot(sparrowDfuncObserver) plot(sparrowDfuncObserver , newdata = data.frame(observer = c("obs1", "obs2", "obs3" , "obs4", "obs5")))
Utility function to produce error messages suitable for stop
dfuncEstimErrMessage(txt, attri)dfuncEstimErrMessage(txt, attri)
txt |
A text string describing the error. |
attri |
An attribute to report. |
A string
Return a character vector of
likelihoods which are differentiable and hence
the second derivative method can be used to estimate
variance-covariance. Any likelihoods not in this list
must use bootstrapping. This vector is polled in
a few Rdistance routines, notably parseModel
and varcovarEstim.
differentiableLikelihoods()differentiableLikelihoods()
A character vector of differentiable likelihoods.
Extract the observation distances (i.e., responses for an Rdistance model) from an Rdistance model frame.
distances(ml, na.rm = TRUE, ...)distances(ml, na.rm = TRUE, ...)
ml |
Either a Rdistance 'model frame' or an Rdistance
'fitted object'. Both are of class "dfunc".
Rdistance 'model frames' are lists containing components
necessary to estimate a distance function, but no estimates.
Rdistance 'model frames' are typically
produced by calls to |
na.rm |
Whether to include or exclude missing distance values.
In |
... |
Ignored |
A vector containing observation distances contained in the Rdistance model frame.
data(sparrowDf) sparrowModel <- parseModel( sparrowDf, dist ~ observer ) stats::model.response(sparrowModel$mf) distances(sparrowModel) # same, but future-proofdata(sparrowDf) sparrowModel <- parseModel( sparrowDf, dist ~ observer ) stats::model.response(sparrowModel$mf) distances(sparrowModel) # same, but future-proof
Computes Effective Detection Radius (EDR) for estimated
detection functions on point transects.
See ESW() is for line transects.
EDR(object, newdata = NULL)EDR(object, newdata = NULL)
object |
An Rdistance model frame or fitted distance function,
normally produced by a call to |
newdata |
A data frame containing new values of
the covariates at which to evaluate the distance functions.
If |
Effective Detection Radius is the integral under the detection function times distance.
If newdata is present, the returned value is
a vector of effective sampling distances associated with
covariate values in newdata. Length of return
in this case is the number of rows in newdata.
If newdata is NULL, the returned value is a vector
of effective sampling distances associated with covariate
values in object. Length of return in this case
is the number of detected groups. The returned vector
has measurement units, i.e., object$outputUnits.
dfuncEstim(), ESW(),
effectiveDistance()
# Load example thrasher data (point transect survey type) data(thrasherDf) # Fit half-normal detection function dfunc <- thrasherDf |> dfuncEstim(formula=dist~bare) # Compute effective detection radius (EDR) EDR(dfunc) # vector length 192 effectiveDistance(dfunc) # same EDR(dfunc, newdata = data.frame(bare=30)) # vector length 1# Load example thrasher data (point transect survey type) data(thrasherDf) # Fit half-normal detection function dfunc <- thrasherDf |> dfuncEstim(formula=dist~bare) # Compute effective detection radius (EDR) EDR(dfunc) # vector length 192 effectiveDistance(dfunc) # same EDR(dfunc, newdata = data.frame(bare=30)) # vector length 1
Computes Effective Strip Width (ESW) for line-transect detection functions, or the analogous Effective Detection Radius (EDR) for point-transect detection functions.
effectiveDistance(object, newdata = NULL)effectiveDistance(object, newdata = NULL)
object |
An Rdistance model frame or fitted distance function,
normally produced by a call to |
newdata |
A data frame containing new values for
covariates at which either
ESW's or EDR's will be computed. If NULL and
|
Serves as a wrapper for
ESW() and EDR().
Effective distances are areas under scaled distance functions (i.e., area under g(x)). Areas are exact for functions whose integral is known (e.g., negexp). Numeric integration is used for all others.
If newdata is present, the returned value is
a vector of effective sampling distances associated with
covariate values in newdata. Length of return
in this case is the number of rows in newdata.
If newdata is NULL, the returned value is a vector
of effective sampling distances associated with covariate
values in object. Length of return in this case
is the number of detected groups. The returned vector
has measurement units, i.e., object$outputUnits.
dfuncEstim(),
ESW(),
EDR(),
integrateNumeric(),
integrateNegexpLines()
Extract effort information from an Rdistance data frame. Effort is length of line-transects or number of points on point-transects.
effort(x, ...)effort(x, ...)
x |
Either an estimated distance function, output by
|
... |
Ignored |
A vector containing effort. If line-transects, return is length of transects, with units. If point-transects, return is number of points (integers, no units). Vector length is number of transects. If input is not an RdistDf or estimated distance function, return is NULL.
data(sparrowDf) effort(sparrowDf) fit <- dfuncEstim(sparrowDf, dist ~ 1) effort(fit)data(sparrowDf) effort(sparrowDf) fit <- dfuncEstim(sparrowDf, dist ~ 1) effort(fit)
Constructs a string stating what is "unknown" that is suitable for use in warning and error functions.
errDataUnk(txt, attri)errDataUnk(txt, attri)
txt |
Text. The "unknown" we are looking for. |
attri |
Attribute description we are looking for. |
A descriptive string, suitable for warning or error.
Estimate abundance from an Rdistance fitted model.
This function is called internally by abundEstim. Most users will call
abundEstim to estimate abundance unless they are running simulations or
bootstrapping.
estimateN(object, area = NULL, propUnitSurveyed = 1)estimateN(object, area = NULL, propUnitSurveyed = 1)
object |
An Rdistance model frame or fitted distance function,
normally produced by a call to |
area |
A scalar containing the total area of inference. Usually, this is
study area size. If |
propUnitSurveyed |
A scalar or vector of real numbers between 0 and 1.
The proportion of the default sampling unit that
was surveyed. If both sides of line transects were observed,
|
The abundance estimate for line-transect surveys (if no covariates are included in the detection function and both sides of the transect are observed) is
where n is total number of sighted individuals
(i.e., sum(groupSizes(dfunc))), L is the total length of
surveyed transect (i.e., sum(effort(dfunc))),
and ESW is effective strip width
computed from the estimated distance function (i.e., ESW(dfunc)).
If only one side of transects were observed, the "2" in the denominator
is not present (or, replaced with a "1").
The abundance estimate for point transect surveys (if no covariates are included) is
where n is total number of sighted individuals (i.e., sum(groupSizes(dfunc))),
P is the total number of surveyed points (i.e., sum(effort(dfunc))),
and ESR is effective search radius
computed from the estimated distance function (i.e., ESR(dfunc)).
This routine, abundEstim, estimates abundance on the
entire study area. Site-specific density estimates
are computed by
predict(x, type = "density"), which returns a
tibble containing density and abundance on the area surveyed by every
transect.
A list containing the following components:
density |
Estimated density in the surveyed area. |
abundance |
Estimated abundance on the study area. Equals density if area is not specified. |
n.groups |
The number of detected groups (not individuals, unless all group sizes = 1). |
n.seen |
The number of individuals (sum of group sizes). |
area |
Total area of inference. Study area size |
surveyedUnits |
Number of surveyed sites. This is total transect length for line-transects or number of points for point-transects. This total transect length does not include transects with missing lengths. |
propUnitSurveyed |
Proportion of the standard survey unit that was observed |
avg.group.size |
Average group size on non-NA transects |
w |
Strip width. |
pDetection |
Probability of detection. |
For line-transects that do not involve covariates, object$density is object$n.seen / (2 * propUnitSurveyed * object$w * object$pDetection * object$surveyedUnits)
Returns effective strip width (ESW) for
line-transect detection functions.
See EDR() is for point transects.
ESW(object, newdata = NULL)ESW(object, newdata = NULL)
object |
An Rdistance model frame or fitted distance function,
normally produced by a call to |
newdata |
A data frame containing new values for
covariates at which either
ESW's or EDR's will be computed. If NULL and
|
ESW is area under
a scaled distance function between its
left-truncation limit (obj$w.lo) and its right-truncation
limit (obj$w.hi).
If detection does not decline with distance,
the detection function is flat (horizontal), and
area under the detection
function is .
If, in this case, , effective sampling distance is
the half-width of the surveys,
If newdata is present, the returned value is
a vector of effective sampling distances associated with
covariate values in newdata. Length of return
in this case is the number of rows in newdata.
If newdata is NULL, the returned value is a vector
of effective sampling distances associated with covariate
values in object. Length of return in this case
is the number of detected groups. The returned vector
has measurement units, i.e., object$outputUnits.
dfuncEstim(), EDR(),
effectiveDistance()
data(sparrowDfuncObserver) ESW(sparrowDfuncObserver) # vector length 356 = number of groups ESW(sparrowDfuncObserver, newdata = data.frame(observer = c("obs2", "obs4"))) # vector length 2data(sparrowDfuncObserver) ESW(sparrowDfuncObserver) # vector length 356 = number of groups ESW(sparrowDfuncObserver, newdata = data.frame(observer = c("obs2", "obs4"))) # vector length 2
The domain to which expansion factors are applied varies by likelihood.
When the domain varies, it depends on parameter values. For example,
oneStep expansion factors are applied between
0 and the likelihood's first parameter (), which
varies by covariates. This function computes the likelihood-specific
expansion domains
expandW(ml, params, k)expandW(ml, params, k)
ml |
Either a Rdistance 'model frame' or an Rdistance
'fitted object'. Both are of class "dfunc".
Rdistance 'model frames' are lists containing components
necessary to estimate a distance function, but no estimates.
Rdistance 'model frames' are typically
produced by calls to |
params |
A matrix of likelihood parameters. Size is (number of distances) X ((number of cases) + (number of non-covariate parameter)). |
k |
The number of cases. |
A plain vector of length k containing expansion domains, with units.
Compute "expansion" terms that modify the shape of a base distance function.
expansionTerms(a, d, series, nexp, w)expansionTerms(a, d, series, nexp, w)
a |
A vector or matrix of (estimated) coefficients.
|
d |
A vector or 1-column matrix of
distances at which to evaluate
the expansion terms. |
series |
If |
nexp |
Number of expansion terms. Integer from 0 to 5. |
w |
A vector specifying strip width for every 'case'
in |
Expansion terms modify the base likelihood function
and are used to incorporate "wiggle". The modified distance function is,
key * expTerms where key is a vector of values in
the base likelihood function (e.g., halfnorm.like()$L.unscaled)
and expTerms is the
matrix returned by this routine. In equation form,
,
where = the the number of expansions (nexp),
are expansion terms for distance , and
are the (estimated)
expansion term coefficients.
If nexp equals 0, 1 is returned.
If nexp is greater than 0, a matrix of
size X containing expansion terms,
where = length(d) and = nrow(a). The
expansion series associated with row of a
are in column of the return. i.e.,
element (,) of the return is
a1 <- c(log(40), 0.5, -.5) a2 <- c(log(40), 0.25, -.5) dists <- seq(0, 100, length = 100) %m%. w = 100 %m%. expTerms1 <- expansionTerms(a1, dists, "cosine", 2, w) expTerms2 <- expansionTerms(a2, dists, "cosine", 2, w) plot(dists, expTerms2, ylim = c(0,2.5)) points(dists, expTerms1, pch = 16) # Same as above a <- rbind(a1, a2) w <- rep(w, nrow(a)) expTerms <- expansionTerms(a, dists, "cosine", 2, w) matlines(dists, expTerms, lwd=2, col=c("red", "blue"), lty=1) # Showing key and expansions key <- halfnorm.like(log(40), dists, matrix(1,length(dists),1))$L.unscaled plot(dists, key, type = "l", col = "blue", ylim=c(0,1.5)) lines(dists, key * expTerms1, col = "red") lines(dists, key * expTerms2, col = "purple")a1 <- c(log(40), 0.5, -.5) a2 <- c(log(40), 0.25, -.5) dists <- seq(0, 100, length = 100) %m%. w = 100 %m%. expTerms1 <- expansionTerms(a1, dists, "cosine", 2, w) expTerms2 <- expansionTerms(a2, dists, "cosine", 2, w) plot(dists, expTerms2, ylim = c(0,2.5)) points(dists, expTerms1, pch = 16) # Same as above a <- rbind(a1, a2) w <- rep(w, nrow(a)) expTerms <- expansionTerms(a, dists, "cosine", 2, w) matlines(dists, expTerms, lwd=2, col=c("red", "blue"), lty=1) # Showing key and expansions key <- halfnorm.like(log(40), dists, matrix(1,length(dists),1))$L.unscaled plot(dists, key, type = "l", col = "blue", ylim=c(0,1.5)) lines(dists, key * expTerms1, col = "red") lines(dists, key * expTerms2, col = "purple")
Evaluate the gamma distance function for sighting distances, potentially including covariates and expansion terms
Gamma.like(a, dist, covars, w.hi = NULL)Gamma.like(a, dist, covars, w.hi = NULL)
a |
A vector or matrix of covariate
and expansion term
coefficients. If matrix, dimension is
k X p, where
k = |
dist |
A numeric vector of length n or a single-column matrix (dimension nX1) containing detection distances at which to evaluate the likelihood. |
covars |
A numeric vector of length q or a
matrix of dimension nXq containing
covariate values
associated with distances in argument |
w.hi |
A numeric scalar containing maximum distance. The right-hand cutoff or upper limit. Ignored by some likelihoods (such as halfnorm, negexp, and hazrate), but is a fixed parameter in other likelihoods (such as oneStep and uniform). |
The Rdistance implementation of a Gamma distance function follows Becker and Quang (2009). Rdistance's Gamma distance function is
where is the shape parameter, is
the scale parameter, and .
is the mode of the Gamma function, and in Rdistance it's
scaled to have a maximum of 1.0 at .
The scale parameter is a function of the shape parameter
and sighting covariates, i.e.,
where is a vector of covariate values associated with distance
(i.e., a row of covars), is a vector of the
first (=ncol(covars)) values of the first argument
of the function (a), and
is a function of the shape parameter, i.e.,
The shape parameter is the
-st value in the function's first argument and is constrained to
be strictly greater than 1.0.
See Examples for use of GammaReparam() to compute
and from fitted object coefficients.
A list containing the following two components:
L.unscaled: A matrix of size
nXk (n = length dist; k = number of
cases = nrow(a))
containing likelihood values evaluated at
distances in dist.
Each row is associated with
a single distance, and each column is associated with
a single case (row of a). Values in this matrix are
the distance function which generally have .
These values are "unscaled" likelihood values; they must be
scaled (divided by) with the area under between w.lo and w.hi
to form proper likelihood values.
params: A nXbXk array
of the likelihood's (canonical) parameters in link space (i.e., on
log scale; b = number of canonical parameters in
the likelihood; k = number of cases).
Rows correspond to distances in dist. Columns
correspond to parameters (columns of a),
and pages correspond to cases (rows of a).
Becker, E. F., and P. X. Quang, 2009. A Gamma-Shaped Detection Function for Line-Transect Surveys with Mark-Recapture and Covariate Data. Journal of Agricultural, Biological, and Environmental Statistics 14(2):207-223.
dfuncEstim(),
abundEstim(),
other <likelihood>.like functions
x <- seq(0, 100, length=100) covars <- matrix(1,100,1) # Plots showing changes in scale plot(x, Gamma.like(c(log(20),2.5), x, covars)$L.unscaled, type="l", col="red") lines(x, Gamma.like(c(log(40),2.5), x, covars)$L.unscaled, col="blue") # Plots showing changes in shape plot(x, Gamma.like(c(log(20),1.5), x, covars)$L.unscaled, type="l", col="red") lines(x, Gamma.like(c(log(20),2.5), x, covars)$L.unscaled, col="blue") lines(x, Gamma.like(c(log(20),4.5), x, covars)$L.unscaled, col="green") # Roll-your-own plot, showing "re-parameterization": # Assume fitted object coefficients are c(log(20), 4.5) fit <- list(par = c(log(20), 4.5)) # The distance function is then, gammaPar <- GammaReparam( scl = exp(fit$par[1]) , shp = fit$par[2] ) # returns scl=k*exp(x'B) scl <- gammaPar$scl shp <- gammaPar$shp m <- (shp - 1) * scl g <- (x / m)^(shp - 1) * exp(-(x - m) / scl) # distance function lines(x, g, lwd = 3, lty = 2, col="green3")x <- seq(0, 100, length=100) covars <- matrix(1,100,1) # Plots showing changes in scale plot(x, Gamma.like(c(log(20),2.5), x, covars)$L.unscaled, type="l", col="red") lines(x, Gamma.like(c(log(40),2.5), x, covars)$L.unscaled, col="blue") # Plots showing changes in shape plot(x, Gamma.like(c(log(20),1.5), x, covars)$L.unscaled, type="l", col="red") lines(x, Gamma.like(c(log(20),2.5), x, covars)$L.unscaled, col="blue") lines(x, Gamma.like(c(log(20),4.5), x, covars)$L.unscaled, col="green") # Roll-your-own plot, showing "re-parameterization": # Assume fitted object coefficients are c(log(20), 4.5) fit <- list(par = c(log(20), 4.5)) # The distance function is then, gammaPar <- GammaReparam( scl = exp(fit$par[1]) , shp = fit$par[2] ) # returns scl=k*exp(x'B) scl <- gammaPar$scl shp <- gammaPar$shp m <- (shp - 1) * scl g <- (x / m)^(shp - 1) * exp(-(x - m) / scl) # distance function lines(x, g, lwd = 3, lty = 2, col="green3")
Compute starting values and limits for the Gamma distance function.
Gamma.start.limits(ml)Gamma.start.limits(ml)
ml |
Either a Rdistance 'model frame' or an Rdistance
'fitted object'. Both are of class "dfunc".
Rdistance 'model frames' are lists containing components
necessary to estimate a distance function, but no estimates.
Rdistance 'model frames' are typically
produced by calls to |
A list containing the following components
start |
Vector of starting values for parameters of the likelihood and expansion terms. |
lowlimit |
Vector of lower limits for the likelihood parameters and expansion terms. |
uplimit |
Vector of upper limits for the likelihood parameters and expansion terms. |
names |
Vector of names for the likelihood parameters and expansion terms. |
The length of each vector in the return is:
(Num expansions) + 1 + 1*(like %in% c("hazrate")) + (Num Covars).
Compute mode (i.e., maximum) of Rdistance's version of the gamma distribution.
GammaModes(params)GammaModes(params)
params |
A matrix of Gamma distribution parameters. First column is the scale parameter, and is a function of covariates. Second column is the shape parameter. |
A vector of the locations of the gamma modes associated
with each row in the params matrix.
Transform Rdistance's version of the Gamma distribution parameters, which is that of Becker and Quan, into the version for use in R::dgamma() and elsewhere.
GammaReparam(shp, scl)GammaReparam(shp, scl)
shp |
Rdistance's shape parameter |
scl |
Rdistance's scale parameter. This parameter is related to covariates via exp(x'B). |
A list with components $shp and $scl, which are the re-parameterized versions of the input parameters suitable for us in R::dgamma().
'Details' section of Gamma.like()
for Rdistance's Gamma distribution
# Rdistance Gamma parameters Rd.scl <- 50 # must be >0 Rd.shp <- 1.5 # must be >1 # dgamma parameters dgParams <- GammaReparam(Rd.shp, Rd.scl) dgParams # Gamma distribution with (Rd.scl, Rd.shp) from 0 to 100 curve(dgamma(x, shape=dgParams$shp, scale = dgParams$scl) , from = 0 , to = 100) # Rdistance's version: same curve but scaled so maximum = 1 x <- seq(0, 100, length = 200) scl <- dgParams$scl shp <- dgParams$shp m <- (shp - 1) * scl g <- (x / m)^(shp - 1) * exp(-(x - m) / scl) # distance function plot(x, g, type = "l")# Rdistance Gamma parameters Rd.scl <- 50 # must be >0 Rd.shp <- 1.5 # must be >1 # dgamma parameters dgParams <- GammaReparam(Rd.shp, Rd.scl) dgParams # Gamma distribution with (Rd.scl, Rd.shp) from 0 to 100 curve(dgamma(x, shape=dgParams$shp, scale = dgParams$scl) , from = 0 , to = 100) # Rdistance's version: same curve but scaled so maximum = 1 x <- seq(0, 100, length = 200) scl <- dgParams$scl shp <- dgParams$shp m <- (shp - 1) * scl g <- (x / m)^(shp - 1) * exp(-(x - m) / scl) # distance function plot(x, g, type = "l")
Set the number of cores for parallel operations. Also convert integer 'parallel' to TRUE-FALSE for ease.
getNCores(parallel)getNCores(parallel)
parallel |
A logical scalar, or a positive integer; ignored unless
confidence intervals are requested (i.e., |
Input parallel <= 0 is converted to 1.
Input parallel > maxCores is converted to maxCores.
If input parallel is numeric, is first converted to integer
by rounding down.
A list with components $parallel and $cores. $parallel is logical, T for parallel operations, F o.w. $cores is the number of cores to use during parallel operations. $cores is integer in the range 1, 2, ..., max(Available cores).
Extract the group size information from an Rdistance model frame.
groupSizes(ml, ...)groupSizes(ml, ...)
ml |
Either a Rdistance 'model frame' or an Rdistance
'fitted object'. Both are of class "dfunc".
Rdistance 'model frames' are lists containing components
necessary to estimate a distance function, but no estimates.
Rdistance 'model frames' are typically
produced by calls to |
... |
Ignored |
A vector containing group sizes contained in the Rdistance model frame or fitted object.
data("sparrowDf") sparrowModel <- parseModel( sparrowDf, dist ~ observer ) stats::model.offset(sparrowModel$mf) groupSizes(sparrowModel) # same, but future-proof sparrowModel <- parseModel( sparrowDf , dist ~ observer + groupsize(groupsize) ) groupSizes(sparrowModel)data("sparrowDf") sparrowModel <- parseModel( sparrowDf, dist ~ observer ) stats::model.offset(sparrowModel$mf) groupSizes(sparrowModel) # same, but future-proof sparrowModel <- parseModel( sparrowDf , dist ~ observer + groupsize(groupsize) ) groupSizes(sparrowModel)
Estimate distance function scaling factor , g(0) or g(x), for a specified distance function.
gxEstim(fit)gxEstim(fit)
fit |
An estimated |
This routine scales sightability such that
g(x.scl) = g.x.scl, where g() is the sightability function.
Specification of x.scl and g.x.scl covers several estimation cases:
g(0) = 1 : (the default) Inputs are x.scl = 0, g.x.scl = 1.
If w.lo > 0, x.scl will be set to w.lo
so technically this case is g(w.low) = 1.
User supplied probability at specified distance: Inputs are
x.scl = a number greater than or equal
to w.lo, g.x.scl = a number between 0 and 1. This case
covers situations where sightability on the transect (distance 0) is
not perfect. This case assumes researchers have an independent
estimate of sightability at distance
x.scl off the transect. For example, researchers could be
using multiple
observers to estimate that sightability at distance x.scl
is g.x.scl.
Maximum sightability specified: Inputs
are x.scl="max", g.x.scl = a number
between 0 and 1. In this case,
g() is scaled such that its maximum value is g.x.scl.
This routine computes the distance at which g() is maximum, sets
g()'s height there to g.x.scl, and returns x.max where
x.max is the distance at which g is maximized. This case covers the
common aerial survey situation where maximum sightability is slightly
off the transect, but the distance at which the maximum occurs
is unknown.
Double observer system: Inputs are
x.scl="max", g.x.scl = \<a data frame\>.
In this case, g(x) = h, where x is the distance that
maximizes g and h is the height of g() at x
computed from the double observer data frame (see below for
structure of the double observer data frame).
Distance of independence specified, height computed from double
observer system: Inputs are
x.scl = a number greater than or equal to w.lo
g.x.scl = a data frame. In this case, g(x.scl) = h,
where h is computed from the double observer data frame
(see below for structure of the double observer data frame).
When x.scl, g.x.scl, or observer are NULL, the routine
will look for and use $call.x.scl, or $call.g.x.scl, or
$call.observer components of the fit object for whichever
of these three parameters is missing. Later, different
values can be specified in a direct call to F.gx.estim
without having to re-estimate the distance function. Because of this feature,
the default values in dfuncEstim are x.scl = 0 and
g.x.scl = 1 and observer = "both".
A list comprised of the following components:
x.scl |
The value of x (distance) at which g() is evaluated. |
comp2 |
The estimated value of g() when evaluated at |
When g.x.scl is a data frame, it is assumed to contain
the components $obsby.1 and $obsby.2 (no flexibility on names).
Each row in the data frame contains data from one sighted target.
The $obsby.1 and $obsby.2 components are
TRUE/FALSE (logical) vectors indicating whether
observer 1 (obsby.1) or observer 2 (obsby.2) spotted the target.
data(sparrowDf) fit <- dfuncEstim(sparrowDf, dist ~ groupsize(groupsize)) gxEstim(fit) fit <- dfuncEstim(sparrowDf, dist ~ groupsize(groupsize) , x.scl = 50 %m%. , g.x.scl = 0.75) gxEstim(fit) plot(fit) abline(h=0.75) abline(v=50%m%.)data(sparrowDf) fit <- dfuncEstim(sparrowDf, dist ~ groupsize(groupsize)) gxEstim(fit) fit <- dfuncEstim(sparrowDf, dist ~ groupsize(groupsize) , x.scl = 50 %m%. , g.x.scl = 0.75) gxEstim(fit) plot(fit) abline(h=0.75) abline(v=50%m%.)
Evaluate the half-normal distance function, for sighting distances, potentially including covariates and expansion terms
halfnorm.like(a, dist, covars, w.hi = NULL)halfnorm.like(a, dist, covars, w.hi = NULL)
a |
A vector or matrix of covariate
and expansion term
coefficients. If matrix, dimension is
k X p, where
k = |
dist |
A numeric vector of length n or a single-column matrix (dimension nX1) containing detection distances at which to evaluate the likelihood. |
covars |
A numeric vector of length q or a
matrix of dimension nXq containing
covariate values
associated with distances in argument |
w.hi |
A numeric scalar containing maximum distance. The right-hand cutoff or upper limit. Ignored by some likelihoods (such as halfnorm, negexp, and hazrate), but is a fixed parameter in other likelihoods (such as oneStep and uniform). |
The half-normal distance function is
where , is a vector of
covariate values associated with distance
(i.e., a row of covars), and
is a vector of the first $q$ (=ncol(covars))
values in argument a.
Some authors parameterize the halfnorm without
the "2" in the denominator of the exponent.
Rdistance includes
"2" in this denominator to make
quantiles of the half normal agree with
the standard normal. This means that half-normal
coefficients in
Rdistance (i.e., ) can be
interpreted as normal standard errors.
Approximately 95% of distances should
occur between 0 and 2.
A list containing the following two components:
L.unscaled: A matrix of size
nXk (n = length dist; k = number of
cases = nrow(a))
containing likelihood values evaluated at
distances in dist.
Each row is associated with
a single distance, and each column is associated with
a single case (row of a). Values in this matrix are
the distance function which generally have .
These values are "unscaled" likelihood values; they must be
scaled (divided by) with the area under between w.lo and w.hi
to form proper likelihood values.
params: A nXbXk array
of the likelihood's (canonical) parameters in link space (i.e., on
log scale; b = number of canonical parameters in
the likelihood; k = number of cases).
Rows correspond to distances in dist. Columns
correspond to parameters (columns of a),
and pages correspond to cases (rows of a).
dfuncEstim(),
abundEstim(),
other <likelihood>.like functions
d <- seq(0, 100, length=100) covs <- matrix(1,length(d),1) halfnorm.like(log(20), d, covs) plot(d, halfnorm.like(log(20), d, covs)$L.unscaled, type="l", col="red") lines(d, halfnorm.like(log(40), d, covs)$L.unscaled, col="blue") # Evaluate 3 functions at once using matrix of coefficients: # sigma ~ 20, 30, 40 coefs <- matrix(log(c(7.39,7.33, 4.48,44.80, 2.72,216.54)) , byrow = TRUE , ncol=2) # (3 coef vectors)X(2 covars) covs <- matrix(c(rep(1,length(d)) , rep(0.5,length(d))) , nrow = length(d)) # 100 X 2 L <- halfnorm.like( coefs, d, covs ) L$L.unscaled # 100 X (3 coef vectors) L$params # 100 X (3 coef vectors); ~ log(c(20,30,40)) matplot(d, L$L.unscaled, type="l")d <- seq(0, 100, length=100) covs <- matrix(1,length(d),1) halfnorm.like(log(20), d, covs) plot(d, halfnorm.like(log(20), d, covs)$L.unscaled, type="l", col="red") lines(d, halfnorm.like(log(40), d, covs)$L.unscaled, col="blue") # Evaluate 3 functions at once using matrix of coefficients: # sigma ~ 20, 30, 40 coefs <- matrix(log(c(7.39,7.33, 4.48,44.80, 2.72,216.54)) , byrow = TRUE , ncol=2) # (3 coef vectors)X(2 covars) covs <- matrix(c(rep(1,length(d)) , rep(0.5,length(d))) , nrow = length(d)) # 100 X 2 L <- halfnorm.like( coefs, d, covs ) L$L.unscaled # 100 X (3 coef vectors) L$params # 100 X (3 coef vectors); ~ log(c(20,30,40)) matplot(d, L$L.unscaled, type="l")
Compute starting values and limits for the half normal distance function.
halfnorm.start.limits(ml)halfnorm.start.limits(ml)
ml |
Either a Rdistance 'model frame' or an Rdistance
'fitted object'. Both are of class "dfunc".
Rdistance 'model frames' are lists containing components
necessary to estimate a distance function, but no estimates.
Rdistance 'model frames' are typically
produced by calls to |
A list containing the following components
start |
Vector of starting values for parameters of the likelihood and expansion terms. |
lowlimit |
Vector of lower limits for the likelihood parameters and expansion terms. |
uplimit |
Vector of upper limits for the likelihood parameters and expansion terms. |
names |
Vector of names for the likelihood parameters and expansion terms. |
The length of each vector in the return is:
(Num expansions) + 1 + 1*(like %in% c("hazrate")) + (Num Covars).
Computes the hazard rate distance function.
hazrate.like(a, dist, covars, w.hi = NULL)hazrate.like(a, dist, covars, w.hi = NULL)
a |
A vector or matrix of covariate
and expansion term
coefficients. If matrix, dimension is
k X p, where
k = |
dist |
A numeric vector of length n or a single-column matrix (dimension nX1) containing detection distances at which to evaluate the likelihood. |
covars |
A numeric vector of length q or a
matrix of dimension nXq containing
covariate values
associated with distances in argument |
w.hi |
A numeric scalar containing maximum distance. The right-hand cutoff or upper limit. Ignored by some likelihoods (such as halfnorm, negexp, and hazrate), but is a fixed parameter in other likelihoods (such as oneStep and uniform). |
The hazard rate likelihood is
where determines location
(i.e., distance at which the function equals 1 - exp(-1) = 0.632),
and determines slope of the function
at (i.e., larger k equals steeper
slope at ). For distance analysis,
the valid range for both and k is
.
A list containing the following two components:
L.unscaled: A matrix of size
nXk (n = length dist; k = number of
cases = nrow(a))
containing likelihood values evaluated at
distances in dist.
Each row is associated with
a single distance, and each column is associated with
a single case (row of a). Values in this matrix are
the distance function which generally have .
These values are "unscaled" likelihood values; they must be
scaled (divided by) with the area under between w.lo and w.hi
to form proper likelihood values.
params: A nXbXk array
of the likelihood's (canonical) parameters in link space (i.e., on
log scale; b = number of canonical parameters in
the likelihood; k = number of cases).
Rows correspond to distances in dist. Columns
correspond to parameters (columns of a),
and pages correspond to cases (rows of a).
dfuncEstim(),
abundEstim(),
other <likelihood>.like functions
d <- seq(0, 100, length=100) covs <- matrix(1,length(d),1) hazrate.like(c(log(20), 5), d, covs) # Changing location parameter plot(d, hazrate.like(c(log(20), 5), d, covs)$L.unscaled, type="l", col="red") lines(d, hazrate.like(c(log(40), 5), d, covs)$L.unscaled, col="blue") abline(h = 1 - exp(-1), lty = 2) abline(v = c(20,40), lty = 2) # Changing slope parameter plot(d, hazrate.like(c(log(50), 20), d, covs)$L.unscaled, type="l", col="red") lines(d, hazrate.like(c(log(50), 2), d, covs)$L.unscaled, col="blue") abline(h = 1 - exp(-1), lty = 2) abline(v = 50, lty = 2)d <- seq(0, 100, length=100) covs <- matrix(1,length(d),1) hazrate.like(c(log(20), 5), d, covs) # Changing location parameter plot(d, hazrate.like(c(log(20), 5), d, covs)$L.unscaled, type="l", col="red") lines(d, hazrate.like(c(log(40), 5), d, covs)$L.unscaled, col="blue") abline(h = 1 - exp(-1), lty = 2) abline(v = c(20,40), lty = 2) # Changing slope parameter plot(d, hazrate.like(c(log(50), 20), d, covs)$L.unscaled, type="l", col="red") lines(d, hazrate.like(c(log(50), 2), d, covs)$L.unscaled, col="blue") abline(h = 1 - exp(-1), lty = 2) abline(v = 50, lty = 2)
Compute starting values and limits for the hazard rate distance function.
hazrate.start.limits(ml)hazrate.start.limits(ml)
ml |
Either a Rdistance 'model frame' or an Rdistance
'fitted object'. Both are of class "dfunc".
Rdistance 'model frames' are lists containing components
necessary to estimate a distance function, but no estimates.
Rdistance 'model frames' are typically
produced by calls to |
A list containing the following components
start |
Vector of starting values for parameters of the likelihood and expansion terms. |
lowlimit |
Vector of lower limits for the likelihood parameters and expansion terms. |
uplimit |
Vector of upper limits for the likelihood parameters and expansion terms. |
names |
Vector of names for the likelihood parameters and expansion terms. |
The length of each vector in the return is:
(Num expansions) + 1 + 1*(like %in% c("hazrate")) + (Num Covars).
Computes Hermite expansion terms for use in distance analysis. The Hermite (and other expansions) allow "wiggle" in estimated distance functions.
hermite.expansion(x, expansions)hermite.expansion(x, expansions)
x |
A numeric matrix of distances at which to evaluate
the expansion series. For distance analysis, |
expansions |
A scalar specifying the number of expansion terms to compute. Must be one of the integers 1, 2, 3, 4, or 5. |
There are, in general, several expansions that can be called Hermite. Let .
Rdistance's Hermite expansions are:
First term:
Second term:
Third term:
Fourth term:
The maximum number of expansion terms computed is 4.
A 3D array of size nrow(x) X ncol(x) X expansions.
The 'pages' (3rd dimension) of this array are the cosine expansions of
x. i.e., page 1 is the first expansion term of x,
page 2 is the second expansion term of x, etc.
dfuncEstim()
, cosine.expansion()
, sine.expansion()
, simple.expansion().
x <- matrix(seq(0, 1, length = 200), ncol = 1) herm.expn <- hermite.expansion(x, 4) plot(range(x), range(herm.expn), type="n") matlines(x, herm.expn[,1,1:4], col=rainbow(4), lty = 1)x <- matrix(seq(0, 1, length = 200), ncol = 1) herm.expn <- hermite.expansion(x, 4) plot(range(x), range(herm.expn), type="n") matlines(x, herm.expn[,1,1:4], col=rainbow(4), lty = 1)
Call R native function 'nlminb' to perform optimization.
HookeJeeves(ml, strt.lims)HookeJeeves(ml, strt.lims)
ml |
Either a Rdistance 'model frame' or an Rdistance
'fitted object'. Both are of class "dfunc".
Rdistance 'model frames' are lists containing components
necessary to estimate a distance function, but no estimates.
Rdistance 'model frames' are typically
produced by calls to |
strt.lims |
A list containing start, low, and high limits for
parameters of the requested likelihood. This list is typically produced
by a call to |
A list with following named components:
par = parameters
loglik = objective function value at minimum
convergence = 0 for yes, other for no
iterations = number of iterations
evaluations = function evaluations
message = a convergence message
varcovar = a variance covariance matrix of parameters
limits = low and high limits
Computes the cumulative function of the huber likelihood. The
cumulative function is proportional to the huber
cumulative distribution function, differing only by appropriate
scaling constant.
huber.cumFunc(x, t1, t2, p, w)huber.cumFunc(x, t1, t2, p, w)
x |
A vector of distances. Can have units or not (i.e., regular numeric). |
t1 |
A vector of values for the |
t2 |
A vector of values for the |
p |
A vector of values for the |
w |
A vector of maximum strip widths, the maximum distance.
If |
A vector of values from the huber cumulative function.
The huber cumulative function is
where is Rdistance's huber
likelihood. The only difference between this cumulative function,
and the cumulative distribution function
is the scaling constant. That is, the maximum of the cumulative
function is greater than 1 while the maximum cumulative distribution function
is exactly 1.
d <- -10:210 # Cumulative function fd <- huber.cumFunc(d, 125, 25, .05, 200) plot(d, fd, type="l") # Cumulative distribution function Fd <- fd / huber.cumFunc(200, 125, 25, .05, 200)d <- -10:210 # Cumulative function fd <- huber.cumFunc(d, 125, 25, .05, 200) plot(d, fd, type="l") # Cumulative distribution function Fd <- fd / huber.cumFunc(200, 125, 25, .05, 200)
Computes the Huber likelihood for use as a distance function. The Huber likelihood is a mixture of an inverted Huber loss and a uniform distribution. The distance function is quadratic from the lowest distance to its first parameter, linear from the first parameter to the sum of its first and second parameter, and constant after that.
huber.like(a, dist, covars, w.hi = NULL)huber.like(a, dist, covars, w.hi = NULL)
a |
A vector or matrix of covariate
and expansion term
coefficients. If matrix, dimension is
k X p, where
k = |
dist |
A numeric vector of length n or a single-column matrix (dimension nX1) containing detection distances at which to evaluate the likelihood. |
covars |
A numeric vector of length q or a
matrix of dimension nXq containing
covariate values
associated with distances in argument |
w.hi |
A numeric scalar containing maximum distance. The right-hand cutoff or upper limit. Ignored by some likelihoods (such as halfnorm, negexp, and hazrate), but is a fixed parameter in other likelihoods (such as oneStep and uniform). |
The 'huber' likelihood is an inverted version of the
Huber loss function, mixed with a uniform at the upper end, and contains
three canonical parameters.
The 'huber' distance function is a negative quadratic
between 0 and its first parameter ,
linear between and , and
constant after .
Specifically, the 'huber' likelihood is,
where
, = w.hi - w.lo,
and h(d) is Huber loss between 0 and , i.e.,
The first parameter, , is related to covariates,
i.e., . and
are constant across covariate values.
A list containing the following two components:
L.unscaled: A matrix of size
nXk (n = length dist; k = number of
cases = nrow(a))
containing likelihood values evaluated at
distances in dist.
Each row is associated with
a single distance, and each column is associated with
a single case (row of a). Values in this matrix are
the distance function which generally have .
These values are "unscaled" likelihood values; they must be
scaled (divided by) with the area under between w.lo and w.hi
to form proper likelihood values.
params: A nXbXk array
of the likelihood's (canonical) parameters in link space (i.e., on
log scale; b = number of canonical parameters in
the likelihood; k = number of cases).
Rows correspond to distances in dist. Columns
correspond to parameters (columns of a),
and pages correspond to cases (rows of a).
See Rdistance tutorials for a method to generate random observations from the Huber likelihood.
t1 <- c(65,80,120) t2 <- 60 p <- 0.05 a <- matrix(c(log(t1) , rep(t2,3) , rep(p,3)) , 3,3) d <- seq(0, 200, length=201) X <- matrix(1,length(d),1) y <- huber.like(a, d, X) # Plot showing covariate effects plot(range(d), range(y$L.unscaled) , type = "n", xlab = "d", ylab = "Huber(d)") for(i in 1:3){ # Distance functions lines(d , y$L.unscaled[,i] , col = i , lwd= 2) # Quadradic to linear transitions points(exp(a[i,1]) , y$L.unscaled[(t1[i]-0.1) < d & d < (t1[i]+0.01),i] , pch = 16 , col = i ) # Linear to constant transition Theta <- exp(a[i,1]) + a[i,2] points(Theta , y$L.unscaled[(Theta-0.1) < d & d < (Theta+0.1),i] , pch = 15 , col = i ) }t1 <- c(65,80,120) t2 <- 60 p <- 0.05 a <- matrix(c(log(t1) , rep(t2,3) , rep(p,3)) , 3,3) d <- seq(0, 200, length=201) X <- matrix(1,length(d),1) y <- huber.like(a, d, X) # Plot showing covariate effects plot(range(d), range(y$L.unscaled) , type = "n", xlab = "d", ylab = "Huber(d)") for(i in 1:3){ # Distance functions lines(d , y$L.unscaled[,i] , col = i , lwd= 2) # Quadradic to linear transitions points(exp(a[i,1]) , y$L.unscaled[(t1[i]-0.1) < d & d < (t1[i]+0.01),i] , pch = 16 , col = i ) # Linear to constant transition Theta <- exp(a[i,1]) + a[i,2] points(Theta , y$L.unscaled[(Theta-0.1) < d & d < (Theta+0.1),i] , pch = 15 , col = i ) }
Computes starting values and limits for the 'huber' distance function.
huber.start.limits(ml)huber.start.limits(ml)
ml |
Either a Rdistance 'model frame' or an Rdistance
'fitted object'. Both are of class "dfunc".
Rdistance 'model frames' are lists containing components
necessary to estimate a distance function, but no estimates.
Rdistance 'model frames' are typically
produced by calls to |
A list containing the following components
start |
Vector of starting values for parameters of the likelihood and expansion terms. |
lowlimit |
Vector of lower limits for the likelihood parameters and expansion terms. |
uplimit |
Vector of upper limits for the likelihood parameters and expansion terms. |
names |
Vector of names for the likelihood parameters and expansion terms. |
The length of each vector in the return is:
(Num expansions) + 1 + 1*(like %in% c("hazrate")) + (Num Covars).
Compute break points in a oneStep likelihood and insert them into a sequence of distances. The idea is to insert a point just left and just right of the breaks so that they plot as vertical lines.
insertOneStepBreaks(obj, newData, xseq)insertOneStepBreaks(obj, newData, xseq)
obj |
A fitted Rdistance model object |
newData |
A data frame containing covariate values to use in prediction. |
xseq |
A vector of distances into which the break points are inserted. |
A vector like xseq, but with the break points
inserted.
Integrates under distances functions using exact integrals when possible. If exact integrals are not known, numerical integration is used.
integrateDfuncs(object, ml)integrateDfuncs(object, ml)
object |
Either an Rdistance fitted distance function
(an object that inherits from class "dfunc"; usually produced
by a call to |
ml |
Either a Rdistance 'model frame' or an Rdistance
'fitted object'. Both are of class "dfunc".
Rdistance 'model frames' are lists containing components
necessary to estimate a distance function, but no estimates.
Rdistance 'model frames' are typically
produced by calls to |
Let K be the integral under distance function g(x) (i.e., the output from this routine). In distance analysis, the observation likelihood being evaluated for maximization is the density, f(x) = g(x)/K. K is a key quantity in distance analysis and is called the "effective sampling distance".
A vector of areas under the distance functions represented in
object.
If object is a distance function and
newdata is specified, the returned vector's length is
nrow(newdata). If object is a distance function and
newdata is NULL,
returned vector's length is length(distances(object)). If
object is a matrix, return's length is
nrow(object).
Users will not normally call this function. It is called
internally by nLL() and effectiveDistance().
# Faking a model frame ml <- list( likelihood = "halfnorm" , expansions = 0 , w.lo = 0 %m% . , w.hi = 100 %m% . , Units = "m" ) attr(ml, "transType") <- "line" parms <- matrix(75, nrow = 1) integrateDfuncs(parms, ml) # check: Normal, 0 to 100, sd = 75, scaled to mode = 1 (pnorm(q = 100, mean = 0, sd = 75) - 0.5) * sqrt(2*pi)*75# Faking a model frame ml <- list( likelihood = "halfnorm" , expansions = 0 , w.lo = 0 %m% . , w.hi = 100 %m% . , Units = "m" ) attr(ml, "transType") <- "line" parms <- matrix(75, nrow = 1) integrateDfuncs(parms, ml) # check: Normal, 0 to 100, sd = 75, scaled to mode = 1 (pnorm(q = 100, mean = 0, sd = 75) - 0.5) * sqrt(2*pi)*75
Compute integral of the Gamma distance function for line-transect surveys.
integrateGammaLines( object, newdata = NULL, w.lo = NULL, w.hi = NULL, Units = NULL )integrateGammaLines( object, newdata = NULL, w.lo = NULL, w.hi = NULL, Units = NULL )
object |
Either an Rdistance fitted distance function
(an object that inherits from class "dfunc"; usually produced
by a call to |
newdata |
A data frame containing new values for
the distance function covariates. If NULL and
|
w.lo |
Minimum sighting distance or left-truncation value
if |
w.hi |
Maximum sighting distance or right-truncation value
if |
Units |
Physical units of sighting distances if
|
#' Returned integrals are
where , is the i-th estimated scale
parameter for the Gamma distance function, and is the mode of Gamma
(i.e., .
Rdistance computes the integral using R's base function
pgamma(), which for all intents and purposes is exact.
See also Gamma.like().
A vector of areas under the distance functions represented in
object.
If object is a distance function and
newdata is specified, the returned vector's length is
nrow(newdata). If object is a distance function and
newdata is NULL,
returned vector's length is length(distances(object)). If
object is a matrix, return's length is
nrow(object).
Users will not normally call this function. It is called
internally by nLL() and effectiveDistance().
integrateNumeric(); integrateNegexpLines();
integrateOneStepLines()
# Fake distance function object w/ minimum inputs for integration d <- rep(1,4) %m%. # Only units needed, not values obs <- factor(rep(c("obs1", "obs2"), 2)) beta <- c(4.0, -0.5, 1.5) # {'Intercept', b_1, shape} w.hi <- 125 w.lo <- 20 ml <- list( mf = model.frame(d ~ obs) , par = beta , likelihood = "Gamma" , w.lo = w.lo %#% "m" , w.hi = w.hi %#% "m" , expansions = 0 ) class(ml) <- "dfunc" integrateGammaLines(ml) # Check: Integral of Gamma density from 0 to w.hi-w.lo b <- exp(c(beta[1], beta[1] + beta[2])) B <- Rdistance::GammaReparam(shp = beta[3], scl = b) m <- (B$shp - 1)*B$scl f.at.m <- dgamma(m, shape = B$shp, scale = B$scl) intgral <- pgamma(q = w.hi - w.lo, shape = B$shp, scale = B$scl) / f.at.m intgral# Fake distance function object w/ minimum inputs for integration d <- rep(1,4) %m%. # Only units needed, not values obs <- factor(rep(c("obs1", "obs2"), 2)) beta <- c(4.0, -0.5, 1.5) # {'Intercept', b_1, shape} w.hi <- 125 w.lo <- 20 ml <- list( mf = model.frame(d ~ obs) , par = beta , likelihood = "Gamma" , w.lo = w.lo %#% "m" , w.hi = w.hi %#% "m" , expansions = 0 ) class(ml) <- "dfunc" integrateGammaLines(ml) # Check: Integral of Gamma density from 0 to w.hi-w.lo b <- exp(c(beta[1], beta[1] + beta[2])) B <- Rdistance::GammaReparam(shp = beta[3], scl = b) m <- (B$shp - 1)*B$scl f.at.m <- dgamma(m, shape = B$shp, scale = B$scl) intgral <- pgamma(q = w.hi - w.lo, shape = B$shp, scale = B$scl) / f.at.m intgral
Compute integral of the half-normal distance function for line-transect surveys.
integrateHalfnormLines( object, newdata = NULL, w.lo = NULL, w.hi = NULL, Units = NULL )integrateHalfnormLines( object, newdata = NULL, w.lo = NULL, w.hi = NULL, Units = NULL )
object |
Either an Rdistance fitted distance function
(an object that inherits from class "dfunc"; usually produced
by a call to |
newdata |
A data frame containing new values for
the distance function covariates. If NULL and
|
w.lo |
Minimum sighting distance or left-truncation value
if |
w.hi |
Maximum sighting distance or right-truncation value
if |
Units |
Physical units of sighting distances if
|
Returned integrals are
where , is the estimated half-normal distance
function parameter for the
i-th observed distance, and is the standard normal
cumulative probability function.
Rdistance uses R's base
function pnorm(), which for all intents and purposes is exact.
A vector of areas under the distance functions represented in
object.
If object is a distance function and
newdata is specified, the returned vector's length is
nrow(newdata). If object is a distance function and
newdata is NULL,
returned vector's length is length(distances(object)). If
object is a matrix, return's length is
nrow(object).
Users will not normally call this function. It is called
internally by nLL() and effectiveDistance().
integrateNumeric(); integrateNegexpLines();
integrateOneStepLines()
# Fake distance function object w/ minimum inputs for integration d <- rep(1,4) %m%. # Only units needed, not values obs <- factor(rep(c("obs1", "obs2"), 2)) beta <- c(3.5, -0.5) w.hi <- 125 w.lo <- 20 ml <- list( mf = model.frame(d ~ obs) , par = beta , likelihood = "halfnorm" , w.lo = w.lo %#% "m" , w.hi = w.hi %#% "m" , expansions = 0 ) class(ml) <- "dfunc" integrateHalfnormLines(ml) # Check: Integral of exp(-x^2/(2*s^2)) from 0 to w.hi-w.lo b <- exp(c(beta[1], beta[1] + beta[2])) intgral <- (pnorm(w.hi, mean=w.lo, sd = b) - 0.5) * sqrt(2*pi)*b intgral# Fake distance function object w/ minimum inputs for integration d <- rep(1,4) %m%. # Only units needed, not values obs <- factor(rep(c("obs1", "obs2"), 2)) beta <- c(3.5, -0.5) w.hi <- 125 w.lo <- 20 ml <- list( mf = model.frame(d ~ obs) , par = beta , likelihood = "halfnorm" , w.lo = w.lo %#% "m" , w.hi = w.hi %#% "m" , expansions = 0 ) class(ml) <- "dfunc" integrateHalfnormLines(ml) # Check: Integral of exp(-x^2/(2*s^2)) from 0 to w.hi-w.lo b <- exp(c(beta[1], beta[1] + beta[2])) intgral <- (pnorm(w.hi, mean=w.lo, sd = b) - 0.5) * sqrt(2*pi)*b intgral
Compute integral of the half-normal distance function for point surveys.
integrateHalfnormPoints( object, newdata = NULL, w.lo = NULL, w.hi = NULL, Units = NULL )integrateHalfnormPoints( object, newdata = NULL, w.lo = NULL, w.hi = NULL, Units = NULL )
object |
Either an Rdistance fitted distance function
(an object that inherits from class "dfunc"; usually produced
by a call to |
newdata |
A data frame containing new values for
the distance function covariates. If NULL and
|
w.lo |
Minimum sighting distance or left-truncation value
if |
w.hi |
Maximum sighting distance or right-truncation value
if |
Units |
Physical units of sighting distances if
|
Returned integrals are
where and is the estimated half-normal
distance function parameter for the i-th observed distance.
A vector of areas under the distance functions represented in
object.
If object is a distance function and
newdata is specified, the returned vector's length is
nrow(newdata). If object is a distance function and
newdata is NULL,
returned vector's length is length(distances(object)). If
object is a matrix, return's length is
nrow(object).
Users will not normally call this function. It is called
internally by nLL() and effectiveDistance().
integrateNumeric(); integrateNegexpPoints();
integrateOneStepPoints()
# Fake distance function object w/ minimum inputs for integration d <- rep(1,4) %m%. # Only units needed, not values obs <- factor(rep(c("obs1", "obs2"), 2)) beta <- c(3.5, -0.5) w.hi <- 125 w.lo <- 20 ml <- list( mf = model.frame(d ~ obs) , par = beta , likelihood = "halfnorm" , w.lo = w.lo %#% "m" , w.hi = w.hi %#% "m" , expansions = 0 ) class(ml) <- "dfunc" integrateHalfnormPoints(ml) # Check: Integral of x exp(-x^2/(2*s^2)) from 0 to w = w.hi-w.lo sigma <- exp(c(beta[1], beta[1] + beta[2])) w <- w.hi - w.lo intgral <- sigma^2 * (1 - exp(-w^2 / (2*sigma^2))) intgral # Effective detection radius sqrt(2 * intgral)# Fake distance function object w/ minimum inputs for integration d <- rep(1,4) %m%. # Only units needed, not values obs <- factor(rep(c("obs1", "obs2"), 2)) beta <- c(3.5, -0.5) w.hi <- 125 w.lo <- 20 ml <- list( mf = model.frame(d ~ obs) , par = beta , likelihood = "halfnorm" , w.lo = w.lo %#% "m" , w.hi = w.hi %#% "m" , expansions = 0 ) class(ml) <- "dfunc" integrateHalfnormPoints(ml) # Check: Integral of x exp(-x^2/(2*s^2)) from 0 to w = w.hi-w.lo sigma <- exp(c(beta[1], beta[1] + beta[2])) w <- w.hi - w.lo intgral <- sigma^2 * (1 - exp(-w^2 / (2*sigma^2))) intgral # Effective detection radius sqrt(2 * intgral)
Compute integral of the hazard-rate distance function for line-transect surveys.
integrateHazrateLines( object, newdata = NULL, w.lo = NULL, w.hi = NULL, Units = NULL )integrateHazrateLines( object, newdata = NULL, w.lo = NULL, w.hi = NULL, Units = NULL )
object |
Either an Rdistance fitted distance function
(an object that inherits from class "dfunc"; usually produced
by a call to |
newdata |
A data frame containing new values for
the distance function covariates. If NULL and
|
w.lo |
Minimum sighting distance or left-truncation value
if |
w.hi |
Maximum sighting distance or right-truncation value
if |
Units |
Physical units of sighting distances if
|
Returned integrals are
where , and are estimated
hazard-rate distance
function parameters for the
i-th observed distance, and is the incomplete gamma
function.
Rdistance uses the
incomplete gamma function implemented in
expint::gammainc(), which for
all intents and purposes is exact.
A vector of areas under the distance functions represented in
object.
If object is a distance function and
newdata is specified, the returned vector's length is
nrow(newdata). If object is a distance function and
newdata is NULL,
returned vector's length is length(distances(object)). If
object is a matrix, return's length is
nrow(object).
Users will not normally call this function. It is called
internally by nLL() and effectiveDistance().
integrateNumeric(); integrateNegexpLines();
integrateOneStepLines()
# A pre-estimated hazard rate distance function: sparrowDfuncObserver fit <- sparrowDfuncObserver table(ESW(fit)) table(integrateHazrateLines(fit)) # Check: Integral of 1 - exp(-(x/s)^(-k)) from 0 to w.hi-w.lo w <- dropUnits(fit$w.hi - fit$w.lo) params <- predict(fit) sigma <- params[,1] minusk <- -params[,2] outArea <- w + sigma * expint::gammainc(1/minusk, (w/sigma)^(minusk)) / minusk table(outArea)# A pre-estimated hazard rate distance function: sparrowDfuncObserver fit <- sparrowDfuncObserver table(ESW(fit)) table(integrateHazrateLines(fit)) # Check: Integral of 1 - exp(-(x/s)^(-k)) from 0 to w.hi-w.lo w <- dropUnits(fit$w.hi - fit$w.lo) params <- predict(fit) sigma <- params[,1] minusk <- -params[,2] outArea <- w + sigma * expint::gammainc(1/minusk, (w/sigma)^(minusk)) / minusk table(outArea)
Compute exact integral of the 'huber' distance function for line transects.
integrateHuberLines( object, newdata = NULL, w.lo = NULL, w.hi = NULL, Units = NULL )integrateHuberLines( object, newdata = NULL, w.lo = NULL, w.hi = NULL, Units = NULL )
object |
Either an Rdistance fitted distance function
(an object that inherits from class "dfunc"; usually produced
by a call to |
newdata |
A data frame containing new values for
the distance function covariates. If NULL and
|
w.lo |
Minimum sighting distance or left-truncation value
if |
w.hi |
Maximum sighting distance or right-truncation value
if |
Units |
Physical units of sighting distances if
|
A vector of areas under the distance functions represented in
object.
If object is a distance function and
newdata is specified, the returned vector's length is
nrow(newdata). If object is a distance function and
newdata is NULL,
returned vector's length is length(distances(object)). If
object is a matrix, return's length is
nrow(object).
Users will not normally call this function. It is called
internally by nLL() and effectiveDistance().
integrateNumeric(); integrateNegexpLines();
integrateHalfnormLines()
w <- 250 T1 <- 160 T2 <- 20 p <- 0.03 obj <- matrix(c(T1,T2,p), 1, 3) integrateHuberLines(obj , w.lo = units::set_units(0,"m") , w.hi = units::set_units(w,"m") , Units = "m") # same huber.cumFunc(w,T1,T2,p,w) # check by numeric integration hubLike <- function(d, T1, T2, p, wl, wh) { y <- huber.like(a = c(log(T1), T2, p) , dist = d - wl , covars = matrix(1, length(d)) , w.hi = wh - wl)$L.unscaled y } integrate(hubLike, lower = 0, upper = w, T1 = T1 , T2 = T2, p = p, wl = 0, wh = w)w <- 250 T1 <- 160 T2 <- 20 p <- 0.03 obj <- matrix(c(T1,T2,p), 1, 3) integrateHuberLines(obj , w.lo = units::set_units(0,"m") , w.hi = units::set_units(w,"m") , Units = "m") # same huber.cumFunc(w,T1,T2,p,w) # check by numeric integration hubLike <- function(d, T1, T2, p, wl, wh) { y <- huber.like(a = c(log(T1), T2, p) , dist = d - wl , covars = matrix(1, length(d)) , w.hi = wh - wl)$L.unscaled y } integrate(hubLike, lower = 0, upper = w, T1 = T1 , T2 = T2, p = p, wl = 0, wh = w)
Check several integration characteristics, and report them to screen. This was designed to print info when user asks for higher verbose level. The scaled distance function should integrate to 1.0 every iteration of maximization, and this function checks that.
integrateKey(ml, key, f0, plot = FALSE)integrateKey(ml, key, f0, plot = FALSE)
ml |
The fitted object or model list |
key |
The scaled distance function. |
f0 |
The value of f(0) when integrating line transects. This should be the ESW for the case. |
plot |
Logical scalar. If TRUE, plot a diagnostic consisting of the distance function and approximating points we use for quadrature. |
Nothing. Prints information on integrals to the screen.
Compute integral of the negative exponential distance function.
integrateNegexpLines( object, newdata = NULL, w.lo = NULL, w.hi = NULL, Units = NULL )integrateNegexpLines( object, newdata = NULL, w.lo = NULL, w.hi = NULL, Units = NULL )
object |
Either an Rdistance fitted distance function
(an object that inherits from class "dfunc"; usually produced
by a call to |
newdata |
A data frame containing new values for
the distance function covariates. If NULL and
|
w.lo |
Minimum sighting distance or left-truncation value
if |
w.hi |
Maximum sighting distance or right-truncation value
if |
Units |
Physical units of sighting distances if
|
Returned integrals are
where and is the estimated
negative exponential distance
function parameter for the
i-th observed distance.
A vector of areas under the distance functions represented in
object.
If object is a distance function and
newdata is specified, the returned vector's length is
nrow(newdata). If object is a distance function and
newdata is NULL,
returned vector's length is length(distances(object)). If
object is a matrix, return's length is
nrow(object).
Users will not normally call this function. It is called
internally by nLL() and effectiveDistance().
integrateNumeric(); integrateNegexpPoints();
integrateOneStepLines()
# Fake distance function object w/ minimum inputs for integration d <- rep(1,4) %#% "m" # Only units needed, not values obs <- factor(rep(c("obs1", "obs2"), 2)) beta <- c(-5, -0.5) w.hi <- 125 w.lo <- 20 ml <- list( mf = model.frame(d ~ obs) , par = beta , likelihood = "negexp" , w.lo = w.lo %#% "m" , w.hi = w.hi %#% "m" , expansions = 0 ) class(ml) <- "dfunc" integrateNegexpLines(ml) # Check: Integral of exp(-bx) from 0 to w.hi-w.lo b <- c(exp(beta[1]), exp(beta[1] + beta[2])) intgral <- (1 - exp(-b*(w.hi - w.lo))) / b intgral# Fake distance function object w/ minimum inputs for integration d <- rep(1,4) %#% "m" # Only units needed, not values obs <- factor(rep(c("obs1", "obs2"), 2)) beta <- c(-5, -0.5) w.hi <- 125 w.lo <- 20 ml <- list( mf = model.frame(d ~ obs) , par = beta , likelihood = "negexp" , w.lo = w.lo %#% "m" , w.hi = w.hi %#% "m" , expansions = 0 ) class(ml) <- "dfunc" integrateNegexpLines(ml) # Check: Integral of exp(-bx) from 0 to w.hi-w.lo b <- c(exp(beta[1]), exp(beta[1] + beta[2])) intgral <- (1 - exp(-b*(w.hi - w.lo))) / b intgral
Compute integral of the negative exponential distance function for point surveys
integrateNegexpPoints( object, newdata = NULL, w.lo = NULL, w.hi = NULL, Units = NULL )integrateNegexpPoints( object, newdata = NULL, w.lo = NULL, w.hi = NULL, Units = NULL )
object |
Either an Rdistance fitted distance function
(an object that inherits from class "dfunc"; usually produced
by a call to |
newdata |
A data frame containing new values for
the distance function covariates. If NULL and
|
w.lo |
Minimum sighting distance or left-truncation value
if |
w.hi |
Maximum sighting distance or right-truncation value
if |
Units |
Physical units of sighting distances if
|
Returned integrals are
where and is the estimated
negative exponential distance
function parameter for the
i-th observed distance.
A vector of areas under the distance functions represented in
object.
If object is a distance function and
newdata is specified, the returned vector's length is
nrow(newdata). If object is a distance function and
newdata is NULL,
returned vector's length is length(distances(object)). If
object is a matrix, return's length is
nrow(object).
Users will not normally call this function. It is called
internally by nLL() and effectiveDistance().
integrateNumeric(); integrateNegexpLines()
# Fake distance function object w/ minimum inputs for integration d <- rep(1,4) %#% "m" # Only units needed, not values obs <- factor(rep(c("obs1", "obs2"), 2)) beta <- c(-5, -0.5) w.hi <- 125 w.lo <- 20 ml <- list( mf = model.frame(d ~ obs) , par = beta , likelihood = "negexp" , w.lo = w.lo %#% "m" , w.hi = w.hi %#% "m" , expansions = 0 ) class(ml) <- "dfunc" integrateNegexpPoints(ml) # Check: Integral of x*exp(-bx) from 0 to w.hi-w.lo b <- c(exp(beta[1]), exp(beta[1] + beta[2])) intgral <- (1 - exp(-b*(w.hi - w.lo)) * (b*(w.hi - w.lo) + 1)) / (b^2) intgral# Fake distance function object w/ minimum inputs for integration d <- rep(1,4) %#% "m" # Only units needed, not values obs <- factor(rep(c("obs1", "obs2"), 2)) beta <- c(-5, -0.5) w.hi <- 125 w.lo <- 20 ml <- list( mf = model.frame(d ~ obs) , par = beta , likelihood = "negexp" , w.lo = w.lo %#% "m" , w.hi = w.hi %#% "m" , expansions = 0 ) class(ml) <- "dfunc" integrateNegexpPoints(ml) # Check: Integral of x*exp(-bx) from 0 to w.hi-w.lo b <- c(exp(beta[1]), exp(beta[1] + beta[2])) intgral <- (1 - exp(-b*(w.hi - w.lo)) * (b*(w.hi - w.lo) + 1)) / (b^2) intgral
Numerically integrate under a distance function.
integrateNumeric( object, newdata = NULL, w.lo = NULL, w.hi = NULL, Units = NULL, expansions = NULL, series = NULL, isPoints = NULL, likelihood = NULL )integrateNumeric( object, newdata = NULL, w.lo = NULL, w.hi = NULL, Units = NULL, expansions = NULL, series = NULL, isPoints = NULL, likelihood = NULL )
object |
Either an Rdistance fitted distance function
(an object that inherits from class "dfunc"; usually produced
by a call to |
newdata |
A data frame containing new values for
the distance function covariates. If NULL and
|
w.lo |
Minimum sighting distance or left-truncation value
if |
w.hi |
Maximum sighting distance or right-truncation value
if |
Units |
Physical units of sighting distances if
|
expansions |
A scalar specifying the number of terms
in |
series |
If |
isPoints |
Boolean. TRUE if integration is for point surveys. FALSE for line-transect surveys. Line-transect surveys integrate under the distance function, g(x), while point surveys integrate under the distance function times distances, xg(x). |
likelihood |
String specifying the likelihood to fit. Built-in likelihoods at present are "halfnorm", "hazrate", and "negexp". |
A vector of areas under the distance functions represented in
object.
If object is a distance function and
newdata is specified, the returned vector's length is
nrow(newdata). If object is a distance function and
newdata is NULL,
returned vector's length is length(distances(object)). If
object is a matrix, return's length is
nrow(object).
Rdistance uses Simpson's composite 1/3 rule to numerically
integrate distance functions from 0 to the maximum sighting distance
(w.hi - w.lo). The number of points evaluated
during numerical integration is controlled by
options(Rdistance_intEvalPts) (default 101).
Option Rdistance_intEvalPts must be odd because Simpson's rule
requires an even number of intervals.
Lower values of Rdistance_intEvalPts increase calculation speeds;
but, decrease accuracy.
Rdistance_intEvalPts must be >= 5. A warning is thrown if
Rdistance_intEvalPts < 29. Empirical tests by the author
suggest Rdistance_intEvalPts values >= 30 are accurate
to several decimal points for smooth distance functions
(e.g., hazrate, halfnorm, negexp)
and that all Rdistance_intEvalPts >= 101 produce
identical results if the distance function is not smooth.
Details: Let n = options(Rdistance_intEvalPts).
Evaluate the distance function at n equal-spaced
locations {f(x0), f(x1), ..., f(xn)} between 0 and (w.hi - w.lo).
Simpson's composite approximation to the area under the curve is
where is the interval size (w.hi - w.lo) / n.
Physical units on the return values
are the original (linear) units if object contains line-transect data
(e.g., [m]), or square of the original units if object contains
point-transect data (e.g., [m^2]). Point-transect units are squared because
the likelihood is the product of the detection function (which is unitless)
and distances (which have units).
Users will not normally call this function. It is called
internally by nLL() and effectiveDistance().
# A halfnorm distance function fit <- dfuncEstim(sparrowDf, dist~1, likelihood = "halfnorm") exact <- integrateHalfnormLines(fit)[1,] # exact area apprx <- integrateNumeric(fit)[1] # Numeric approx pd <- options(digits = 20) cbind(exact, apprx) absDiff <- abs(apprx - exact) absDiff options(pd) # halfnorm approx good to this number of digits round(log10(absDiff),1)# A halfnorm distance function fit <- dfuncEstim(sparrowDf, dist~1, likelihood = "halfnorm") exact <- integrateHalfnormLines(fit)[1,] # exact area apprx <- integrateNumeric(fit)[1] # Numeric approx pd <- options(digits = 20) cbind(exact, apprx) absDiff <- abs(apprx - exact) absDiff options(pd) # halfnorm approx good to this number of digits round(log10(absDiff),1)
Compute exact integral of the one-step distance function for line transects.
integrateOneStepLines(object, newdata = NULL, Units = NULL)integrateOneStepLines(object, newdata = NULL, Units = NULL)
object |
Either an Rdistance fitted distance function
(an object that inherits from class "dfunc"; usually produced
by a call to |
newdata |
A data frame containing new values for
the distance function covariates. If NULL and
|
Units |
Physical units of sighting distances if
|
Returned integrals are
where , is the estimated one-step
distance function
threshold for the i-th observed distance, and is the estimated
one-step proportion.
A vector of areas under the distance functions represented in
object.
If object is a distance function and
newdata is specified, the returned vector's length is
nrow(newdata). If object is a distance function and
newdata is NULL,
returned vector's length is length(distances(object)). If
object is a matrix, return's length is
nrow(object).
Users will not normally call this function. It is called
internally by nLL() and effectiveDistance().
integrateNumeric(); integrateNegexpLines();
integrateHalfnormLines()
# A oneStep distance function on simulated data whi <- 250 T <- 100 # true threshold p <- 0.85 # true proportion <T n <- 200 # num simulated points x <- c( runif(n*p, min=0, max=T), runif(n*(1-p), min=T, max=whi)) x <- setUnits(x, "m") tranID <- sample(rep(1:10, each=n/10), replace=FALSE) detectDf <- data.frame(transect = tranID, dist = x) siteDf <- data.frame(transect = 1:10 , length = rep(setUnits(10,"m"), 10)) distDf <- RdistDf(siteDf, detectDf) # Estimation fit <- dfuncEstim(distDf , formula = dist ~ 1 , likelihood = "oneStep" , w.hi = setUnits(whi, "m") ) table(integrateOneStepLines(fit)) table(ESW(fit)) # Check: T.hat <- exp(fit$par[1]) p.hat <- fit$par[2] gAtT <- ((1-p.hat) * T.hat) / (p.hat * (whi - T.hat)) plot(fit) abline(h = gAtT, col="blue") areaLE.T <- (1.0) * T.hat areaGT.T <- gAtT * (whi - T.hat) areaLE.T + areaGT.T # ESW # Equivalent T.hat / p.hat# A oneStep distance function on simulated data whi <- 250 T <- 100 # true threshold p <- 0.85 # true proportion <T n <- 200 # num simulated points x <- c( runif(n*p, min=0, max=T), runif(n*(1-p), min=T, max=whi)) x <- setUnits(x, "m") tranID <- sample(rep(1:10, each=n/10), replace=FALSE) detectDf <- data.frame(transect = tranID, dist = x) siteDf <- data.frame(transect = 1:10 , length = rep(setUnits(10,"m"), 10)) distDf <- RdistDf(siteDf, detectDf) # Estimation fit <- dfuncEstim(distDf , formula = dist ~ 1 , likelihood = "oneStep" , w.hi = setUnits(whi, "m") ) table(integrateOneStepLines(fit)) table(ESW(fit)) # Check: T.hat <- exp(fit$par[1]) p.hat <- fit$par[2] gAtT <- ((1-p.hat) * T.hat) / (p.hat * (whi - T.hat)) plot(fit) abline(h = gAtT, col="blue") areaLE.T <- (1.0) * T.hat areaGT.T <- gAtT * (whi - T.hat) areaLE.T + areaGT.T # ESW # Equivalent T.hat / p.hat
Compute integral of the one-step distance function using numeric integration. This function is only called for oneStep functions that contain expansion factors.
integrateOneStepNumeric( object, newdata = NULL, w.lo = NULL, w.hi = NULL, Units = NULL, expansions = NULL, series = NULL, isPoints = NULL )integrateOneStepNumeric( object, newdata = NULL, w.lo = NULL, w.hi = NULL, Units = NULL, expansions = NULL, series = NULL, isPoints = NULL )
object |
Either an Rdistance fitted distance function
(an object that inherits from class "dfunc"; usually produced
by a call to |
newdata |
A data frame containing new values for
the distance function covariates. If NULL and
|
w.lo |
Minimum sighting distance or left-truncation value
if |
w.hi |
Maximum sighting distance or right-truncation value
if |
Units |
Physical units of sighting distances if
|
expansions |
A scalar specifying the number of terms
in |
series |
If |
isPoints |
Boolean. TRUE if integration is for point surveys. FALSE for line-transect surveys. Line-transect surveys integrate under the distance function, g(x), while point surveys integrate under the distance function times distances, xg(x). |
The oneStep.like() function has an extremely large
discontinuity at Theta. Accurate numeric integration requires
inserting Theta and Theta+ (a value just larger than Theta)
into the series of points being evaluated. Because this creates
variable sized intervals, the Trapezoid rule must be used.
Rdistance's Simpson's rule routine
(integrateNumeric()) will not work for oneStep likelihoods
that have expansions.
A vector of areas under the distance functions represented in
object.
If object is a distance function and
newdata is specified, the returned vector's length is
nrow(newdata). If object is a distance function and
newdata is NULL,
returned vector's length is length(distances(object)). If
object is a matrix, return's length is
nrow(object).
Users will not normally call this function. It is called
internally by nLL() and effectiveDistance().
integrateNumeric();
integrateOneStepLines(); integrateOneStepPoints()
Compute integral of the one-step distance function for point-surveys.
integrateOneStepPoints( object, newdata = NULL, w.lo = NULL, w.hi = NULL, Units = NULL )integrateOneStepPoints( object, newdata = NULL, w.lo = NULL, w.hi = NULL, Units = NULL )
object |
Either an Rdistance fitted distance function
(an object that inherits from class "dfunc"; usually produced
by a call to |
newdata |
A data frame containing new values for
the distance function covariates. If NULL and
|
w.lo |
Minimum sighting distance or left-truncation value
if |
w.hi |
Maximum sighting distance or right-truncation value
if |
Units |
Physical units of sighting distances if
|
Returned integrals are
where , is the estimated one-step
distance function
threshold for the i-th observed distance, and is the estimated
one-step proportion.
A vector of areas under the distance functions represented in
object.
If object is a distance function and
newdata is specified, the returned vector's length is
nrow(newdata). If object is a distance function and
newdata is NULL,
returned vector's length is length(distances(object)). If
object is a matrix, return's length is
nrow(object).
Users will not normally call this function. It is called
internally by nLL() and effectiveDistance().
integrateNumeric(); integrateOneStepNumeric();
integrateOneStepLines()
fit <- dfuncEstim(thrasherDf, dist~1, likelihood = "oneStep") integrateOneStepPoints(fit, newdata = data.frame(`(Intercept)`=1)) EDR(fit, newdata = data.frame(`(Intercept)`=1)) # Check: Theta <- exp(fit$par[1]) Theta <- setUnits(Theta, "m") p <- fit$par[2] w.hi <- fit$w.hi w.lo <- fit$w.lo g.at0 <- w.lo g.atT <- Theta g.atTPlusFuzz <- (((1-p) * Theta) / ((w.hi - Theta) * p))*Theta g.atWhi <- (((1-p) * Theta) / ((w.hi - Theta) * p))*w.hi area.0.to.T <- (Theta - w.lo) * (g.atT - g.at0) / 2 # triangle; Theta^2/2 area.T.to.w <- (w.hi - Theta) * (g.atTPlusFuzz + g.atWhi) / 2 # trapazoid area <- area.0.to.T + area.T.to.w edr <- sqrt( 2*area )fit <- dfuncEstim(thrasherDf, dist~1, likelihood = "oneStep") integrateOneStepPoints(fit, newdata = data.frame(`(Intercept)`=1)) EDR(fit, newdata = data.frame(`(Intercept)`=1)) # Check: Theta <- exp(fit$par[1]) Theta <- setUnits(Theta, "m") p <- fit$par[2] w.hi <- fit$w.hi w.lo <- fit$w.lo g.at0 <- w.lo g.atT <- Theta g.atTPlusFuzz <- (((1-p) * Theta) / ((w.hi - Theta) * p))*Theta g.atWhi <- (((1-p) * Theta) / ((w.hi - Theta) * p))*w.hi area.0.to.T <- (Theta - w.lo) * (g.atT - g.at0) / 2 # triangle; Theta^2/2 area.T.to.w <- (w.hi - Theta) * (g.atTPlusFuzz + g.atWhi) / 2 # trapazoid area <- area.0.to.T + area.T.to.w edr <- sqrt( 2*area )
Compute exact integral of the triangle distance function for line transects.
integrateTriangleLines( object, newdata = NULL, w.lo = NULL, w.hi = NULL, Units = NULL )integrateTriangleLines( object, newdata = NULL, w.lo = NULL, w.hi = NULL, Units = NULL )
object |
Either an Rdistance fitted distance function
(an object that inherits from class "dfunc"; usually produced
by a call to |
newdata |
A data frame containing new values for
the distance function covariates. If NULL and
|
w.lo |
Minimum sighting distance or left-truncation value
if |
w.hi |
Maximum sighting distance or right-truncation value
if |
Units |
Physical units of sighting distances if
|
Returned integrals are
where , is the estimated one-step
distance function
threshold for the i-th observed distance, and is the estimated
one-step proportion.
A vector of areas under the distance functions represented in
object.
If object is a distance function and
newdata is specified, the returned vector's length is
nrow(newdata). If object is a distance function and
newdata is NULL,
returned vector's length is length(distances(object)). If
object is a matrix, return's length is
nrow(object).
Users will not normally call this function. It is called
internally by nLL() and effectiveDistance().
integrateNumeric(); integrateNegexpLines();
integrateHalfnormLines()
w <- 250 T <- 160 p <- 0.4 obj <- matrix(c(T,p), 1, 2) integrateTriangleLines(obj , w.lo = units::set_units(0,"m") , w.hi = units::set_units(w,"m") , Units = "m") # same (1 + p) * T / 2 + p * (w - T) # check by numeric integration triLike <- function(d, T, p, wl, wh) { y <- triangle.like(a = c(log(T), p) , dist = d - wl , covars = matrix(1, length(d)) , w.hi = wh - wl)$L.unscaled y } integrate(triLike, lower = 0, upper = w, T = T, p = p, wl = 0, wh = w)w <- 250 T <- 160 p <- 0.4 obj <- matrix(c(T,p), 1, 2) integrateTriangleLines(obj , w.lo = units::set_units(0,"m") , w.hi = units::set_units(w,"m") , Units = "m") # same (1 + p) * T / 2 + p * (w - T) # check by numeric integration triLike <- function(d, T, p, wl, wh) { y <- triangle.like(a = c(log(T), p) , dist = d - wl , covars = matrix(1, length(d)) , w.hi = wh - wl)$L.unscaled y } integrate(triLike, lower = 0, upper = w, T = T, p = p, wl = 0, wh = w)
Utility function to detect whether a distance function has covariates beyond the intercept. If the model contains an intercept-only, effective distance is constant across detections and short-cuts can be implemented in code.
intercept.only(object)intercept.only(object)
object |
An Rdistance model frame or fitted distance function,
normally produced by a call to |
TRUE if object contains an intercept-only.
FALSE if object contains at least one detection-level
or transect-level covariate in the detection function.
Determines whether a distance function is for a point survey or line survey.
is.points(x)is.points(x)
x |
Either an estimated distance function, output by
|
TRUE if the model frame or fitted distance function contains point surveys. FALSE if the model frame or distance function contains line transect surveys.
Checks the validity of Rdistance nested data frames.
Rdistance data frames
are a particular implementation of rowwise tibbles
that contain detections in a list column, and extra attributes
specifying types.
is.RdistDf(df, verbose = FALSE)is.RdistDf(df, verbose = FALSE)
df |
A data frame to check |
verbose |
If TRUE, an explanation of the check that fails is printed. Otherwise, no information on checks is provided. |
The following checks are performed (in this order):
attr(df, "detectionColumn") exists and points to a valid
list-based column in the data frame.
attr(df, "obsType") exists and is one of the valid values.
attr(df, "transType") exists and is one of the valid values.
attr(df, "effortColumn") exists and points to a column in the
data frame with 1 dimensional units attached.
The data frame is either a 'rowwise_df' or 'grouped_df'
tibble.
The data frame has only one row per group. One row per group
is implied by 'rowwise_df', but not a 'grouped_df', and both are allowed
in Rdistance. One row per group ensures rows are uniquely identified
and hence represents one transect.
No column names in the list-column are duplicated in the non-list
columns of the data frame. This check ensures that tidyr::unnest
executes.
Other data checks, e.g., for measurement units, are performed
later in dfuncEstim(), after the model is specified.
TRUE or FALSE invisibly. TRUE means all checks passed. FALSE implies
at least one check failed. Use verbose = TRUE to see which.
data(sparrowDf) is.RdistDf(sparrowDf) # Data frame is okay, but no attributes data(sparrowDetectionData) data(sparrowSiteData) sparrowDf <- sparrowDetectionData |> dplyr::nest_by( siteID , .key = "distances") |> dplyr::right_join(sparrowSiteData, by = "siteID") is.RdistDf(sparrowDf, verbose = TRUE) # Manually add required attributes attr(sparrowDf, "detectionColumn") <- "distances" attr(sparrowDf, "effortColumn") <- "length" attr(sparrowDf, "obsType") <- "single" attr(sparrowDf, "transType" ) <- "line" is.RdistDf(sparrowDf, verbose = TRUE)data(sparrowDf) is.RdistDf(sparrowDf) # Data frame is okay, but no attributes data(sparrowDetectionData) data(sparrowSiteData) sparrowDf <- sparrowDetectionData |> dplyr::nest_by( siteID , .key = "distances") |> dplyr::right_join(sparrowSiteData, by = "siteID") is.RdistDf(sparrowDf, verbose = TRUE) # Manually add required attributes attr(sparrowDf, "detectionColumn") <- "distances" attr(sparrowDf, "effortColumn") <- "length" attr(sparrowDf, "obsType") <- "single" attr(sparrowDf, "transType" ) <- "line" is.RdistDf(sparrowDf, verbose = TRUE)
Determines whether a distance function is a non-parametric smooth or classic parameterized function.
is.smoothed(object)is.smoothed(object)
object |
An Rdistance model frame or fitted distance function,
normally produced by a call to |
TRUE if the model frame or fitted distance function arises from a non-parametric density smoother. FALSE if the model frame or distance function is a parameterized function.
Tests whether a 'units' object is actually
unitless.
Unitless objects, such as ratios, should be assigned
units of [1]. Often they are, but
sometimes unitless ratios are assigned units like [m/m].
The units package should always convert [m/m] to
[1], but it does not always.
Sometimes units like [m/m] mess things up, so it is
better to remove them before calculations.
is.Unitless(obj)is.Unitless(obj)
obj |
A numeric scalar or vector, with or without units. |
TRUE if obj has units and they
are either [1] or the denominator units equal
the numerator units. Otherwise, return FALSE.
If obj does not have units, this routine
returns TRUE.
a <- setUnits(2, "m") b <- a / a is.Unitless(a) is.Unitless(b) is.Unitless(3)a <- setUnits(2, "m") b <- a / a is.Unitless(a) is.Unitless(b) is.Unitless(3)
Returns names of the likelihood parameters. This is a helper function and is not necessary for estimation. It is nice to label some parameters with descriptive names like "sigma" or "k", depending on the likelihood.
likeParamNames(like.form)likeParamNames(like.form)
like.form |
A text string naming the form of the likelihood. An error is thrown if the likelihood is unknown. |
A vector of parameter names for the likelihood
Line plot method for objects of class 'dfunc'
that adds distance functions to an existing plot.
## S3 method for class 'dfunc' lines(x, newdata = NULL, prob = NULL, ...)## S3 method for class 'dfunc' lines(x, newdata = NULL, prob = NULL, ...)
x |
An estimated detection function object, normally
produced by calling |
newdata |
A data frame containing new values of
the covariates at which to evaluate the distance functions.
If |
prob |
Logical scalar for whether to scale the distance function
to be a density function (integrates to one). Default behavior is designed
to be compatible with the plot method for distance functions
( |
... |
Parameters passed to |
A data frame containing the x and y coordinates of the
plotted line(s) is returned invisibly. X coordinates in the
return are names x. Y coordinates in the return are named
y1, y2, ..., yn, i.e., one column per returned
distance function.
dfuncEstim(), plot.dfunc(),
print.abund()
# a simulated RdistDf set.seed(87654) x <- rnorm(1000, mean=0, sd=20) x <- x[x >= 0] x <- setUnits(x, "mi") Df <- data.frame(transectID = "A" , distance = x ) |> dplyr::nest_by( transectID , .key = "detections") |> dplyr::mutate(length = setUnits(100,"km")) attr(Df, "detectionColumn") <- "detections" attr(Df, "obsType") <- "single" attr(Df, "transType") <- "line" attr(Df,'effortColumn') <- "length" is.RdistDf(Df) # TRUE dfunc <- Df |> dfuncEstim(distance ~ 1, likelihood="halfnorm") plot(dfunc, nbins = 40, col="lightgrey", border=NA, vertLines=FALSE) lines(dfunc, col="grey30", lwd=15) lines(dfunc, col="grey90", lwd=5, lty = 2) # Multiple lines data(sparrowDfuncObserver) obsLevs <- levels(sparrowDfuncObserver$data$observer) plot(sparrowDfuncObserver , vertLines = FALSE , lty = 0 , plotBars = FALSE , main="Detection by observer" , legend = FALSE) y <- lines(sparrowDfuncObserver , newdata = data.frame(observer = obsLevs) , col = palette.colors(length(obsLevs)) , lty = 1 , lwd = 4) head(y) # values returned, with distances as column# a simulated RdistDf set.seed(87654) x <- rnorm(1000, mean=0, sd=20) x <- x[x >= 0] x <- setUnits(x, "mi") Df <- data.frame(transectID = "A" , distance = x ) |> dplyr::nest_by( transectID , .key = "detections") |> dplyr::mutate(length = setUnits(100,"km")) attr(Df, "detectionColumn") <- "detections" attr(Df, "obsType") <- "single" attr(Df, "transType") <- "line" attr(Df,'effortColumn') <- "length" is.RdistDf(Df) # TRUE dfunc <- Df |> dfuncEstim(distance ~ 1, likelihood="halfnorm") plot(dfunc, nbins = 40, col="lightgrey", border=NA, vertLines=FALSE) lines(dfunc, col="grey30", lwd=15) lines(dfunc, col="grey90", lwd=5, lty = 2) # Multiple lines data(sparrowDfuncObserver) obsLevs <- levels(sparrowDfuncObserver$data$observer) plot(sparrowDfuncObserver , vertLines = FALSE , lty = 0 , plotBars = FALSE , main="Detection by observer" , legend = FALSE) y <- lines(sparrowDfuncObserver , newdata = data.frame(observer = obsLevs) , col = palette.colors(length(obsLevs)) , lty = 1 , lwd = 4) head(y) # values returned, with distances as column
Find the x coordinate that maximizes g(x).
maximize.g(fit, covars = NULL)maximize.g(fit, covars = NULL)
fit |
An estimated 'dfunc' object produced by |
covars |
Covariate values to calculate g(x). |
The value of x that maximizes g(x) in fit.
## Not run: # Fake data set.seed(22223333) x <- rgamma(100, 10, 1) fit <- dfuncEstim( x, likelihood="Gamma", x.scl="max" ) maximize.g( fit ) # should be near 10. fit$x.scl # same thing ## End(Not run)## Not run: # Fake data set.seed(22223333) x <- rgamma(100, 10, 1) fit <- dfuncEstim( x, likelihood="Gamma", x.scl="max" ) maximize.g( fit ) # should be near 10. fit$x.scl # same thing ## End(Not run)
Estimate parameters of a distance function using maximum likelihood.
mlEstimates(ml, strt.lims)mlEstimates(ml, strt.lims)
ml |
Either a Rdistance 'model frame' or an Rdistance
'fitted object'. Both are of class "dfunc".
Rdistance 'model frames' are lists containing components
necessary to estimate a distance function, but no estimates.
Rdistance 'model frames' are typically
produced by calls to |
strt.lims |
A list containing start, low, and high limits for
parameters of the requested likelihood. This list is typically produced
by a call to |
An Rdistance fitted model object. This object contains the
raw object returned by the optimization routine (e.g., nlming),
and additional components specific to Rdistance.
Extract the model matrix ("X" matrix) from an Rdistance model object.
## S3 method for class 'dfunc' model.matrix(object, ...)## S3 method for class 'dfunc' model.matrix(object, ...)
object |
An Rdistance model frame or fitted distance function,
normally produced by a call to |
... |
Ignored |
A matrix containing covariates for fitting an Rdistance model.
data(sparrowDf) sparrowModel <- parseModel( sparrowDf, dist ~ observer ) model.matrix(sparrowModel)data(sparrowDf) sparrowModel <- parseModel( sparrowDf, dist ~ observer ) model.matrix(sparrowModel)
Return number of covariates in a distance model
nCovars(X)nCovars(X)
X |
The X matrix of covariates, or a vector. |
The reason this routine is needed is that sometimes we pass one row of covariates to a likelihood function. If so, it may come in as a normal vector, not a matrix. If a normal vector, ncol(X) does not work.
An integer scalar
Computes the negative exponential distance function.
negexp.like(a, dist, covars, w.hi = NULL)negexp.like(a, dist, covars, w.hi = NULL)
a |
A vector or matrix of covariate
and expansion term
coefficients. If matrix, dimension is
k X p, where
k = |
dist |
A numeric vector of length n or a single-column matrix (dimension nX1) containing detection distances at which to evaluate the likelihood. |
covars |
A numeric vector of length q or a
matrix of dimension nXq containing
covariate values
associated with distances in argument |
w.hi |
A numeric scalar containing maximum distance. The right-hand cutoff or upper limit. Ignored by some likelihoods (such as halfnorm, negexp, and hazrate), but is a fixed parameter in other likelihoods (such as oneStep and uniform). |
The negative exponential likelihood is
where is the
slope parameter.
A list containing the following two components:
L.unscaled: A matrix of size
nXk (n = length dist; k = number of
cases = nrow(a))
containing likelihood values evaluated at
distances in dist.
Each row is associated with
a single distance, and each column is associated with
a single case (row of a). Values in this matrix are
the distance function which generally have .
These values are "unscaled" likelihood values; they must be
scaled (divided by) with the area under between w.lo and w.hi
to form proper likelihood values.
params: A nXbXk array
of the likelihood's (canonical) parameters in link space (i.e., on
log scale; b = number of canonical parameters in
the likelihood; k = number of cases).
Rows correspond to distances in dist. Columns
correspond to parameters (columns of a),
and pages correspond to cases (rows of a).
dfuncEstim(),
abundEstim(),
other <likelihood>.like functions
d <- seq(0, 100, length=100) covs <- matrix(1,length(d),1) negexp.like(log(0.01), d, covs) # Changing slope parameter plot(d, negexp.like(log(0.1), d, covs)$L.unscaled, type="l", col="red") lines(d, negexp.like(log(0.05), d, covs)$L.unscaled, col="blue")d <- seq(0, 100, length=100) covs <- matrix(1,length(d),1) negexp.like(log(0.01), d, covs) # Changing slope parameter plot(d, negexp.like(log(0.1), d, covs)$L.unscaled, type="l", col="red") lines(d, negexp.like(log(0.05), d, covs)$L.unscaled, col="blue")
Compute starting values and limits for the negative exponential distance function.
negexp.start.limits(ml)negexp.start.limits(ml)
ml |
Either a Rdistance 'model frame' or an Rdistance
'fitted object'. Both are of class "dfunc".
Rdistance 'model frames' are lists containing components
necessary to estimate a distance function, but no estimates.
Rdistance 'model frames' are typically
produced by calls to |
A list containing the following components
start |
Vector of starting values for parameters of the likelihood and expansion terms. |
lowlimit |
Vector of lower limits for the likelihood parameters and expansion terms. |
uplimit |
Vector of upper limits for the likelihood parameters and expansion terms. |
names |
Vector of names for the likelihood parameters and expansion terms. |
The length of each vector in the return is:
(Num expansions) + 1 + 1*(like %in% c("hazrate")) + (Num Covars).
Return the negative log likelihood of observed detection distances given a likelihood and the estimated parameters.
nLL(a, ml, verbosity = 0)nLL(a, ml, verbosity = 0)
a |
A vector of likelihood parameter values. Length and
meaning depend on |
ml |
Either a Rdistance 'model frame' or an Rdistance
'fitted object'. Both are of class "dfunc".
Rdistance 'model frames' are lists containing components
necessary to estimate a distance function, but no estimates.
Rdistance 'model frames' are typically
produced by calls to |
verbosity |
The level of output produced during estimation.
|
Expansion Terms: If ml$expansions = k (k > 0),
the expansion function specified by ml$series is
called (see for example cosine.expansion()).
Assuming is the
expansion term for the
distance and that
are (estimated) coefficients for the expansion terms,
the likelihood contribution for the
distance is,
A scalar, the negative of the log likelihood evaluated at
parameters a.
See halfnorm.like() and links there;
dfuncEstim()
# A halfnorm distance function fit <- dfuncEstim(sparrowDf, dist~1, likelihood = "halfnorm") nLL(fit$par, fit, 3) fit$loglik ESW(fit)[1] # Another way, b/c we have pnorm() d <- distances(fit) ones <- matrix(1, nrow = length(d), ncol = 1) l <- halfnorm.like(fit$par, d, ones) esw <-(pnorm(units::drop_units(fit$w.hi) , units::drop_units(fit$w.lo) , sd = exp(l$params)) - 0.5) * sqrt(2*pi) * exp(l$params) -sum(log(l$L.unscaled/esw)) # A third way, b/c we have pnorm() and dnorm(). l2 <- dnorm(units::drop_units(d), mean = 0, sd = exp(fit$par)) scaler <- pnorm(units::drop_units(fit$w.hi), mean = 0, sd = exp(fit$par)) - 0.5 -sum(log(l2/scaler))# A halfnorm distance function fit <- dfuncEstim(sparrowDf, dist~1, likelihood = "halfnorm") nLL(fit$par, fit, 3) fit$loglik ESW(fit)[1] # Another way, b/c we have pnorm() d <- distances(fit) ones <- matrix(1, nrow = length(d), ncol = 1) l <- halfnorm.like(fit$par, d, ones) esw <-(pnorm(units::drop_units(fit$w.hi) , units::drop_units(fit$w.lo) , sd = exp(l$params)) - 0.5) * sqrt(2*pi) * exp(l$params) -sum(log(l$L.unscaled/esw)) # A third way, b/c we have pnorm() and dnorm(). l2 <- dnorm(units::drop_units(d), mean = 0, sd = exp(fit$par)) scaler <- pnorm(units::drop_units(fit$w.hi), mean = 0, sd = exp(fit$par)) - 0.5 -sum(log(l2/scaler))
Call R native function 'nlminb' to perform optimization.
Nlminb(ml, strt.lims)Nlminb(ml, strt.lims)
ml |
Either a Rdistance 'model frame' or an Rdistance
'fitted object'. Both are of class "dfunc".
Rdistance 'model frames' are lists containing components
necessary to estimate a distance function, but no estimates.
Rdistance 'model frames' are typically
produced by calls to |
strt.lims |
A list containing start, low, and high limits for
parameters of the requested likelihood. This list is typically produced
by a call to |
A list with following named components:
par = parameters
loglik = objective function value at minimum
convergence = 0 for yes, other for no
iterations = number of iterations
evaluations = function evaluations
message = a convergence message
varcovar = a variance covariance matrix of parameters
limits = low and high limits
Return the type of observations (single or multiple observers) represented in either a fitted distance function or Rdistance data frame.
observationType(x)observationType(x)
x |
Either an estimated distance function, output by
|
This function is a simple helper function. If x is an
estimated distance object, it polls the obsType
attribute of the object's Rdistance data frame. If x is an Rdistance nested data frame, it
polls the obsType attribute.
One of the following values: "single", "1given2", "2given1", or "both". If observation type has not been assigned, return is NULL.
Performs density and abundance estimation for one bootstrap iteration.
oneBsIter( object, area, propUnitSurveyed, pb, plot.bs, plotCovValues, warn = FALSE, asymptoticSE = FALSE )oneBsIter( object, area, propUnitSurveyed, pb, plot.bs, plotCovValues, warn = FALSE, asymptoticSE = FALSE )
object |
A fitted Rdistance distance function (class |
area |
A scalar containing the total area of inference. Usually, this is
study area size. If |
propUnitSurveyed |
A scalar or vector of real numbers between 0 and 1.
The proportion of the default sampling unit that
was surveyed. If both sides of line transects were observed,
|
pb |
A progress bar created with |
plot.bs |
Logical. Whether to plot bootstrap estimate of detection function.
A plot must already exist because this uses |
plotCovValues |
Data frame containing values of covariates to plot.
Ignored if |
warn |
A logical scalar specifying whether to issue
an R warning if the estimation did not converge or if one
or more parameter estimates are at their boundaries.
For estimation, |
asymptoticSE |
Logical variable for whether to calculate
asymptotic standard errors. The default (TRUE) estimates an
asymptotic variance-covariance matrix for parameters based on the
likelihood's Hessian (2nd derivative). If maximization
has been performed by Nlminb or HookesJeeves, the asymptotic
Hessian is estimated using numeric second derivatives
of the likelihood at the maximum likelihood solution. If
maximization was performed by Optim, the last Hessian of
the maximization is returned
by Optim and used
(see |
A data frame containing density and abundance and other relevant statistics for one iteration of the bootstrap.
Compute likelihood function for a mixture of two uniform distributions.
oneStep.like(a, dist, covars, w.hi = NULL)oneStep.like(a, dist, covars, w.hi = NULL)
a |
A vector or matrix of covariate
and expansion term
coefficients. If matrix, dimension is
k X p, where
k = |
dist |
A numeric vector of length n or a single-column matrix (dimension nX1) containing detection distances at which to evaluate the likelihood. |
covars |
A numeric vector of length q or a
matrix of dimension nXq containing
covariate values
associated with distances in argument |
w.hi |
A numeric scalar containing maximum distance. The right-hand cutoff or upper limit. Ignored by some likelihoods (such as halfnorm, negexp, and hazrate), but is a fixed parameter in other likelihoods (such as oneStep and uniform). |
Rdistance's oneStep likelihood is a mixture of two
non-overlapping uniform distributions. The 'oneStep' density function
is
where is the indicator function for event ,
and is the nominal strip width (i.e., w.hi - w.lo in Rdistance).
The unknown parameters to be estimated
are and
( is fixed - given by the user).
Covariates influence values of
via a log link function, i.e., ,
where is the vector of covariate values
associated with distance , and
is the vector of estimated coefficients.
A list containing the following two components:
L.unscaled: A matrix of size
nXk (n = length dist; k = number of
cases = nrow(a))
containing likelihood values evaluated at
distances in dist.
Each row is associated with
a single distance, and each column is associated with
a single case (row of a). Values in this matrix are
the distance function which generally have .
These values are "unscaled" likelihood values; they must be
scaled (divided by) with the area under between w.lo and w.hi
to form proper likelihood values.
params: A nXbXk array
of the likelihood's (canonical) parameters in link space (i.e., on
log scale; b = number of canonical parameters in
the likelihood; k = number of cases).
Rows correspond to distances in dist. Columns
correspond to parameters (columns of a),
and pages correspond to cases (rows of a).
Peter F. Craigmile & D.M. Tirrerington (1997) "Parameter estimation for finite mixtures of uniform distributions", Communications in Statistics - Theory and Methods, 26:8, 1981-1995, DOI: 10.1080/03610929708832026
A. Hussein & J. Liu (2009) "Parametric estimation of mixtures of two uniform distributions", Journal of Statistical Computation and Simulation, 79:4, 395-410, DOI:10.1080/00949650701810406
dfuncEstim(),
abundEstim(),
other <likelihood>.like functions
# Fit oneStep to simulated data whi <- 250 T <- 100 # true threshold p <- 0.85 n <- 200 x <- c( runif(n*p, min=0, max=T), runif(n*(1-p), min=T, max=whi)) x <- setUnits(x, "m") tranID <- sample(rep(1:10, each=n/10), replace=FALSE) detectDf <- data.frame(transect = tranID, dist = x) siteDf <- data.frame(transect = 1:10 , length = rep(setUnits(10,"m"), 10)) distDf <- RdistDf(siteDf, detectDf) # Estimation fit <- dfuncEstim(distDf , formula = dist ~ 1 , likelihood = "oneStep" , w.hi = setUnits(whi, "m") ) plot(fit) thetaHat <- exp(coef(fit)[1]) pHat <- coef(fit)[2] c(thetaHat, pHat) # should be close to c(100,0.85) summary(abundEstim(fit, ci=NULL))# Fit oneStep to simulated data whi <- 250 T <- 100 # true threshold p <- 0.85 n <- 200 x <- c( runif(n*p, min=0, max=T), runif(n*(1-p), min=T, max=whi)) x <- setUnits(x, "m") tranID <- sample(rep(1:10, each=n/10), replace=FALSE) detectDf <- data.frame(transect = tranID, dist = x) siteDf <- data.frame(transect = 1:10 , length = rep(setUnits(10,"m"), 10)) distDf <- RdistDf(siteDf, detectDf) # Estimation fit <- dfuncEstim(distDf , formula = dist ~ 1 , likelihood = "oneStep" , w.hi = setUnits(whi, "m") ) plot(fit) thetaHat <- exp(coef(fit)[1]) pHat <- coef(fit)[2] c(thetaHat, pHat) # should be close to c(100,0.85) summary(abundEstim(fit, ci=NULL))
Compute starting values and limits for the oneStep distance function.
oneStep.start.limits(ml)oneStep.start.limits(ml)
ml |
Either a Rdistance 'model frame' or an Rdistance
'fitted object'. Both are of class "dfunc".
Rdistance 'model frames' are lists containing components
necessary to estimate a distance function, but no estimates.
Rdistance 'model frames' are typically
produced by calls to |
A list containing the following components
start |
Vector of starting values for parameters of the likelihood and expansion terms. |
lowlimit |
Vector of lower limits for the likelihood parameters and expansion terms. |
uplimit |
Vector of upper limits for the likelihood parameters and expansion terms. |
names |
Vector of names for the likelihood parameters and expansion terms. |
The length of each vector in the return is:
(Num expansions) + 1 + 1*(like %in% c("hazrate")) + (Num Covars).
# make 'model list' object # Boundary is 10, p is 100 / 120 = 0.833 library(Rdistance) whi <- 50 x <- c( runif(100, min=0, max=10), runif(20, min=10, max=whi)) x <- setUnits(x, "m") detectDf <- data.frame(transect = 1, dist = x) siteDf <- data.frame(transect = 1, length = setUnits(10,"m")) distDf <- RdistDf(siteDf, detectDf) ml <- parseModel(distDf , formula = dist ~ 1 , w.lo = 0 , w.hi = setUnits(whi, "m") ) sl <- oneStep.start.limits(ml) hist(x, n = 20) abline(v = exp(sl$start["(Intercept)"]))# make 'model list' object # Boundary is 10, p is 100 / 120 = 0.833 library(Rdistance) whi <- 50 x <- c( runif(100, min=0, max=10), runif(20, min=10, max=whi)) x <- setUnits(x, "m") detectDf <- data.frame(transect = 1, dist = x) siteDf <- data.frame(transect = 1, length = setUnits(10,"m")) distDf <- RdistDf(siteDf, detectDf) ml <- parseModel(distDf , formula = dist ~ 1 , w.lo = 0 , w.hi = setUnits(whi, "m") ) sl <- oneStep.start.limits(ml) hist(x, n = 20) abline(v = exp(sl$start["(Intercept)"]))
Call R native function 'optim' to perform optimization.
Optim(ml, strt.lims)Optim(ml, strt.lims)
ml |
Either a Rdistance 'model frame' or an Rdistance
'fitted object'. Both are of class "dfunc".
Rdistance 'model frames' are lists containing components
necessary to estimate a distance function, but no estimates.
Rdistance 'model frames' are typically
produced by calls to |
strt.lims |
A list containing start, low, and high limits for
parameters of the requested likelihood. This list is typically produced
by a call to |
A list with following named components:
par = parameters
loglik = objective function value at minimum
convergence = 0 for yes, other for no
iterations = number of iterations
evaluations = function evaluations
message = a convergence message
varcovar = a variance covariance matrix of parameters
limits = low and high limits
Parse an 'Rdistance' formula and produce a list containing all model parameters. This routine is not normally called directly by the user, but it might be helpful in simulations. It is called internally from the model estimation routines.
parseModel( data, formula = NULL, likelihood = "halfnorm", w.lo = 0, w.hi = NULL, expansions = 0, series = "cosine", x.scl = w.lo, g.x.scl = 1, outputUnits = NULL, asymptoticSE = TRUE )parseModel( data, formula = NULL, likelihood = "halfnorm", w.lo = 0, w.hi = NULL, expansions = 0, series = "cosine", x.scl = w.lo, g.x.scl = 1, outputUnits = NULL, asymptoticSE = TRUE )
data |
An |
formula |
A standard formula object. For example, |
likelihood |
String specifying the likelihood to fit. Built-in likelihoods at present are "halfnorm", "hazrate", and "negexp". |
w.lo |
Lower or left-truncation limit of the distances in distance data.
This is the minimum possible off-transect distance. Default is 0. If
|
w.hi |
Upper or right-truncation limit of the distances
in |
expansions |
A scalar specifying the number of terms
in |
series |
If |
x.scl |
The x coordinate (a distance) at which the
detection function will be scaled. |
g.x.scl |
Height of the distance function at coordinate |
outputUnits |
A string specifying the symbolic measurement
units for results. Valid units are listed in |
asymptoticSE |
Logical variable for whether to calculate
asymptotic standard errors. The default (TRUE) estimates an
asymptotic variance-covariance matrix for parameters based on the
likelihood's Hessian (2nd derivative). If maximization
has been performed by Nlminb or HookesJeeves, the asymptotic
Hessian is estimated using numeric second derivatives
of the likelihood at the maximum likelihood solution. If
maximization was performed by Optim, the last Hessian of
the maximization is returned
by Optim and used
(see |
An Rdistance model frame, which is an object of class "dfunc". Rdistance model frames are lists containing distance model components but not estimates. Model frames contain everything necessary to fit an Rdistance mode, such as covariates, minimum and maximum distances, the form of the likelihood, number of expansions, etc. Rdistance model frames contain a subset of fitted Rdistance model components.
RdistDf(), which returns an
Rdistance data frame;
dfuncEstim(), which returns an
Rdistance fitted model.
data(sparrowDf) ml <- Rdistance::parseModel(sparrowDf , formula = dist ~ 1 + observer + groupsize(groupsize) , likelihood = "halfnorm" , w.lo = 0 , w.hi = NULL , series = "cosine" , x.scl = 0 , g.x.scl = 1 , outputUnits = "m" ) class(ml) # 'dfunc', but no estimated coefficients print(ml) print.default(ml)data(sparrowDf) ml <- Rdistance::parseModel(sparrowDf , formula = dist ~ 1 + observer + groupsize(groupsize) , likelihood = "halfnorm" , w.lo = 0 , w.hi = NULL , series = "cosine" , x.scl = 0 , g.x.scl = 1 , outputUnits = "m" ) class(ml) # 'dfunc', but no estimated coefficients print(ml) print.default(ml)
Computes off-transect (also called 'perpendicular') distances from measures of sighting distance and sighting angle.
perpDists(sightDist, sightAngle, data)perpDists(sightDist, sightAngle, data)
sightDist |
Character, name of column in |
sightAngle |
Character, name of column in |
data |
data.frame object containing sighting distance and sighting angle. |
If observers recorded sighting distance and sighting angle (as is often common in line transect surveys), use this function to convert
to off-transect distances, the required input data for dfunc.estim.
A vector of off-transect (or perpendicular) distances. Units are the same as sightDist.
Buckland, S.T., Anderson, D.R., Burnham, K.P. and Laake, J.L. 1993. Distance Sampling: Estimating Abundance of Biological Populations. Chapman and Hall, London.
# Load the example dataset of sparrow detections from package data(sparrowDetectionData) # Compute perpendicular, off-transect distances from the observer's sight distance and angle sparrowDetectionData$perpDist <- perpDists(sightDist="sightdist", sightAngle="sightangle", data=sparrowDetectionData)# Load the example dataset of sparrow detections from package data(sparrowDetectionData) # Compute perpendicular, off-transect distances from the observer's sight distance and angle sparrowDetectionData$perpDist <- perpDists(sightDist="sightdist", sightAngle="sightangle", data=sparrowDetectionData)
Plot method for objects of class 'dfunc'. Objects of
class 'dfunc' are estimated distance functions produced by
dfuncEstim().
## S3 method for class 'dfunc' plot(x, ...)## S3 method for class 'dfunc' plot(x, ...)
x |
An estimated detection function object, normally
produced by calling |
... |
Arguments passed on to
|
If plotBars is TRUE, a scaled histogram is plotted
and the estimated distance function
is plotted over the top of it. When bars are plotted,
this routine uses graphics::barplot
for setting up the initial plotting region and
most parameters to graphics::barplot can
be specified (exceptions noted above in description of '...').
The form of the likelihood and any series
expansions is printed in the main title (overwrite this with
main="<my title>"). Convergence of the distance
function is checked. If the distance function did not converge, a warning
is printed over the top of the histogram. If one or more parameter
estimates are at their limits (likely indicating non-convergence or poor
fit), another warning is printed.
The input distance function is returned, with two additional
components than can be used to reconstruct the plotted bars. (To
obtain values of the plotted distance functions, use predict
with type = "distances".)
The additional components are:
barHeights |
A vector containing the scaled bar heights drawn on the plot. |
barWidths |
A vector or scalar of the bar widths drawn on the plot, with measurement units. |
Re-plot the bars with barplot( return$barHeights, width = return$barWidths ).
dfuncEstim(), print.dfunc(),
print.abund()
# Simulated RdistDf set.seed(87654) x <- rnorm(1000, mean=0, sd=20) x <- x[x >= 0] x <- setUnits(x, "ft") Df <- data.frame(transectID = "A" , distance = x ) |> dplyr::nest_by( transectID , .key = "detections") |> dplyr::mutate(length = setUnits(1,"mi")) attr(Df, "detectionColumn") <- "detections" attr(Df, "obsType") <- "single" attr(Df, "transType") <- "line" attr(Df, "effortColumn") <- "length" is.RdistDf(Df) # TRUE dfunc <- Df |> dfuncEstim(distance ~ 1, likelihood="halfnorm") plot(dfunc) plot(dfunc, nbins=25) # showing effects of plot parameters plot(dfunc , col=c("red","blue","orange") , border="black" , xlab="Off-transect distance" , ylab="Prob" , vertLines = FALSE , main="Showing plot params") plot(dfunc , col="purple" , density=30 , angle=c(-45,0,45) , cex.axis=1.5 , cex.lab=2 , ylab="Probability") plot(dfunc , col=c("grey","lightgrey") , border=NA) plot(dfunc , col="grey" , border=0 , col.dfunc="blue" , lty.dfunc=2 , lwd.dfunc=4 , vertLines=FALSE) plot(dfunc , plotBars=FALSE , cex.axis=1.5 , col.axis="blue") rug(distances(dfunc)) # un-equal bin widths, nbins must span distances plot(dfunc , nbins = c(0,2.5,5,7.5,10,15,25,50,70) ) # Plot showing f(0) hist(distances(dfunc) , n = 40 , border = NA , prob = TRUE) x <- seq(dfunc$w.lo, dfunc$w.hi, length=200) g <- predict(dfunc, type="dfunc", distances = x, newdata = data.frame(a=1)) f <- g[,1] / ESW(dfunc)[1] # Check integration: sum(diff(x)*(f[-1] + f[-length(f)]) / 2) # Trapazoid rule; should be 1.0 lines(x, f) # hence, 1/f(0) = ESW # Covariates: detection by observer data(sparrowDfuncObserver) # pre-estimated model obsLevs <- levels(sparrowDfuncObserver$data$observer) plot(sparrowDfuncObserver , newdata = data.frame(observer = obsLevs) , vertLines = FALSE , col.dfunc = heat.colors(length(obsLevs)) , col = c("grey","lightgrey") , border=NA , main="Detection by observer")# Simulated RdistDf set.seed(87654) x <- rnorm(1000, mean=0, sd=20) x <- x[x >= 0] x <- setUnits(x, "ft") Df <- data.frame(transectID = "A" , distance = x ) |> dplyr::nest_by( transectID , .key = "detections") |> dplyr::mutate(length = setUnits(1,"mi")) attr(Df, "detectionColumn") <- "detections" attr(Df, "obsType") <- "single" attr(Df, "transType") <- "line" attr(Df, "effortColumn") <- "length" is.RdistDf(Df) # TRUE dfunc <- Df |> dfuncEstim(distance ~ 1, likelihood="halfnorm") plot(dfunc) plot(dfunc, nbins=25) # showing effects of plot parameters plot(dfunc , col=c("red","blue","orange") , border="black" , xlab="Off-transect distance" , ylab="Prob" , vertLines = FALSE , main="Showing plot params") plot(dfunc , col="purple" , density=30 , angle=c(-45,0,45) , cex.axis=1.5 , cex.lab=2 , ylab="Probability") plot(dfunc , col=c("grey","lightgrey") , border=NA) plot(dfunc , col="grey" , border=0 , col.dfunc="blue" , lty.dfunc=2 , lwd.dfunc=4 , vertLines=FALSE) plot(dfunc , plotBars=FALSE , cex.axis=1.5 , col.axis="blue") rug(distances(dfunc)) # un-equal bin widths, nbins must span distances plot(dfunc , nbins = c(0,2.5,5,7.5,10,15,25,50,70) ) # Plot showing f(0) hist(distances(dfunc) , n = 40 , border = NA , prob = TRUE) x <- seq(dfunc$w.lo, dfunc$w.hi, length=200) g <- predict(dfunc, type="dfunc", distances = x, newdata = data.frame(a=1)) f <- g[,1] / ESW(dfunc)[1] # Check integration: sum(diff(x)*(f[-1] + f[-length(f)]) / 2) # Trapazoid rule; should be 1.0 lines(x, f) # hence, 1/f(0) = ESW # Covariates: detection by observer data(sparrowDfuncObserver) # pre-estimated model obsLevs <- levels(sparrowDfuncObserver$data$observer) plot(sparrowDfuncObserver , newdata = data.frame(observer = obsLevs) , vertLines = FALSE , col.dfunc = heat.colors(length(obsLevs)) , col = c("grey","lightgrey") , border=NA , main="Detection by observer")
Plot method for parametric line and point transect distance functions.
## S3 method for class 'dfunc.para' plot( x, include.zero = FALSE, nbins = "Sturges", newdata = NULL, legend = TRUE, vertLines = TRUE, plotBars = TRUE, circles = FALSE, density = -1, angle = 45, xlab = NULL, ylab = NULL, border = TRUE, col = "grey85", col.dfunc = NULL, lty.dfunc = NULL, lwd.dfunc = NULL, ... )## S3 method for class 'dfunc.para' plot( x, include.zero = FALSE, nbins = "Sturges", newdata = NULL, legend = TRUE, vertLines = TRUE, plotBars = TRUE, circles = FALSE, density = -1, angle = 45, xlab = NULL, ylab = NULL, border = TRUE, col = "grey85", col.dfunc = NULL, lty.dfunc = NULL, lwd.dfunc = NULL, ... )
x |
An estimated detection function object, normally
produced by calling |
include.zero |
Boolean value specifying whether to include 0 on the x-axis
of the plot. A value of TRUE will include 0 on the left hand end of the x-axis
regardless of the range of distances. A value of FALSE will plot only the
observation strip ( |
nbins |
Internally, this function uses |
newdata |
Data frame (similar to |
legend |
Logical scalar for whether to include a legend. If TRUE, a legend will be included on the plot detailing the covariate values used to generate the plot. |
vertLines |
Logical scalar specifying whether to plot vertical
lines at |
plotBars |
Logical scalar for whether to plot the histogram of distances behind the distance function. If FALSE, no histogram is plotted, only the distance function line(s). |
circles |
Logical scalar requesting the location of detection distances be plotted. If TRUE, open circles are plotted at predicted distance function heights associated with all detection distances. For computational simplicity, all distances are plotted for EVERY covariate class even though observed distances belong to only one covariate class. If FALSE, circles are not plotted. |
density |
If |
angle |
When |
xlab |
Label for the x-axis |
ylab |
Label for the y-axis |
border |
The color of bar borders when bars are plotted,
repeated as necessary to exceed the number of bars. A
value of NA or FALSE means no borders. If bars are shaded with lines
(i.e., |
col |
A vector of bar fill colors or line colors when bars are
drawn and |
col.dfunc |
Color of the distance function(s).
If only one distance function (one line) is being plotted,
the default color is "red".
If covariates or |
lty.dfunc |
Line type of the distance function(s).
If covariates or |
lwd.dfunc |
Line width of the distance function(s), replicated to the required length. Default is 2 for all lines. |
... |
When bars are plotted, this routine
uses |
The input distance function is returned, with two additional
components than can be used to reconstruct the plotted bars. (To
obtain values of the plotted distance functions, use predict
with type = "distances".)
The additional components are:
barHeights |
A vector containing the scaled bar heights drawn on the plot. |
barWidths |
A vector or scalar of the bar widths drawn on the plot, with measurement units. |
Re-plot the bars with barplot( return$barHeights, width = return$barWidths ).
# Simulated data set.seed(87654) x <- rnorm(1000, mean=0, sd=20) x <- x[x >= 0] x <- setUnits(x, "ft") Df <- data.frame(transectID = "A" , distance = x ) |> dplyr::nest_by( transectID , .key = "detections") |> dplyr::mutate(length = setUnits(1,"mi")) attr(Df, "detectionColumn") <- "detections" attr(Df, "obsType") <- "single" attr(Df, "transType") <- "line" attr(Df, "effortColumn") <- "length" is.RdistDf(Df) # TRUE # Estimation dfunc <- dfuncEstim(Df , formula = distance~1 , likelihood="halfnorm") plot(dfunc) plot(dfunc, nbins=25)# Simulated data set.seed(87654) x <- rnorm(1000, mean=0, sd=20) x <- x[x >= 0] x <- setUnits(x, "ft") Df <- data.frame(transectID = "A" , distance = x ) |> dplyr::nest_by( transectID , .key = "detections") |> dplyr::mutate(length = setUnits(1,"mi")) attr(Df, "detectionColumn") <- "detections" attr(Df, "obsType") <- "single" attr(Df, "transType") <- "line" attr(Df, "effortColumn") <- "length" is.RdistDf(Df) # TRUE # Estimation dfunc <- dfuncEstim(Df , formula = distance~1 , likelihood="halfnorm") plot(dfunc) plot(dfunc, nbins=25)
An internal prediction method for computing density on the sampled transects.
predDensity(object, propUnitSurveyed = 1)predDensity(object, propUnitSurveyed = 1)
object |
An Rdistance model frame or fitted distance function,
normally produced by a call to |
propUnitSurveyed |
A scalar or vector of real numbers between 0 and 1.
The proportion of the default sampling unit that
was surveyed. If both sides of line transects were observed,
|
A data frame containing the original data used to fit the distance function, plus an additional column containing the density of individuals on each transect.
data(sparrowDfuncObserver) predict(sparrowDfuncObserver, type="density")data(sparrowDfuncObserver) predict(sparrowDfuncObserver, type="density")
An internal prediction function to predict a distance function. This version allows for matrix inputs and uses matrix operations, and is thus faster than earlier looping versions.
predDfuncs(object, params, distances, isSmooth)predDfuncs(object, params, distances, isSmooth)
object |
An Rdistance model frame or fitted distance function,
normally produced by a call to |
params |
A matrix of distance function parameters. Rows are observations, columns contain the set of parameters (canonical and expansion) for each observation. |
distances |
A vector or 1-column matrix of
distances at which to evaluate
distance functions, when distance functions
are requested. |
isSmooth |
Logical. TRUE if the distance function is smoothed (and hence has no parameters). |
A matrix of distance function values, of size length(distances) X nrow(params). Each row of params is associated with a column, i.e., a different distance function. Distances are associated with rows, i.e., use matplot(d,return) to plot values on separate distance functions specified by rows of params.
Predict either likelihood parameters, distance functions, site-specific density, or site-specific abundance from estimated distance function objects.
## S3 method for class 'dfunc' predict( object, newdata = NULL, type = c("parameters"), distances = NULL, propUnitSurveyed = 1, area = NULL, ... )## S3 method for class 'dfunc' predict( object, newdata = NULL, type = c("parameters"), distances = NULL, propUnitSurveyed = 1, area = NULL, ... )
object |
An Rdistance model frame or fitted distance function,
normally produced by a call to |
newdata |
A data frame containing new values of
the covariates at which to evaluate the distance functions.
If |
type |
The type of predictions desired.
If |
distances |
A vector or 1-column matrix of
distances at which to evaluate
distance functions, when distance functions
are requested. |
propUnitSurveyed |
A scalar or vector of real numbers between 0 and 1.
The proportion of the default sampling unit that
was surveyed. If both sides of line transects were observed,
|
area |
A scalar containing the total area of inference. Usually, this is
study area size. If |
... |
Included for compatibility with generic |
A matrix containing predictions:
If type is "parameters", the returned matrix
contains likelihood parameters. The extent of the
first dimension (rows) in the returned matrix is equal to
either the number of detection distances
in the observed strip or number of rows in newdata.
The returned matrix's second dimension (columns) is
the number of parameters in the likelihood
plus the number of expansion terms.
See the help for each likelihoods to interpret
returned parameter values. All parameters are returned
on the inverse-link scale; i.e., exponential for canonical
parameters and identity for expansion terms.
If type is "dfuncs" or "dfunc", columns of the
returned matrix contains detection functions (i.e., g(x)).
The extent of the first
dimension (number of rows) is either the number of distances
specified in distances
or options()$Rdistance_intEvalPts if distances is
not specified.
The extent of the second dimension (number of columns) is:
the number of detections with non-missing distances:
if newdata is NULL.
the number of rows in newdata if
newdata is specified.
All distance functions in columns of the return are scaled
to object$g.x.scale at object$x.scl. The returned matrix has
the following additional attributes:
attr(return, "distances") is the vector of distances used to
predict the function in return. Either the input distances object
or the computed sequence of distances when distances is NULL.
attr(return, "x0") is the vector of distances at which each
distance function in return was scaled. i.e., the vector of
x.scl.
attr(return, "g.x.scl") is the height of g(x) (the distance
function) at x0.
If type is "density" or "abundance", the return is a
tibble containing density and abundance estimates by transect.
All transects in the input data (i.e., object$data) are
included, even those with missing lengths.
Columns in the tibble are:
transect ID: the grouping factor of the original RdistDf object.
individualsSeen: sum of non-missing group sizes on that transect.
avgPdetect: average probability of detection over groups sighted on that transect.
effort: size of the area surveyed by that transect.
density: density of individuals in the area surveyed by the transect.
abundance: abundance of individuals in the area surveyed by the transect.
halfnorm.like(), negexp.like(),
hazrate.like()
data("sparrowDf") # For dimension checks: nd <- getOption("Rdistance_intEvalPts") # No covariates dfuncObs <- sparrowDf |> dfuncEstim(formula = dist ~ 1 , w.hi = units::as_units(100, "m")) n <- nrow(dfuncObs$mf) p <- predict(dfuncObs) # parameters all(dim(p) == c(n, 1)) # values in newdata ignored because no covariates p <- predict(dfuncObs, newdata = data.frame(x = 1:5)) all(dim(p) == c(5, 1)) # Distance functions in columns, one per observation p <- predict(dfuncObs, type = "dfunc") all(dim(p) == c(nd, n)) d <- setUnits(c(0, 20, 40), "ft") p <- predict(dfuncObs, distances = d, type = "dfunc") all(dim(p) == c(3, n)) p <- predict(dfuncObs , newdata = data.frame(x = 1:5) , distances = d , type = "dfunc") all(dim(p) == c(3, 5)) # Covariates data(sparrowDfuncObserver) # pre-estimated object ## Not run: # Command to generate 'sparrowDfuncObserver' sparrowDfuncObserver <- sparrowDf |> dfuncEstim(formula = dist ~ observer , likelihood = "hazrate") ## End(Not run) predict(sparrowDfuncObserver) # n X 2 Observers <- data.frame(observer = levels(sparrowDf$observer)) predict(sparrowDfuncObserver, newdata = Observers) # 5 X 2 predict(sparrowDfuncObserver, type = "dfunc") # nd X n predict(sparrowDfuncObserver, newdata = Observers, type = "dfunc") # nd X 5 d <- setUnits(c(0, 150, 400), "ft") predict(sparrowDfuncObserver , newdata = Observers , distances = d , type = "dfunc") # 3 X 5 # Density and abundance by transect predict(sparrowDfuncObserver , type = "density")data("sparrowDf") # For dimension checks: nd <- getOption("Rdistance_intEvalPts") # No covariates dfuncObs <- sparrowDf |> dfuncEstim(formula = dist ~ 1 , w.hi = units::as_units(100, "m")) n <- nrow(dfuncObs$mf) p <- predict(dfuncObs) # parameters all(dim(p) == c(n, 1)) # values in newdata ignored because no covariates p <- predict(dfuncObs, newdata = data.frame(x = 1:5)) all(dim(p) == c(5, 1)) # Distance functions in columns, one per observation p <- predict(dfuncObs, type = "dfunc") all(dim(p) == c(nd, n)) d <- setUnits(c(0, 20, 40), "ft") p <- predict(dfuncObs, distances = d, type = "dfunc") all(dim(p) == c(3, n)) p <- predict(dfuncObs , newdata = data.frame(x = 1:5) , distances = d , type = "dfunc") all(dim(p) == c(3, 5)) # Covariates data(sparrowDfuncObserver) # pre-estimated object ## Not run: # Command to generate 'sparrowDfuncObserver' sparrowDfuncObserver <- sparrowDf |> dfuncEstim(formula = dist ~ observer , likelihood = "hazrate") ## End(Not run) predict(sparrowDfuncObserver) # n X 2 Observers <- data.frame(observer = levels(sparrowDf$observer)) predict(sparrowDfuncObserver, newdata = Observers) # 5 X 2 predict(sparrowDfuncObserver, type = "dfunc") # nd X n predict(sparrowDfuncObserver, newdata = Observers, type = "dfunc") # nd X 5 d <- setUnits(c(0, 150, 400), "ft") predict(sparrowDfuncObserver , newdata = Observers , distances = d , type = "dfunc") # 3 X 5 # Density and abundance by transect predict(sparrowDfuncObserver , type = "density")
An internal prediction function to predict (compute)
the values of distance functions at a set of observed values.
Unlike predDfuncs, which evaluates distance
functions at EVERY input distance, this routine evaluates
distance functions at only ONE distance. This is what's
appropriate for likelihood computation.
This version allows for matrix inputs and
uses matrix operations, and is thus faster than earlier
looping versions.
predLikelihood(object, params)predLikelihood(object, params)
object |
An Rdistance model frame or fitted distance function,
normally produced by a call to |
params |
A matrix of distance function parameters. Rows are observations, columns contain the set of parameters (canonical and expansion) for each observation. |
Assuming L is the vector returned by this function,
the negative log likelihood is -sum(log(L / I), na.rm=T),
where I is the integration constant, or
area under the likelihood between
w.lo and w.hi.
Note that returned likelihood values for distances less
than w.lo or greater than w.hi are NA;
hence, na.rm=TRUE in the sum.
A vector of distance function values, of length
n = number of observed distances = length(distances(x)).
Elements in distances(x) correspond, in order,
to values in the returned vector.
Print an object of class c("abund","dfunc")
produced by abundEstim.
## S3 method for class 'abund' print(x, ...)## S3 method for class 'abund' print(x, ...)
x |
An object output by |
... |
Included for compatibility to other print methods. Ignored here. |
0 is invisibly returned
dfuncEstim(), abundEstim(),
summary.dfunc(), print.dfunc(),
summary.abund()
# Load example sparrow data (line transect survey type) data(sparrowDf) # Fit half-normal detection function dfunc <- sparrowDf |> dfuncEstim(formula=dist~groupsize(groupsize)) # Estimate abundance given a detection function fit <- abundEstim(object = dfunc , area = setUnits(4105, "km^2") , ci = NULL) print(fit) summary(fit) ## Not run: # Bootstrap confidence intervals (500 iterations) # Requires ~4 min fit <- abundEstim(object = dfunc , area = setUnits(4105, "km^2") , ci = 0.95 , plot.bs = TRUE , showProgress = TRUE) print(fit) summary(fit) ## End(Not run)# Load example sparrow data (line transect survey type) data(sparrowDf) # Fit half-normal detection function dfunc <- sparrowDf |> dfuncEstim(formula=dist~groupsize(groupsize)) # Estimate abundance given a detection function fit <- abundEstim(object = dfunc , area = setUnits(4105, "km^2") , ci = NULL) print(fit) summary(fit) ## Not run: # Bootstrap confidence intervals (500 iterations) # Requires ~4 min fit <- abundEstim(object = dfunc , area = setUnits(4105, "km^2") , ci = 0.95 , plot.bs = TRUE , showProgress = TRUE) print(fit) summary(fit) ## End(Not run)
Print method for distance function objects produced
by dfuncEstim.
## S3 method for class 'dfunc' print(x, ...)## S3 method for class 'dfunc' print(x, ...)
x |
An estimated detection function object, normally
produced by calling |
... |
Included for compatibility with other print methods. Ignored here. |
The input distance function (x) is returned invisibly.
dfuncEstim(), plot.dfunc(),
print.abund(), summary.dfunc()
# Load example sparrow data (line transect survey type) data(sparrowSiteData) data(sparrowDetectionData) # Fit half-normal detection function sparrowDf <- RdistDf(sparrowSiteData, sparrowDetectionData) dfunc <- sparrowDf |> dfuncEstim(formula=dist~1) dfunc# Load example sparrow data (line transect survey type) data(sparrowSiteData) data(sparrowDetectionData) # Fit half-normal detection function sparrowDf <- RdistDf(sparrowSiteData, sparrowDetectionData) dfunc <- sparrowDf |> dfuncEstim(formula=dist~1) dfunc
Optimization control parameters
are set by calls to options() (see examples).
Optimization parameters used in
Rdistance are the following:
Rdistance_maxIters: The maximum number of optimization
iterations allowed.
Rdistance_evalMax: The maximum number of objective function
evaluations allowed.
Rdistance_likeTol: Minimum change in the likelihood
between iterations required optimization to continue.
If the likelihood changes by less than this amount,
optimization stops and a solution is declared. Iteration
continues when likelihood changes exceed this value.
Rdistance_coefTol: Minimum change in model coefficients
between iterations for optimization to continue.
If the sum of squared coefficient differences changes
by less than this amount between iterations,
optimization stops and a solution is declared.
Rdistance_optimizer: Normally, this option is not set and
the default optimizer routines are used. The default optimization
routine for smooth likelihoods (i.e., listed by
differentiableLikelihoods()) is the gradient-based
method in stats::nlminb(). The default routine for
non-smooth likelihoods is the Nelder-Mead method
in stats::optim().
Not often, but occasionally optimizers fail on problem. Switching algorithms can make a poorly behaved distance function converge, particularly when parameters are near their boundaries. If users wish to override the default optimization routine, set this option to one of the following:
"optim_\<method\>": Use the "\<method\>" method in stats::optim.
For example, "optim_Nelder-Mead" uses
the Nelder-Mead. See stats::optim() for
documentation of the six available methods.
"nlminb": Uses stats::nlminb(), a finite-difference
gradient based approach.
"hookeJeeves": Uses dfoptim::hjkb(), a
derivative-free approach for continuous and discontinuous
likelihoods.
Rdistance_hessEps: A vector of parameter distances used during
computation of numeric second derivatives. These distances control
and determine variance estimates, and they may need revision when
the maximum likelihood solution is near a parameter boundary.
Should have length
1 or the number of parameters in the model. See function
secondDeriv() for further details.
Rdistance_trace: Integer scalar for the level
of information printed to the console by the optimization
routine during maximization of the likelihood. All optimizer
routines interpret a value of 0 as 'do not print any information'
or silent. Higher values produce more information. The
information produced varies among optimization routines.
Rdistance_requireUnits: A logical specifying whether measurement
units are required on distances and areas. If TRUE,
measurement units are required on off-transect and radial
distances in the input data frame. Likewise, measurement
units are required on truncation distances, scale location,
transect lengths, and study area size. If FALSE, no units are
required and input values are used as is. The FALSE options is
provided for rare cases when Rdistance functions are called
from other functions and the calling functions do not accommodate
units.
Assign units with statement like units(detectionDf$dist) <- "m"
or setUnits(w.hi, "km") or w.hi <- 150 %#% "m" or
w.hi <- 150 %m%..
Measurement units of
the various physical quantities need not
be equal because appropriate conversions occur internally.
An error is thrown if differing units are not compatible.
For example, "m" (meters) cannot be converted into "ha" (hectares),
but "acres" can be converted into "ha".
Rdistance recognizes units listed in units::units::valid_udunits().
Rdistance_maxBSFailPropForWarning: The proportion of bootstrap
iterations that can fail without a warning. If the proportion
of non-convergent bootstrap iterations exceeds this
parameter, a warning about the validity of CI's is issued in
the abundance print method.
# increase number of iterations options(Rdistance_maxIters=2000) # change optimizer and decrease tolerance op <- options(list(Rdistance_optimizer="optim_Nelder-Mead" , Rdistance_likeTol=1e-6)) # change back options(op)# increase number of iterations options(Rdistance_maxIters=2000) # change optimizer and decrease tolerance op <- options(list(Rdistance_optimizer="optim_Nelder-Mead" , Rdistance_likeTol=1e-6)) # change back options(op)
Makes an Rdistance data frame from
separate transect and detection
data frames. Rdistance data frames are nested
data frames with one row per transect. Detection
information for each transect appears in a list-based
column that itself contains a
data frame. See Rdistance Data Frames.
Rdistance data frames can be constructed using calls to
dplyr::nest_by and dplyr::right_jion, with subsequent
attribute assignment (see Examples). This routine is
a convenient wrapper for those calls.
RdistDf( transectDf, detectionDf, by = NULL, pointSurvey = FALSE, observer = "single", .detectionCol = "detections", .effortCol = NULL )RdistDf( transectDf, detectionDf, by = NULL, pointSurvey = FALSE, observer = "single", .detectionCol = "detections", .effortCol = NULL )
transectDf |
A data frame with one row per transect and
columns containing information about the entire transect.
At a minimum, this data frame must contain the transect's ID so
it can be merged with |
detectionDf |
A data frame containing detection information associated with each transect. At a minimum, each row of this data frame must contain the following:
Optional columns in
|
by |
A character vector of variables to use in the join. The right-hand
side of this join identifies unique transects (unique
rows) in both |
pointSurvey |
If TRUE, observations were made from discrete points (i.e., during a point-transect survey) and distances are radial from observation point to target. If FALSE, observations were made along a continuous transect (i.e., during a line-transect survey) and distances are from target to nearest point on the transect (i.e., perpendicular to transect). |
observer |
Type of observer system. Legal values are "single" for single observer systems, "1given2" for a double observer system wherein observations made by observer 1 are tested against observations made by observer 2, "2given1" for a double observer system wherein observations made by observer 2 are tested against observations made by observer 1, and "both" for a double observer system wherein observations made by both observers are tested against the other and combined. |
.detectionCol |
Name of the list column that will
contain detection data frames. Default name is "detections".
Detection distances (LHS of |
.effortCol |
For continuous line transects,
this specifies the name of a column in |
For valid bootstrap estimates of confidence intervals (computed in abundEstim()),
each row of the nested data frame must represent one transect (more generally,
one sampling unit), and none should
be duplicated. The combination of transect columns
in by (i.e., the LHS of the merge, or "a" and "b" of
by = c("a" = "d", "b" = "c") for example)
should specify unique transects and unique rows of
transectDf. Warning: If by
does not specify unique rows of transectDf, dplyr::left_join,
which is called internally, will perform a many-to-many merge without
warning, and this normally duplicates both
transects and detections.
A nested tibble (a generalization of base data frames)
with one row per transect, and detection
information in a list column. Technically, the return is
a grouped tibble from
the tibble package with one row per group, and a list
column containing detection information.
Survey type, observer system, and name of the effort column
are recorded
as attributes (transType, obsType, and effortColumn, respectfully).
The return prints nicely using methods
in package tibble. If returned objects print strangely,
attach library tibble. A summary method tailored to distance sampling
is available (i.e., summary(return)).
RdistDf data frames contain the following information:
Transect Information: Each row of the data frame contains transect id and effort. Effort is transect length for line-transects, and number of points for point-transects. Optionally, transect-level covariates (such as habitat or observer id) appear on each row.
Detection Information: Observation distances (either perpendicular off-transect or radial off-point) appear in a data frame stored in a list column. If detected groups occasionally included more than one target, a group size column must be present in the list-column data frame. Optionally, detection-level covariates (such as sex or size) can appear in the data frame of the list column.
Distance Type: The type of observation distances, either
perpendicular off-transect (for line-transects studies) or
radial off-point (for point-transect studies) must appear as an
attribute of RdistDf's.
Observer Type: The type of observation system used, either
single observer or one of three types of multiple observer systems, must
appear as an attribute of RdistDf's.
Line-transects are continuous paths with targets detectable at any point. Point transects consist of one or more discrete points along a path from which observers search for targets. The length of a line-transect is it's physical length (e.g., km or miles). The 'length' of a point transect is the number of points along the transect. Single points are considered transects of length one. The length of line-transects must have a physical measurement unit (e.g., 'm' or 'ft'). The length of point-transects must be a unit-less integers (i.e., number of points).
As of Rdistance version 3.0.0, measurement units are
require on all physical distances.
Requiring units ensures that internal calculations and results
(e.g., ESW and abundance) are correct
and that output units are clear.
Physical distances are required on
off-transect distances, radial distances, truncation distances
(w.lo, unless it is zero; and w.hi, unless it is NULL),
scale locations (x.scl, unless it is zero),
line-transect lengths, and study area size. All units are
1-dimensional except those on study area, which are 2-dimensional.
Physical measurement units can vary. For example,
off-transect distances can be meters ("m"), w.hi can be inches ("in"),
and w.lo can be kilometers ("km"). Internally, all distances are
converted to the units specified by outputUnits
(or the units of input distances if
outputUnits is NULL), and
all output is reported
in units of outputUnits. Valid conversions must exist between
units or an error is thrown (e.g., meters cannot convert
into hectares).
Measurement units can be assigned using one of Rdistance's
unit helper routines (see help(unitHelpers)), Rdistance's
setUnits() function, or units::set_units()
See units::valid_udunits()
for a list of valid symbolic units.
If measurements are truly unit-less, or measurement units are unknown,
set options(Rdist_requireUnits = FALSE). This suppresses
all unit checks and conversions. Users are on their own here
and must make sure all inputs are scaled correctly so that internal
computations are correct and output units are known.
is.RdistDf(): check validity of RdistDf data frames;
dfuncEstim(): estimate distance function.
data(sparrowSiteData) data(sparrowDetectionData) sparrowDf <- RdistDf( sparrowSiteData, sparrowDetectionData ) is.RdistDf(sparrowDf, verbose = T) summary(sparrowDf) summary(sparrowDf , formula = dist ~ groupsize(groupsize) , w.hi = 100 %m%.) # Equivalent to above: sparrowDf <- sparrowDetectionData |> dplyr::nest_by( siteID , .key = "detections") |> dplyr::right_join(sparrowSiteData, by = "siteID") attr(sparrowDf, "detectionColumn") <- "detections" attr(sparrowDf, "effortColumn") <- "length" attr(sparrowDf, "obsType") <- "single" attr(sparrowDf, "transType") <- "line" is.RdistDf(sparrowDf, verbose = T) summary(sparrowDf, formula = dist ~ groupsize(groupsize)) # Condensed view: 1 row per transect (make sure tibble is attached) sparrowDf # Expansion methods: # (1) use Rdistance::unnest (includes zero transects) df1 <- unnest(sparrowDf) any( df1$siteID == "B2" ) # TRUE # Use tidyr::unnest(); but, no zero transects df2 <- tidyr::unnest(sparrowDf, cols = "detections") any( df2$siteID == "B2" ) # FALSE # Use dplyr::reframe for specific transects (e.g., for transect "B3") sparrowDf |> dplyr::filter(siteID == "B3") |> dplyr::reframe(detections) # Count detections per transect (can't use dplyr::if_else) df3 <- sparrowDf |> dplyr::reframe(nDetections = ifelse(is.null(detections), 0, nrow(detections))) sum(df3$nDetections) # Number of detections sum(df3$nDetections == 0) # Number of zero transects # Point transects data(thrasherDetectionData) data(thrasherSiteData) thrasherDf <- RdistDf( thrasherSiteData , thrasherDetectionData , pointSurvey = TRUE , by = "siteID" , .detectionCol = "detections") summary(thrasherDf, formula = dist ~ groupsize(groupsize))data(sparrowSiteData) data(sparrowDetectionData) sparrowDf <- RdistDf( sparrowSiteData, sparrowDetectionData ) is.RdistDf(sparrowDf, verbose = T) summary(sparrowDf) summary(sparrowDf , formula = dist ~ groupsize(groupsize) , w.hi = 100 %m%.) # Equivalent to above: sparrowDf <- sparrowDetectionData |> dplyr::nest_by( siteID , .key = "detections") |> dplyr::right_join(sparrowSiteData, by = "siteID") attr(sparrowDf, "detectionColumn") <- "detections" attr(sparrowDf, "effortColumn") <- "length" attr(sparrowDf, "obsType") <- "single" attr(sparrowDf, "transType") <- "line" is.RdistDf(sparrowDf, verbose = T) summary(sparrowDf, formula = dist ~ groupsize(groupsize)) # Condensed view: 1 row per transect (make sure tibble is attached) sparrowDf # Expansion methods: # (1) use Rdistance::unnest (includes zero transects) df1 <- unnest(sparrowDf) any( df1$siteID == "B2" ) # TRUE # Use tidyr::unnest(); but, no zero transects df2 <- tidyr::unnest(sparrowDf, cols = "detections") any( df2$siteID == "B2" ) # FALSE # Use dplyr::reframe for specific transects (e.g., for transect "B3") sparrowDf |> dplyr::filter(siteID == "B3") |> dplyr::reframe(detections) # Count detections per transect (can't use dplyr::if_else) df3 <- sparrowDf |> dplyr::reframe(nDetections = ifelse(is.null(detections), 0, nrow(detections))) sum(df3$nDetections) # Number of detections sum(df3$nDetections == 0) # Number of zero transects # Point transects data(thrasherDetectionData) data(thrasherSiteData) thrasherDf <- RdistDf( thrasherSiteData , thrasherDetectionData , pointSurvey = TRUE , by = "siteID" , .detectionCol = "detections") summary(thrasherDf, formula = dist ~ groupsize(groupsize))
Computes numeric second derivatives (hessian) of an arbitrary multidimensional function at a particular location.
secondDeriv(x, FUN, eps = 1e-08, ...)secondDeriv(x, FUN, eps = 1e-08, ...)
x |
The location (a vector) where the second derivatives
of |
FUN |
An R function for which the second derivatives are
sought.
This must be a function of the form FUN <- function(x, ...){...}
where x is a vector of variable parameters to FUN at which
to evaluate the 2nd derivative,
and ... are additional parameters needed to evaluate the function.
FUN must return a single value (scalar), the height of the
surface above |
eps |
A vector of small relative
distances to add to One might want to change |
... |
Any arguments passed to |
This function uses the "5-point" numeric second derivative
method advocated in numerous numerical recipe texts. During computation
of the 2nd derivative, FUN must be
capable of being evaluated at numerous locations within a hyper-ellipsoid
with cardinal radii 2*x(eps)^0.25 = 0.02x at the
default value of eps.
A handy way to use this function is to call an optimization routine
like nlminb with FUN, then call this function with the
optimized values (solution) and FUN. This will yield the hessian
at the solution and this is can produce a better
estimate of the variance-covariance
matrix than using the hessian returned by some optimization routines.
Some optimization routines return the hessian evaluated
at the next-to-last step of optimization.
An estimate of the variance-covariance matrix, which is used in
Rdistance, is solve(hessian) where hessian is
secondDeriv(<parameter estimates>, <likelihood>).
func <- function(x){-x*x} # second derivative should be -2 secondDeriv(0,func) secondDeriv(3,func) func <- function(x){3 + 5*x^2 + 2*x^3} # second derivative should be 10+12x secondDeriv(0,func) secondDeriv(2,func) func <- function(x){x[1]^2 + 5*x[2]^2} # should be rbind(c(2,0),c(0,10)) secondDeriv(c(1,1),func)func <- function(x){-x*x} # second derivative should be -2 secondDeriv(0,func) secondDeriv(3,func) func <- function(x){3 + 5*x^2 + 2*x^3} # second derivative should be 10+12x secondDeriv(0,func) secondDeriv(2,func) func <- function(x){x[1]^2 + 5*x[2]^2} # should be rbind(c(2,0),c(0,10)) secondDeriv(c(1,1),func)
Checks that the maximum likelihood optimizing routine is appropriate for the requested likelihood function. That is, checks that gradient based maximization routines are only applied to smooth likelihoods, and that gradient-free methods are used for non-smooth likelihoods.
setOptimizer(ml)setOptimizer(ml)
ml |
Either a Rdistance 'model frame' or an Rdistance
'fitted object'. Both are of class "dfunc".
Rdistance 'model frames' are lists containing components
necessary to estimate a distance function, but no estimates.
Rdistance 'model frames' are typically
produced by calls to |
If Rdistance options are changed, a list of the options and their original values. The return can be used outside this routine to set options back to their state when this routine was entered. If no options changed, the return in NULL.
Computes simple polynomial expansion terms for use in distance analysis. The Simple (and other expansions) allow "wiggle" in estimated distance functions.
simple.expansion(x, expansions)simple.expansion(x, expansions)
x |
A numeric matrix of distances at which to evaluate
the expansion series. For distance analysis, |
expansions |
A scalar specifying the number of expansion terms to compute. Must be one of the integers 1, 2, 3, 4, or 5. |
The polynomials computed here are:
First term:
Second term:
Third term:
Fourth term:
The maximum number of expansion terms computed is 4.
A 3D array of size nrow(x) X ncol(x) X expansions.
The 'pages' (3rd dimension) of this array are the cosine expansions of
x. i.e., page 1 is the first expansion term of x,
page 2 is the second expansion term of x, etc.
dfuncEstim()
, cosine.expansion()
, sine.expansion()
, hermite.expansion().
x <- matrix(seq(0, 1, length = 200), ncol = 1) simp.expn <- simple.expansion(x, 4) plot(range(x), range(simp.expn), type="n") matlines(x, simp.expn[,1,1:4], col=rainbow(4), lty = 1)x <- matrix(seq(0, 1, length = 200), ncol = 1) simp.expn <- simple.expansion(x, 4) plot(range(x), range(simp.expn), type="n") matlines(x, simp.expn[,1,1:4], col=rainbow(4), lty = 1)
Return a vector of Simpson's Composite numerical integration coefficients.
simpsonCoefs(n)simpsonCoefs(n)
n |
Number of coefficients, which is the number of points
at which the function of interest is evaluated. The number of
intervals is |
Let x be an vector of equally spaced points in the domain
of a function f (equally spaced is critical).
Let y = f(x). The numeric integral of f from min(x)
to max(x) is
sum(simpsonCoefs(length(y)) * y) * (x[2] - x[1]) / 3.
A vector of Simpson Composite rule coefficients suitable for numeric integration. The return is a vector of integers alternating between 4 and 2, with 1's on each end.
x <- seq(0, 9, length=13) y <- x^2 scoefs <- simpsonCoefs(length(x)) # exact integral is 9^3/3 = 243 sum( scoefs*y ) * (x[2] - x[1]) / 3x <- seq(0, 9, length=13) y <- x^2 scoefs <- simpsonCoefs(length(x)) # exact integral is 9^3/3 = 243 sum( scoefs*y ) * (x[2] - x[1]) / 3
Computes the sine expansion terms that modify the shape of distance likelihood functions.
sine.expansion(x, expansions)sine.expansion(x, expansions)
x |
A numeric matrix of distances at which to evaluate
the expansion series. For distance analysis, |
expansions |
A scalar specifying the number of expansion terms to compute. Must be one of the integers 1, 2, 3, 4, or 5. |
The sine expansion used here is:
First term:
Second term:
Third term:
Fourth term:
Fifth term:
The maximum number of expansion terms is 5.
The sine expansion frequency in Rdistance is pi. Each term is one pi more than the previous. The cosine expansion frequency in Rdistance is 2*pi. Consequently, the sine and cosine expansions fit different models.
A 3D array of size nrow(x) X ncol(x) X expansions.
The 'pages' (3rd dimension) of this array are the cosine expansions of
x. i.e., page 1 is the first expansion term of x,
page 2 is the second expansion term of x, etc.
dfuncEstim(),
cosine.expansion()
x <- matrix(seq(0, 1, length = 200), ncol = 1) sin.expn <- sine.expansion(x, 5) plot(range(x), range(sin.expn), type="n") matlines(x, sin.expn[,1,1:5], col=rainbow(5), lty = 1)x <- matrix(seq(0, 1, length = 200), ncol = 1) sin.expn <- sine.expansion(x, 5) plot(range(x), range(sin.expn), type="n") matlines(x, sin.expn[,1,1:5], col=rainbow(5), lty = 1)
Detection data from line transect surveys for Brewer's sparrow on 72 transects located on a 4105 km^2 study area in central Wyoming. Data were collected by Dr. Jason Carlisle of the Wyoming Cooperative Fish & Wildlife Research Unit in 2012. Each transect was 500 meters long.
A data.frame containing 356 rows and 5 columns. Each row represents a detected group of sparrows. Column descriptions:
siteID: Factor (72 levels), the site or transect where the detection
was made.
groupsize: Number, the number of individuals within
the detected group.
sightdist: Number, distance (m) from the observer to the detected group.
sightangle: Number, the angle (degrees) from the transect
line to the detected group.
dist: Number, the perpendicular, off-transect distance (m) from the
transect to the detected group. This is the distance used in analysis.
Calculated using perpDists().
The Brewer's sparrow data are a subset of the data collected by Jason Carlisle and various field technicians for his Ph.D. from the Department of Ecology, University of Wyoming, in 2017. This portion of Jason's work was funded by the Wyoming Game and Fish Department through agreements with the University of Wyoming's Cooperative Fish & Wildlife Research Unit (2012).
Carlisle, J.D. 2017. The effect of sage-grouse conservation on wildlife species of concern: implications for the umbrella species concept. Dissertation. University of Wyoming, Laramie, Wyoming, USA.
Carlisle, J. D., and A. D. Chalfoun. 2020. The abundance of Greater Sage-Grouse as a proxy for the abundance of sagebrush-associated songbirds in Wyoming, USA. Avian Conservation and Ecology 15(2):16. doi:10.5751/ACE-01702-150216
Detection data from line transect surveys for Brewer's sparrow on 72 transects located on a 4105 km^2 study area in central Wyoming collected by Dr. Jason Carlisle as part of his graduate work in the Wyoming Cooperative Fish & Wildlife Research Unit in 2012. Each transect was 500 meters long.
A rowwise tibble containing 72 rows and 9 columns, one of which
is nested data frame of detections. Each row represents
one transect. The embedded data frame in column detections
contains the detections made on the transect represented on that row.
Column descriptions:
siteID: Factor (72 levels), the transect identifier
for that row of the data frame.
length: The length, in meters [m], of each transect.
observer: Identity of the observer who surveyed the transect.
bare: The mean bare ground cover (%) within 100 [m] of the transect.
herb: The mean herbaceous cover (%) within 100 [m] of the transect.
shrub: The mean shrub cover (%) within 100 [m] of the transect.
height: The mean shrub height [cm] within 100 [m] of the transect.
shrubclass: Shrub class factor. Either "Low" when
shrub cover is < 10%, or "High" if cover >= 10%.
The embedded data frame in column detections contains the following
variables:
groupsize: The number of individuals in the detected group.
sightdist: Distance [m] from observer to the detected group.
sightangle: Angle [degrees] from the transect
line to the detected group. Not bearing. Range 0 [degrees] to 90 [degrees].
dist: Perpendicular, off-transect distance [m], from the
transect to the detected group. This is the distance used in analysis.
Calculated using perpDists().
The Brewer's sparrow data are a subset of data collected by Jason Carlisle and various field technicians for his Ph.D. from the Department of Ecology, University of Wyoming, in 2017. This portion of Jason's work was funded by the Wyoming Game and Fish Department through agreements with the University of Wyoming's Cooperative Fish & Wildlife Research Unit (2012).
Carlisle, J.D. 2017. The effect of sage-grouse conservation on wildlife species of concern: implications for the umbrella species concept. Dissertation. University of Wyoming, Laramie, Wyoming, USA.
Carlisle, J. D., and A. D. Chalfoun. 2020. The abundance of Greater Sage-Grouse as a proxy for the abundance of sagebrush-associated songbirds in Wyoming, USA. Avian Conservation and Ecology 15(2):16. doi:10.5751/ACE-01702-150216
sparrowSiteData(), sparrowDetectionData(),
RdistDf()
## Not run: # The following code generated 'sparrowDf' data(sparrowDetectionData) data(sparrowSiteData) sparrowDf <- RdistDf(transectDf = sparrowSiteData , detectionDf = sparrowDetectionData , by = "siteID" , pointSurvey = FALSE , .effortCol = "length" ) ## End(Not run) data(sparrowDf) tidyr::unnest(sparrowDf, detections) # only non-zero transects Rdistance::unnest(sparrowDf) # with zero transects at the bottom summary(sparrowDf, formula = dist ~ groupsize(groupsize) )## Not run: # The following code generated 'sparrowDf' data(sparrowDetectionData) data(sparrowSiteData) sparrowDf <- RdistDf(transectDf = sparrowSiteData , detectionDf = sparrowDetectionData , by = "siteID" , pointSurvey = FALSE , .effortCol = "length" ) ## End(Not run) data(sparrowDf) tidyr::unnest(sparrowDf, detections) # only non-zero transects Rdistance::unnest(sparrowDf) # with zero transects at the bottom summary(sparrowDf, formula = dist ~ groupsize(groupsize) )
Pre-estimated Brewer's sparrow detection function that includes an 'observer' effect. Included to speed up example execution times. See 'Examples'.
An estimated distance function object with
class 'dfunc'. See 'Value' section of
dfuncEstim() for
description of components.
sparrowSiteData() and
sparrowDetectionData() for description of the data
## Not run: # the following code generated 'sparrowDfuncObserver' data(sparrowDf) sparrowDfuncObserver <- sparrowDf |> dfuncEstim(formula = dist ~ observer , likelihood = "hazrate") ## End(Not run)## Not run: # the following code generated 'sparrowDfuncObserver' data(sparrowDf) sparrowDfuncObserver <- sparrowDf |> dfuncEstim(formula = dist ~ observer , likelihood = "hazrate") ## End(Not run)
Site data from line transect surveys for Brewer's sparrow on 72 transects located on a 4105 km^2 study area in central Wyoming. Data were collected by Dr. Jason Carlisle of the Wyoming Cooperative Fish & Wildlife Research Unit in 2012. Each transect was 500 meters long.
A data.frame containing 72 rows and 8 columns. Each row represents a site (transect) surveyed. Column descriptions:
siteID: Factor (72 levels), the site or transect surveyed.
length: Number, the length in meters ([m]) of each transect.
observer: Factor (five levels), identity of the observer
who surveyed the transect.
bare: Number, the mean bare ground cover (%) within 100 [m] of each transect.
herb: Number, the mean herbaceous cover (%) within 100 [m] of
each transect.
shrub: Number, the mean shrub cover (%) within
100 m of each transect.
height: Number, the mean shrub height
[cm] within 100 m of each transect.
shrubclass: Factor (two levels), shrub class is "Low" when
shrub cover is < 10%, "High" otherwise.
The Brewer's sparrow data are a subset of the data collected by Jason Carlisle and various field technicians for his Ph.D. from the Department of Ecology, University of Wyoming, in 2017. This portion of Jason's work was funded by the Wyoming Game and Fish Department through agreements with the University of Wyoming's Cooperative Fish & Wildlife Research Unit (2012).
Carlisle, J.D. 2017. The effect of sage-grouse conservation on wildlife species of concern: Implications for the umbrella species concept. Dissertation. University of Wyoming, Laramie, Wyoming, USA.
Carlisle, J. D., and A. D. Chalfoun. 2020. The abundance of Greater Sage-Grouse as a proxy for the abundance of sagebrush-associated songbirds in Wyoming, USA. Avian Conservation and Ecology 15(2):16. doi:10.5751/ACE-01702-150216
Returns starting values and limits (boundaries) for the parameters of
distance functions. This function is called by
other routines in Rdistance, and is not intended to
be called by the user. Replace this function in the global
environment to change boundaries and starting values.
startLimits(ml)startLimits(ml)
ml |
Either a Rdistance 'model frame' or an Rdistance
'fitted object'. Both are of class "dfunc".
Rdistance 'model frames' are lists containing components
necessary to estimate a distance function, but no estimates.
Rdistance 'model frames' are typically
produced by calls to |
A list containing the following components
start |
Vector of starting values for parameters of the likelihood and expansion terms. |
lowlimit |
Vector of lower limits for the likelihood parameters and expansion terms. |
uplimit |
Vector of upper limits for the likelihood parameters and expansion terms. |
names |
Vector of names for the likelihood parameters and expansion terms. |
The length of each vector in the return is:
(Num expansions) + 1 + 1*(like %in% c("hazrate")) + (Num Covars).
data(sparrowDf) # Half-normal start limits modList <- parseModel( data = sparrowDf , formula = dist ~ 1 , likelihood = "halfnorm" ) startLimits(modList) # Half-normal with expansions modList <- parseModel( data = sparrowDf , formula = dist ~ 1 , likelihood = "halfnorm" , expansions = 3 ) startLimits(modList) # Hazard rate start limits modList$likelihood <- "hazrate" startLimits(modList) # Neg exp start limits modList$likelihood <- "negexp" startLimits(modList)data(sparrowDf) # Half-normal start limits modList <- parseModel( data = sparrowDf , formula = dist ~ 1 , likelihood = "halfnorm" ) startLimits(modList) # Half-normal with expansions modList <- parseModel( data = sparrowDf , formula = dist ~ 1 , likelihood = "halfnorm" , expansions = 3 ) startLimits(modList) # Hazard rate start limits modList$likelihood <- "hazrate" startLimits(modList) # Neg exp start limits modList$likelihood <- "negexp" startLimits(modList)
Summarize an object of class c("abund","dfunc")
that is output by abundEstim.
## S3 method for class 'abund' summary(object, criterion = "AICc", ...)## S3 method for class 'abund' summary(object, criterion = "AICc", ...)
object |
An Rdistance model frame or fitted distance function,
normally produced by a call to |
criterion |
A string specifying the model fit criterion to print.
Must be one of "AICc" (the default),
"AIC", or "BIC". See |
... |
Included for compatibility to other print methods. Ignored here. |
If the proportion of bootstrap iterations that failed is
greater than getOption("Rdistance_maxBSFailPropForWarning"),
a warning about the validity of CI's is issued and
a diagnostic message printed. Increasing this option to a number greater
than 1 will kill the warning (e.g., options(Rdistance_maxBSFailPropForWarning = 1.3)),
but ignoring a large number of non-convergent
bootstrap iterations may be a bad idea (i.e., validity of the CI is
questionable). The default value for Rdistance_maxBSFailPropForWarning
is 0.2.
0 is invisibly returned.
dfuncEstim(), abundEstim(),
summary.dfunc(), print.dfunc(),
print.abund()
# Load example sparrow data (line transect survey type) data(sparrowDf) # Fit half-normal detection function dfunc <- sparrowDf |> dfuncEstim(formula=dist ~ 1 + offset(groupsize)) # Estimate abundance given the detection function fit <- abundEstim(dfunc , area = setUnits(4105, "km^2") , ci=NULL) summary(fit) # No confidence intervals ## Not run: # With bootstrap confidence intervals # Requires ~3 min to complete fit <- abundEstim(dfunc , area = setUnits(4105, "km^2") , ci=0.95) summary(fit) ## End(Not run)# Load example sparrow data (line transect survey type) data(sparrowDf) # Fit half-normal detection function dfunc <- sparrowDf |> dfuncEstim(formula=dist ~ 1 + offset(groupsize)) # Estimate abundance given the detection function fit <- abundEstim(dfunc , area = setUnits(4105, "km^2") , ci=NULL) summary(fit) # No confidence intervals ## Not run: # With bootstrap confidence intervals # Requires ~3 min to complete fit <- abundEstim(dfunc , area = setUnits(4105, "km^2") , ci=0.95) summary(fit) ## End(Not run)
A summary method for distance functions.
Distance functions are produced by
dfuncEstim (class dfunc).
## S3 method for class 'dfunc' summary(object, criterion = "AICc", ...)## S3 method for class 'dfunc' summary(object, criterion = "AICc", ...)
object |
An Rdistance model frame or fitted distance function,
normally produced by a call to |
criterion |
A string specifying the model fit criterion to print.
Must be one of "AICc" (the default),
"AIC", or "BIC". See |
... |
Included for compatibility with other print methods. Ignored here. |
This function prints the following quantities:
‘Call’ : The original function call.
‘Coefficients’ : A matrix of estimated coefficients, their standard errors, and Wald Z tests.
‘Strip’ : The left (w.lo) and right (w.hi) truncation values.
‘Effective strip width or detection radius’ : ESW or EDR as computed by effectiveDistance.
‘Probability of Detection’ : Probability of detecting a single target in the strip.
‘Scaling’ : The horizontal and vertical coordinates used to scale the distance function. Usually, the horizontal coordinate is 0 and the vertical coordinate is 1 (i.e., g(0) = 1).
‘Log likelihood’ : Value of the maximized log likelihood.
‘Criterion’ : Value of the specified fit criterion (AIC, AICc, or BIC).
The number of digits used in the printout is
controlled by options()$digits.
The input distance function object (object), invisibly,
with the following additional components:
convMessage: The convergence message. If the distance function
is smoothed, the convergence message is NULL.
effDistance: The ESW or EDR.
pDetect: Probability of detection in the strip.
AIC: AICc, AIC, or BIC of the fit, whichever was requested.
coefficients: If the distance function has coefficients, this
is the coefficient matrix with standard errors, Wald Z values, and p values.
If the distance function is smoothed, it has no coefficients and this component
is NULL.
dfuncEstim(), plot.dfunc(),
print.abund(), print.abund()
# Load example sparrow data (line transect survey type) data(sparrowDf) # Fit half-normal detection function dfunc <- sparrowDf |> dfuncEstim(formula=dist~1) # Print results summary(dfunc) summary(dfunc, criterion="BIC")# Load example sparrow data (line transect survey type) data(sparrowDf) # Fit half-normal detection function dfunc <- sparrowDf |> dfuncEstim(formula=dist~1) # Print results summary(dfunc) summary(dfunc, criterion="BIC")
Summary method for distance sampling data frames.
Rdistance data frames are rowwise tibbles. This routine is a
replacement summary method for rowwise_df's that
provides useful distance sampling descriptive statistics.
## S3 method for class 'rowwise_df' summary(object, formula = NULL, w.lo = 0, w.hi = NULL, ...)## S3 method for class 'rowwise_df' summary(object, formula = NULL, w.lo = 0, w.hi = NULL, ...)
object |
An |
formula |
A standard formula object. For example, |
w.lo |
Lower or left-truncation limit of the distances in distance data.
This is the minimum possible off-transect distance. Default is 0. If
|
w.hi |
Upper or right-truncation limit of the distances
in |
... |
Other arguments for summary methods. |
If object is an RdistDf, a data frame
containing summary statistics relevant to distance sampling is returned
invisibly.
If formula is not specified, the number of distance observations
and target detections is not returned because the distances, group sizes,
and covariates are not known.
If object is not an Rdistance data frame, return is the result of
the next summary method.
data(thrasherDf) summary(thrasherDf) summary(thrasherDf , formula = dist ~ groupsize(groupsize) , w.hi = setUnits(100,"m") )data(thrasherDf) summary(thrasherDf) summary(thrasherDf , formula = dist ~ groupsize(groupsize) , w.hi = setUnits(100,"m") )
Point transect data collected in central Wyoming from 120 points surveyed for Sage Thrashers by the Wyoming Cooperative Fish & Wildlife Research Unit in 2013.
A data.frame containing 193 rows and 3 columns. Each row represents a detected group of thrashers. Column descriptions:
siteID: Factor (120 levels), the site or point where the detection
was made.
groupsize: Number, the number of individuals within
the detected group.
dist: Number, the radial distance (m) from
the transect to the detected group. This is the distance used in analysis.
The Sage Thrasher data are a subset of the data collected by Jason Carlisle and various field technicians for his Ph.D. from the Department of Ecology, University of Wyoming, in 2017. This portion of Jason's work was funded by the Wyoming Game and Fish Department through agreements with the University of Wyoming's Cooperative Fish & Wildlife Research Unit (2012).
Carlisle, J.D. 2017. The effect of sage-grouse conservation on wildlife species of concern: implications for the umbrella species concept. Dissertation. University of Wyoming, Laramie, Wyoming, USA.
Carlisle, J. D., A. D. Chalfoun, K. T. Smith, and J. L. Beck. 2018. Nontarget effects on songbirds from habitat manipulation for Greater Sage-Grouse: Implications for the umbrella species concept. The Condor: Ornithological Applications 120:439–455. doi:10.1650/CONDOR-17-200.1
Point transect data collected in central Wyoming on 120 points surveyed for Sage Thrashers by the Wyoming Cooperative Fish & Wildlife Research Unit in 2013.
A rowwise tibble containing 120 rows and 8 columns, one of which (i.e., 'detections') contains nested data frames of detections. Each row represents one transect of one point.
A data.frame containing 120 rows and 6 columns. Each row represents a surveyed site. Each surveyed site is considered one transect of one point. Column descriptions:
siteID: Factor (120 levels), the site or point surveyed.
detections: An embedded (nested) data frame containing
detections made at that point. Columns in the embedded data frame contain:
groupsize: The number of individuals in
the detected group.
dist: The radial distance (m) from
the transect to the detected group.
observer: Factor (six levels), identity of the observer who surveyed
the point.
bare: Number, the mean bare ground cover (%)
within 100 m of each point.
herb: Number, the mean herbaceous
cover (%) within 100 m of each point.
shrub: Number, the mean
shrub cover (%) within 100 m of each point.
height: Number,
the mean shrub height [cm] within 100 m of each point.
npoints: The number of point counts on the transect.
The sage thrasher data are a subset of data collected by Jason Carlisle and various field technicians for his Ph.D. from the Department of Ecology, University of Wyoming, in 2017. This portion of Jason's work was funded by the Wyoming Game and Fish Department through agreements with the University of Wyoming's Cooperative Fish & Wildlife Research Unit (2012).
Carlisle, J.D. 2017. The effect of sage-grouse conservation on wildlife species of concern: implications for the umbrella species concept. Dissertation. University of Wyoming, Laramie, Wyoming, USA.
Carlisle, J. D., A. D. Chalfoun, K. T. Smith, and J. L. Beck. 2018. Nontarget effects on songbirds from habitat manipulation for Greater Sage-Grouse: Implications for the umbrella species concept. The Condor: Ornithological Applications 120:439–455. doi:10.1650/CONDOR-17-200.1
thrasherSiteData(), thrasherDetectionData(),
RdistDf()
data(thrasherDf) is.RdistDf(thrasherDf) summary(thrasherDf, formula = dist ~ groupsize(groupsize) )data(thrasherDf) is.RdistDf(thrasherDf) summary(thrasherDf, formula = dist ~ groupsize(groupsize) )
Point transect data collected in central Wyoming from 120 points surveyed for Sage Thrashers by the Wyoming Cooperative Fish & Wildlife Research Unit in 2013.
A data.frame containing 120 rows and 6 columns. Each row represents a surveyed site (point). Column descriptions:
siteID: Factor (120 levels), the site or point surveyed.
observer: Factor (six levels), identity of the observer who surveyed
the point.
bare: Number, the mean bare ground cover (%)
within 100 m of each point.
herb: Number, the mean herbaceous
cover (%) within 100 [m] of each point.
shrub: Number, the mean
shrub cover (%) within 100 [m] of each point.
height: Number,
the mean shrub height [cm] within 100 [m] of each point.
The Sage Thrasher data are a subset of data collected by Jason Carlisle and field technicians for his Ph.D. from the Department of Ecology, University of Wyoming, in 2017. This portion of Jason's work was funded by the Wyoming Game and Fish Department through agreements with the University of Wyoming's Cooperative Fish & Wildlife Research Unit (2012).
Carlisle, J.D. 2017. The effect of sage-grouse conservation on wildlife species of concern: implications for the umbrella species concept. Dissertation. University of Wyoming, Laramie, Wyoming, USA.
Carlisle, J. D., A. D. Chalfoun, K. T. Smith, and J. L. Beck. 2018. Nontarget effects on songbirds from habitat manipulation for Greater Sage-Grouse: Implications for the umbrella species concept. The Condor: Ornithological Applications 120:439–455. doi:10.1650/CONDOR-17-200.1
Return the type of transects represented in either a fitted distance function or Rdistance data frame.
transectType(x)transectType(x)
x |
Either an estimated distance function, output by
|
This function is a simple helper function. If x is an
estimated distance object, it polls the transType attribute
of x's Rdistance nested data frame.
If x is an Rdistance nested data frame, it
polls the transType attribute.
A scalar. Either 'line' if x contains
continuous line-transect detections, or 'point' if x contains
point-transects detections. If transect type has not been assigned,
return is NULL.
Compute likelihood function for a mixture of a triangle and uniform distributions.
triangle.like(a, dist, covars, w.hi = NULL)triangle.like(a, dist, covars, w.hi = NULL)
a |
A vector or matrix of covariate
and expansion term
coefficients. If matrix, dimension is
k X p, where
k = |
dist |
A numeric vector of length n or a single-column matrix (dimension nX1) containing detection distances at which to evaluate the likelihood. |
covars |
A numeric vector of length q or a
matrix of dimension nXq containing
covariate values
associated with distances in argument |
w.hi |
A numeric scalar containing maximum distance. The right-hand cutoff or upper limit. Ignored by some likelihoods (such as halfnorm, negexp, and hazrate), but is a fixed parameter in other likelihoods (such as oneStep and uniform). |
Rdistance's triangle likelihood is a mixture of a
triangle and uniform distribution. The 'triangle' density function
is
where
is the indicator function for event ,
and is the nominal strip width
(i.e., w.hi - w.lo in Rdistance).
The unknown parameters to be estimated
are and
( is fixed - given by the user).
Covariates influence values of
via a log link function, i.e., ,
where is the vector of covariate values
associated with distance , and
is the vector of estimated coefficients.
A list containing the following two components:
L.unscaled: A matrix of size
nXk (n = length dist; k = number of
cases = nrow(a))
containing likelihood values evaluated at
distances in dist.
Each row is associated with
a single distance, and each column is associated with
a single case (row of a). Values in this matrix are
the distance function which generally have .
These values are "unscaled" likelihood values; they must be
scaled (divided by) with the area under between w.lo and w.hi
to form proper likelihood values.
params: A nXbXk array
of the likelihood's (canonical) parameters in link space (i.e., on
log scale; b = number of canonical parameters in
the likelihood; k = number of cases).
Rows correspond to distances in dist. Columns
correspond to parameters (columns of a),
and pages correspond to cases (rows of a).
dfuncEstim(),
abundEstim(),
other <likelihood>.like functions
w <- 250 Theta <- 160 p <- 0.4 d <- seq(0,w,length = w+1) y <- (1 - ((1-p)/Theta)*d)*(d <= Theta) + p*(d > Theta) plot(d, y, type="l", ylim = c(0,1), xlab = "Distance", ylab = "Probability") points(Theta, p, pch=16, col = "red") lines(c(-10,Theta), c(p,p), lty = 2, col = "grey") axis(2, at=p, label = "p", line = 1, srt = 0, tick = FALSE) lines(c(Theta,Theta), c(-1,p), lty = 2, col = "grey") axis(1, at=Theta, label = "Theta", line = 2, tick = FALSE) Theta <- 25 p <- 0.2 y <- (1 - ((1-p)/Theta)*d)*(d <= Theta) + p*(d > Theta) lines(d, y) points(Theta, p, pch=16, col = "red") lines(c(-10,Theta), c(p,p), lty = 2, col = "grey") axis(2, at=p, label = "p", line = 1, srt = 0, tick = FALSE) lines(c(Theta,Theta), c(-1,p), lty = 2, col = "grey") axis(1, at=Theta, label = "Theta", line = 2, tick = FALSE) # same as above y <- triangle.like(a = c(log(Theta), p) , dist = d , covars = matrix(1, length(d)) , w.hi = 250) lines(d, y$L.unscaled, col = "green")w <- 250 Theta <- 160 p <- 0.4 d <- seq(0,w,length = w+1) y <- (1 - ((1-p)/Theta)*d)*(d <= Theta) + p*(d > Theta) plot(d, y, type="l", ylim = c(0,1), xlab = "Distance", ylab = "Probability") points(Theta, p, pch=16, col = "red") lines(c(-10,Theta), c(p,p), lty = 2, col = "grey") axis(2, at=p, label = "p", line = 1, srt = 0, tick = FALSE) lines(c(Theta,Theta), c(-1,p), lty = 2, col = "grey") axis(1, at=Theta, label = "Theta", line = 2, tick = FALSE) Theta <- 25 p <- 0.2 y <- (1 - ((1-p)/Theta)*d)*(d <= Theta) + p*(d > Theta) lines(d, y) points(Theta, p, pch=16, col = "red") lines(c(-10,Theta), c(p,p), lty = 2, col = "grey") axis(2, at=p, label = "p", line = 1, srt = 0, tick = FALSE) lines(c(Theta,Theta), c(-1,p), lty = 2, col = "grey") axis(1, at=Theta, label = "Theta", line = 2, tick = FALSE) # same as above y <- triangle.like(a = c(log(Theta), p) , dist = d , covars = matrix(1, length(d)) , w.hi = 250) lines(d, y$L.unscaled, col = "green")
Compute starting values and limits for the triangle distance function.
triangle.start.limits(ml)triangle.start.limits(ml)
ml |
Either a Rdistance 'model frame' or an Rdistance
'fitted object'. Both are of class "dfunc".
Rdistance 'model frames' are lists containing components
necessary to estimate a distance function, but no estimates.
Rdistance 'model frames' are typically
produced by calls to |
A list containing the following components
start |
Vector of starting values for parameters of the likelihood and expansion terms. |
lowlimit |
Vector of lower limits for the likelihood parameters and expansion terms. |
uplimit |
Vector of upper limits for the likelihood parameters and expansion terms. |
names |
Vector of names for the likelihood parameters and expansion terms. |
The length of each vector in the return is:
(Num expansions) + 1 + 1*(like %in% c("hazrate")) + (Num Covars).
# make 'model list' object # Boundary is 10, p is 100 / 120 = 0.833 library(Rdistance) whi <- 50 x <- c( runif(100, min=0, max=10), runif(20, min=10, max=whi)) x <- setUnits(x, "m") detectDf <- data.frame(transect = 1, dist = x) siteDf <- data.frame(transect = 1, length = setUnits(10,"m")) distDf <- RdistDf(siteDf, detectDf) ml <- parseModel(distDf , formula = dist ~ 1 , w.lo = 0 , w.hi = setUnits(whi, "m") ) sl <- oneStep.start.limits(ml) hist(x, n = 20) abline(v = exp(sl$start["(Intercept)"]))# make 'model list' object # Boundary is 10, p is 100 / 120 = 0.833 library(Rdistance) whi <- 50 x <- c( runif(100, min=0, max=10), runif(20, min=10, max=whi)) x <- setUnits(x, "m") detectDf <- data.frame(transect = 1, dist = x) siteDf <- data.frame(transect = 1, length = setUnits(10,"m")) distDf <- RdistDf(siteDf, detectDf) ml <- parseModel(distDf , formula = dist ~ 1 , w.lo = 0 , w.hi = setUnits(whi, "m") ) sl <- oneStep.start.limits(ml) hist(x, n = 20) abline(v = exp(sl$start["(Intercept)"]))
Unnest an RdistDf data frame by expanding the embedded 'detections' column. This unnest includes the so-called zero transects (transects without detections).
unnest(data, ...)unnest(data, ...)
data |
An |
... |
Additional arguments passed to |
An expanded data frame, without embedded data frames. Rows in the return represent with one detection or one transect. If multiple detections were made on one transect, the transect will appear on multiple rows. If no detections were made on a transect, it will appear on one row with NA detection distance.
data('sparrowDf') # tidyr::unnest() does not include zero transects detectionDf <- tidyr::unnest(sparrowDf, detections) nrow(detectionDf) any(detectionDf$siteID == "B2") # Rdistance::unnest() includes zero transects fullDf <- unnest(sparrowDf) nrow(fullDf) any(fullDf$siteID == "B2")data('sparrowDf') # tidyr::unnest() does not include zero transects detectionDf <- tidyr::unnest(sparrowDf, detections) nrow(detectionDf) any(detectionDf$siteID == "B2") # Rdistance::unnest() includes zero transects fullDf <- unnest(sparrowDf) nrow(fullDf) any(fullDf$siteID == "B2")
Estimate the variance-covariance matrix of parameters in the distance function. If the likelihood is differentiable, the variance-covariance matrix is estimated from the second derivative of the likelihood (i.e., the hessian). If the likelihood is not differentiable, the variance-covariance matrix is a matrix of 0's that are interpreted as "pending" (i.e., pending bootstrapping).
varcovarEstim(x, ml)varcovarEstim(x, ml)
x |
An estimated detection function object, normally
produced by calling |
ml |
Either a Rdistance 'model frame' or an Rdistance
'fitted object'. Both are of class "dfunc".
Rdistance 'model frames' are lists containing components
necessary to estimate a distance function, but no estimates.
Rdistance 'model frames' are typically
produced by calls to |
A square symmetric matrix estimating the
variance-covariance matrix of parameters in x.
Dimension of return is p X p, where p = length(x$par).