Skip to contents

Generate predicted values for SEM direct, indirect, or total effects.

Usage

predEff(
  mod,
  newdata = NULL,
  effects = NULL,
  eff.boot = NULL,
  re.form = NA,
  type = c("link", "response"),
  interaction = NULL,
  use.raw = FALSE,
  ci.conf = 0.95,
  ci.type = "bca",
  digits = 3,
  bci.arg = NULL,
  parallel = "no",
  ncpus = NULL,
  cl = NULL,
  ...
)

Arguments

mod

A fitted model object, or a list or nested list of such objects.

newdata

An optional data frame of new values to predict, which should contain all the variables named in effects or all those used to fit mod.

effects

A numeric vector of effects to predict, or a list or nested list of such vectors. These will typically have been calculated using semEff(), bootEff(), or stdEff(). Alternatively, a boot object produced by bootEff() can be supplied.

eff.boot

A matrix of bootstrapped effects used to calculate confidence intervals for predictions, or a list or nested list of such matrices. These will have been calculated using semEff() or bootEff().

re.form

For mixed models of class "merMod", the formula for random effects to condition on when predicting effects. Defaults to NA, meaning random effects are averaged over. See lme4:::predict.merMod() for further specification details.

type

The type of prediction to return (for GLMs). Can be either "link" (default) or "response".

interaction

An optional name of an interactive effect, for which to return standardised effects for a 'main' continuous variable across different values or levels of interacting variables (see Details).

use.raw

Logical, whether to use raw (unstandardised) effects for all calculations (if present).

ci.conf

A numeric value specifying the confidence level for confidence intervals on predictions (and any interactive effects).

ci.type

The type of confidence interval to return (defaults to "bca" – see Details). See boot::boot.ci() for further specification details.

digits

The number of significant digits to return for interactive effects.

bci.arg

A named list of any additional arguments to boot::boot.ci(), excepting argument index.

parallel

The type of parallel processing to use for calculating confidence intervals on predictions. Can be one of "snow", "multicore", or "no" (for none – the default).

ncpus

Number of system cores to use for parallel processing. If NULL (default), all available cores are used.

cl

Optional cluster to use if parallel = "snow". If NULL (default), a local cluster is created using the specified number of cores.

...

Arguments to stdEff().

Value

A numeric vector of the predictions, or, if bootstrapped effects are supplied, a list containing the predictions and the upper and lower confidence intervals. Optional interactive effects may also be appended. Predictions may also be returned in a list or nested list, depending on the structure of mod (and other arguments).

Details

Generate predicted values for SEM direct, indirect, or total effects on a response variable, which should be supplied to effects. These are used in place of model coefficients in the standard prediction formula, with values for predictors drawn either from the data used to fit the original model(s) (mod) or from newdata. It is assumed that effects are fully standardised; however, if this is not the case, then the same centring and scaling options originally specified to stdEff() should be re-specified – which will then be used to standardise the data. If no effects are supplied, standardised (direct) effects will be calculated from the model and used to generate predictions. These predictions will equal the model(s) fitted values if newdata = NULL, unique.eff = FALSE, and re.form = NULL (where applicable).

Model-averaged predictions can be generated if averaged effects are supplied to the model in mod, or, alternatively, if weights are specified (passed to stdEff()) and mod is a list of candidate models (effects can also be passed using this latter method). For mixed model predictions where random effects are included (e.g. re.form = NULL), the latter approach should be used, otherwise the contribution of random effects will be taken from the single model instead of (correctly) being averaged over a candidate set.

If bootstrapped effects are supplied to eff.boot (or to effects, as part of a boot object), bootstrapped predictions are calculated by predicting from each effect. Confidence intervals can then be returned via bootCI(), for which the type should be appropriate for the original form of bootstrap sampling (defaults to "bca"). If the number of observations to predict is very large, parallel processing (via pSapply()) may speed up the calculation of intervals.

Predictions are always returned in the original (typically unstandardised) units of the (link-transformed) response variable. For GLMs, they can be returned in the response scale if type = "response".

Additionally, if the name of an interactive effect is supplied to interaction, standardised effects (and confidence intervals) can be returned for effects of a continuous 'main' variable across different values or levels of interacting variable(s). The name should be of the form "x1:x2...", containing all the variables involved and matching the name of an interactive effect in the model(s) terms or in effects. The values for all variables should be supplied in newdata, with the main continuous variable being automatically identified as that having the most unique values.

See also

Examples

# Predict effects (direct, total)
m <- shipley.sem
e <- shipley.sem.eff
dir <- getDirEff(e)
tot <- getTotEff(e)
f.dir <- predEff(m, effects = dir, type = "response")
f.tot <- predEff(m, effects = tot, type = "response")
f.dir$Live[1:10]
#>         1         2         3         4         5         6         7         8 
#> 0.9998907 0.9525798 0.9657500 0.9894445 0.9943723 0.9993621 0.9911463 0.9582557 
#>         9        10 
#> 0.9982749 0.9989735 
f.tot$Live[1:10]
#>         1         2         3         4         5         6         7         8 
#> 0.9975858 0.5742006 0.5783691 0.7196629 0.9436709 0.9840953 0.8998478 0.5468860 
#>         9        10 
#> 0.9462890 0.9887104 

# Using new data for predictors
d <- na.omit(shipley)
xn <- c("lat", "DD", "Date", "Growth")
seq100 <- function(x) seq(min(x), max(x), length = 100)
nd <- data.frame(sapply(d[xn], seq100))
f.dir <- predEff(m, nd, dir, type = "response")
f.tot <- predEff(m, nd, tot, type = "response")
f.dir$Live[1:10]
#>         1         2         3         4         5         6         7         8 
#> 0.3000301 0.3279412 0.3571239 0.3874066 0.4185852 0.4504280 0.4826822 0.5150813 
#>         9        10 
#> 0.5473542 0.5792338 
f.tot$Live[1:10]
#>          1          2          3          4          5          6          7 
#> 0.05467217 0.06338280 0.07337353 0.08479648 0.09781007 0.11257517 0.12924985 
#>          8          9         10 
#> 0.14798252 0.16890356 0.19211539 
# Add CIs
# dir.b <- getDirEff(e, "boot")
# tot.b <- getTotEff(e, "boot")
# f.dir <- predEff(m, nd, dir, dir.b, type = "response")
# f.tot <- predEff(m, nd, tot, tot.b, type = "response")

# Predict an interactive effect (e.g. Live ~ Growth * DD)
xn <- c("Growth", "DD")
d[xn] <- scale(d[xn])  # scale predictors (improves fit)
m <- lme4::glmer(Live ~ Growth * DD + (1 | site) + (1 | tree),
                 family = binomial, data = d)
nd <- with(d, expand.grid(
  Growth = seq100(Growth),
  DD = mean(DD) + c(-sd(DD), sd(DD))  # two levels for DD
))
f <- predEff(m, nd, type = "response", interaction = "Growth:DD")
f$fit[1:10]
#>         1         2         3         4         5         6         7         8 
#> 0.2935522 0.3230338 0.3539926 0.3862256 0.4194877 0.4534979 0.4879478 0.5225126 
#>         9        10 
#> 0.5568630 0.5906782 
f$interaction
#> Growth:DD_1 Growth:DD_2 
#>       0.393       0.286 
# Add CIs (need to bootstrap model...)
# system.time(B <- bootEff(m, R = 1000, ran.eff = "site"))
# f <- predEff(m, nd, B, type = "response", interaction = "Growth:DD")

# Model-averaged predictions (several approaches)
m <- shipley.growth  # candidate models (list)
w <- runif(length(m), 0, 1)  # weights
e <- stdEff(m, w)  # averaged effects
f1 <- predEff(m[[1]], effects = e)  # pass avg. effects
f2 <- predEff(m, weights = w)  # pass weights argument
f3 <- avgEst(predEff(m), w)  # use avgEst function
stopifnot(all.equal(f1, f2))
stopifnot(all.equal(f2, f3))

# Compare model fitted values: predEff() vs. fitted()
m <- shipley.sem$Live
f1 <- predEff(m, unique.eff = FALSE, re.form = NULL, type = "response")
f2 <- fitted(m)
stopifnot(all.equal(f1, f2))

# Compare predictions using standardised vs. raw effects (same)
f1 <- predEff(m)
f2 <- predEff(m, use.raw = TRUE)
stopifnot(all.equal(f1, f2))