Return the design matrix for a fitted model, with some additional options.
getX(
mod,
data = NULL,
contrasts = NULL,
add.data = FALSE,
centre = FALSE,
scale = FALSE,
as.df = FALSE,
merge = FALSE,
env = NULL
)
mod: A fitted model object, or a list or nested list of such objects. Can also be one or more model formulas or character vectors of term names (in which case data must be supplied).
data: An optional dataset, used to refit the model(s) and/or construct the design matrix.
contrasts: Optional. A named list of contrasts to apply to factors (see the contrasts.arg argument of model.matrix() for specification). These will override any existing contrasts in the data or model call.
add.data: Logical, whether to append data not specified in the model formula (with factors converted to dummy variables).
centre, scale: Logical, whether to mean-centre and/or scale terms by standard deviations (for interactions, this is carried out prior to construction of product terms). Alternatively, a numeric vector of means/standard deviations (or other statistics) can be supplied, whose names must match term names.
as.df: Logical, whether to return the matrix as a data frame (without modifying names).
merge: Logical. If TRUE, and mod is a list or nested list, a single matrix containing all terms is returned (variables must be the same length).
env: Environment in which to look for model data (if none supplied). Defaults to the formula() environment.
A matrix or data frame of model terms, or a list or nested list of the same.
This is primarily a convenience function to enable more flexible construction of design matrices, usually for internal use and for further processing. Use cases include processing and/or return of terms which may not be present in a typical design matrix (e.g. constituents of product terms, dummy variables).
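The note that centring and scaling are applied before product terms are constructed can be illustrated with base R alone. The following is a minimal sketch, not the internals of getX(): it uses the built-in mtcars data and an arbitrary formula chosen for illustration, standardises the constituent predictors first, and then lets model.matrix() build the interaction column from them.

```r
# Base-R sketch (an illustration, not getX() itself): standardise the main
# effects first, then form the interaction as their product.
d <- mtcars[, c("mpg", "wt", "hp")]

# Raw design matrix, including the raw product term wt:hp
x.raw <- model.matrix(mpg ~ wt * hp, data = d)

# Centre and scale the constituent predictors before building the matrix
d.std <- d
d.std[c("wt", "hp")] <- lapply(d[c("wt", "hp")], function(v) (v - mean(v)) / sd(v))
x.std <- model.matrix(mpg ~ wt * hp, data = d.std)

# The product column is the product of the standardised main effects...
stopifnot(all.equal(x.std[, "wt:hp"], x.std[, "wt"] * x.std[, "hp"]))

# ...which is not the same as standardising the raw product afterwards
x.post <- as.numeric(scale(x.raw[, "wt:hp"]))
stopifnot(!isTRUE(all.equal(unname(x.std[, "wt:hp"]), x.post)))
```

The two orderings give different interaction columns, which is why the order of operations is worth stating explicitly in the argument description.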
# Model design matrix (original)
m <- shipley.growth[[3]]
x1 <- model.matrix(m)
x2 <- getX(m)
stopifnot(all.equal(x1, x2, check.attributes = FALSE))
# Using formula or term names (supply data)
d <- shipley
x1 <- getX(formula(m), data = d)
x2 <- getX(names(lme4::fixef(m)), data = d)
stopifnot(all.equal(x1, x2))
# Scaled terms
head(getX(m, centre = TRUE, scale = TRUE))
#>   (Intercept)       Date       DD       lat
#> 1           1 -1.4031190 1.636031 -2.792213
#> 2           1 -1.0345918 1.482206 -2.792213
#> 3           1 -1.3554697 1.573358 -2.792213
#> 4           1 -1.9566917 1.690329 -2.792213
#> 5           1 -0.7276695 1.325356 -2.792213
#> 6           1 -1.5583973 1.640097 -2.792213
# Combined matrix for SEM
head(getX(shipley.sem, merge = TRUE))
#>   (Intercept)      lat       DD     Date   Growth
#> 1           1 40.38063 160.5703 115.4956 61.36852
#> 2           1 40.38063 158.9896 118.4959 43.77182
#> 3           1 40.38063 159.9262 115.8836 44.74663
#> 4           1 40.38063 161.1282 110.9889 48.20004
#> 5           1 40.38063 157.3778 120.9946 50.02237
#> 6           1 40.38063 160.6120 114.2315 56.29615
head(getX(shipley.sem, merge = TRUE, add.data = TRUE)) # add other variables
#>   (Intercept)      lat site tree year     Date       DD   Growth  Survival Live
#> 1           1 40.38063    1    1 1970 115.4956 160.5703 61.36852 0.9996238    1
#> 2           1 40.38063    1    2 1970 118.4959 158.9896 43.77182 0.8433521    1
#> 3           1 40.38063    1    3 1970 115.8836 159.9262 44.74663 0.9441110    1
#> 4           1 40.38063    1    4 1970 110.9889 161.1282 48.20004 0.9568525    1
#> 5           1 40.38063    1    5 1970 120.9946 157.3778 50.02237 0.9759584    1
#> 6           1 40.38063    1    1 1972 114.2315 160.6120 56.29615 0.9983398    1
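The examples above do not cover the contrasts argument. Since its format is documented as following contrasts.arg in model.matrix(), a base-R sketch of that format may help. It uses the built-in warpbreaks data rather than the package data, and it shows only what model.matrix() does with such a list; that getX() handles the list in the same way is an assumption.

```r
# Base-R sketch of the named-list format used by contrasts.arg
d <- warpbreaks  # built-in data; 'wool' and 'tension' are factors

# Default (treatment) contrasts
x.trt <- model.matrix(breaks ~ wool + tension, data = d)

# Sum-to-zero contrasts, supplied as a named list keyed by factor name
ctr <- list(wool = "contr.sum", tension = "contr.sum")
x.sum <- model.matrix(breaks ~ wool + tension, data = d, contrasts.arg = ctr)

colnames(x.trt)  # "(Intercept)" "woolB" "tensionM" "tensionH"
colnames(x.sum)  # "(Intercept)" "wool1" "tension1" "tension2"
```

The list names must match factor names in the data; each element can be the name of a contrast function, as here, or a contrast matrix.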