Return the design matrix for a fitted model, with some additional options.
Usage
getX(
  mod,
  data = NULL,
  contrasts = NULL,
  add.data = FALSE,
  centre = FALSE,
  scale = FALSE,
  as.df = FALSE,
  merge = FALSE,
  env = NULL
)
Arguments
- mod
A fitted model object, or a list or nested list of such objects. Can also be a model formula(s) or character vector(s) of term names (in which case data must be supplied).
- data
An optional dataset, used to refit the model(s) and/or construct the design matrix.
- contrasts
Optional, a named list of contrasts to apply to factors (see the contrasts.arg argument of model.matrix() for specification). These will override any existing contrasts in the data or model call.
- add.data
Logical, whether to append data not specified in the model formula (with factors converted to dummy variables).
- centre, scale
Logical, whether to mean-centre and/or scale terms by standard deviations (for interactions, this is carried out prior to construction of product terms). Alternatively, a numeric vector of means/standard deviations (or other statistics) can be supplied, whose names must match term names.
- as.df
Logical, whether to return the matrix as a data frame (without modifying names).
- merge
Logical. If TRUE, and mod is a list or nested list, a single matrix containing all terms is returned (variables must be the same length).
- env
Environment in which to look for model data (if none supplied). Defaults to the formula() environment.
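The contrasts argument follows the contrasts.arg specification of model.matrix(). As a minimal base-R sketch of that specification (the toy data frame d and its columns are hypothetical and not part of the package; the default treatment contrasts are assumed to be in effect), a named list maps each factor to a contrast scheme:

```r
# Toy data (hypothetical; for illustration only)
d <- data.frame(
  y = c(1.2, 2.3, 3.1, 4.8, 5.0, 6.4),
  f = factor(c("a", "b", "c", "a", "b", "c")),
  x = c(0.5, 1.5, 2.5, 3.5, 4.5, 5.5)
)

# Design matrix with the default (treatment) contrasts for factor 'f'
x.trt <- model.matrix(y ~ f + x, data = d)

# Same model, but with sum-to-zero contrasts for 'f',
# supplied as a named list via 'contrasts.arg'
x.sum <- model.matrix(y ~ f + x, data = d,
                      contrasts.arg = list(f = "contr.sum"))

colnames(x.trt)
colnames(x.sum)
```

A list of the same form would presumably be what is passed to contrasts here, e.g. contrasts = list(f = "contr.sum").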
Details
This is primarily a convenience function to enable more flexible construction of design matrices, usually for internal use and for further processing. Use cases include processing and/or return of terms which may not be present in a typical design matrix (e.g. constituents of product terms, dummy variables).
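The note on centre and scale, that interactions are standardised prior to construction of product terms, can be illustrated with base R alone. This is only a sketch of the idea on simulated data (the variables x1 and x2 are hypothetical), not the function's actual implementation:

```r
set.seed(1)
d <- data.frame(x1 = rnorm(20, 10, 2), x2 = rnorm(20, 5, 1))

# Centre and scale the constituent variables first,
# then let model.matrix() build the product term from them
z <- as.data.frame(scale(d))
x.pre <- model.matrix(~ x1 * x2, data = z)

# For contrast: build the product term from the raw variables,
# then centre and scale every non-intercept column afterwards
x.raw <- model.matrix(~ x1 * x2, data = d)
x.post <- x.raw
x.post[, -1] <- scale(x.raw[, -1])

# Main-effect columns agree between the two approaches...
all.equal(x.pre[, "x1"], x.post[, "x1"])

# ...but the interaction columns generally do not
isTRUE(all.equal(x.pre[, "x1:x2"], x.post[, "x1:x2"]))
```

In the first approach the interaction column is exactly the product of the standardised constituents, which is the behaviour the argument description refers to.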
Examples
# Model design matrix (original)
m <- shipley.growth[[3]]
x1 <- model.matrix(m)
x2 <- getX(m)
stopifnot(all.equal(x1, x2, check.attributes = FALSE))
# Using formula or term names (supply data)
d <- shipley
x1 <- getX(formula(m), data = d)
x2 <- getX(names(lme4::fixef(m)), data = d)
stopifnot(all.equal(x1, x2))
# Scaled terms
head(getX(m, centre = TRUE, scale = TRUE))
#>   (Intercept)       Date       DD       lat
#> 1           1 -1.4031190 1.636031 -2.792213
#> 2           1 -1.0345918 1.482206 -2.792213
#> 3           1 -1.3554697 1.573358 -2.792213
#> 4           1 -1.9566917 1.690329 -2.792213
#> 5           1 -0.7276695 1.325356 -2.792213
#> 6           1 -1.5583973 1.640097 -2.792213
# Combined matrix for SEM
head(getX(shipley.sem, merge = TRUE))
#>   (Intercept)      lat       DD     Date   Growth
#> 1           1 40.38063 160.5703 115.4956 61.36852
#> 2           1 40.38063 158.9896 118.4959 43.77182
#> 3           1 40.38063 159.9262 115.8836 44.74663
#> 4           1 40.38063 161.1282 110.9889 48.20004
#> 5           1 40.38063 157.3778 120.9946 50.02237
#> 6           1 40.38063 160.6120 114.2315 56.29615
head(getX(shipley.sem, merge = TRUE, add.data = TRUE)) # add other variables
#>   (Intercept)      lat site tree year     Date       DD   Growth  Survival Live
#> 1           1 40.38063    1    1 1970 115.4956 160.5703 61.36852 0.9996238    1
#> 2           1 40.38063    1    2 1970 118.4959 158.9896 43.77182 0.8433521    1
#> 3           1 40.38063    1    3 1970 115.8836 159.9262 44.74663 0.9441110    1
#> 4           1 40.38063    1    4 1970 110.9889 161.1282 48.20004 0.9568525    1
#> 5           1 40.38063    1    5 1970 120.9946 157.3778 50.02237 0.9759584    1
#> 6           1 40.38063    1    1 1972 114.2315 160.6120 56.29615 0.9983398    1