Return the design matrix for a fitted model, with some additional options.

getX(
  mod,
  data = NULL,
  contrasts = NULL,
  add.data = FALSE,
  centre = FALSE,
  scale = FALSE,
  as.df = FALSE,
  merge = FALSE,
  env = NULL
)

Arguments

mod

A fitted model object, or a list or nested list of such objects. Can also be one or more model formulas, or one or more character vectors of term names (in which case data must be supplied).

data

An optional dataset, used to refit the model(s) and/or construct the design matrix.

contrasts

Optional. A named list of contrasts to apply to factors (see the contrasts.arg argument of model.matrix() for the specification). These override any existing contrasts in the data or model call.

add.data

Logical, whether to append variables from the data that are not specified in the model formula (with factors converted to dummy variables).

centre, scale

Logical, whether to mean-centre and/or scale terms by standard deviations (for interactions, this is carried out prior to construction of product terms). Alternatively, a numeric vector of means/standard deviations (or other statistics) can be supplied, whose names must match term names.

as.df

Logical, whether to return the matrix as a data frame (without modifying names).

merge

Logical. If TRUE, and mod is a list or nested list, a single matrix containing all terms is returned (variables must be the same length).

env

Environment in which to look for model data (if none supplied). Defaults to the formula() environment.
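
As an illustration of the named-list format expected by contrasts, the following minimal sketch calls base R's model.matrix() directly with its contrasts.arg argument. The data frame d and its columns y and g are hypothetical and not part of the package; the sketch shows only the specification format, not how getX() applies it internally.

```r
# Hypothetical toy data (not from the package)
d <- data.frame(
  y = c(1.2, 2.3, 3.1, 4.8, 5.0, 6.4),
  g = factor(c("a", "b", "c", "a", "b", "c"))
)

# Default treatment contrasts: dummy columns for levels b and c
x_trt <- model.matrix(y ~ g, data = d)

# Named list keyed by factor name; here sum-to-zero contrasts for g
x_sum <- model.matrix(y ~ g, data = d, contrasts.arg = list(g = "contr.sum"))

colnames(x_trt)  # "(Intercept)" "gb" "gc"
colnames(x_sum)  # "(Intercept)" "g1" "g2"
```

The same kind of list (for example, list(g = "contr.sum")) is what the contrasts argument described above is documented to accept.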

Value

A matrix or data frame of model terms, or a list or nested list of such objects.

Details

This is primarily a convenience function to enable more flexible construction of design matrices, usually for internal use and for further processing. Use cases include processing and/or return of terms which may not be present in a typical design matrix (e.g. constituents of product terms, dummy variables).
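
To show why the order of operations matters for product terms, the sketch below uses base R only to compare standardising the constituent variables before forming their product (the order described for centre and scale) with standardising the raw product afterwards. The data frame d and the variables x1 and x2 are hypothetical, and this is not the internal implementation of getX().

```r
# Hypothetical toy data (not from the package)
d <- data.frame(
  x1 = c(1, 2, 3, 4, 5),
  x2 = c(2, 1, 4, 3, 5)
)

# Centre and scale each constituent variable first...
z <- as.data.frame(scale(d))

# ...then build the design matrix, so that the product term
# is formed from the standardised variables
x_std <- model.matrix(~ x1 * x2, data = z)

# For comparison: standardising the raw product itself
raw_prod <- d$x1 * d$x2
prod_std <- (raw_prod - mean(raw_prod)) / sd(raw_prod)

# The interaction column is the product of the standardised constituents
isTRUE(all.equal(unname(x_std[, "x1:x2"]), z$x1 * z$x2))  # TRUE

# It is not the same as the standardised raw product
isTRUE(all.equal(unname(x_std[, "x1:x2"]), prod_std))     # FALSE
```

The two columns differ because the product of two standardised variables generally does not have mean zero or unit standard deviation.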

Examples

# Model design matrix (original)
m <- shipley.growth[[3]]
x1 <- model.matrix(m)
x2 <- getX(m)
stopifnot(all.equal(x1, x2, check.attributes = FALSE))

# Using formula or term names (supply data)
d <- shipley
x1 <- getX(formula(m), data = d)
x2 <- getX(names(lme4::fixef(m)), data = d)
stopifnot(all.equal(x1, x2))

# Scaled terms
head(getX(m, centre = TRUE, scale = TRUE))
#>   (Intercept)       Date       DD       lat
#> 1           1 -1.4031190 1.636031 -2.792213
#> 2           1 -1.0345918 1.482206 -2.792213
#> 3           1 -1.3554697 1.573358 -2.792213
#> 4           1 -1.9566917 1.690329 -2.792213
#> 5           1 -0.7276695 1.325356 -2.792213
#> 6           1 -1.5583973 1.640097 -2.792213

# Combined matrix for SEM
head(getX(shipley.sem, merge = TRUE))
#>   (Intercept)      lat       DD     Date   Growth
#> 1           1 40.38063 160.5703 115.4956 61.36852
#> 2           1 40.38063 158.9896 118.4959 43.77182
#> 3           1 40.38063 159.9262 115.8836 44.74663
#> 4           1 40.38063 161.1282 110.9889 48.20004
#> 5           1 40.38063 157.3778 120.9946 50.02237
#> 6           1 40.38063 160.6120 114.2315 56.29615
head(getX(shipley.sem, merge = TRUE, add.data = TRUE))  # add other variables
#>   (Intercept)      lat site tree year     Date       DD   Growth  Survival Live
#> 1           1 40.38063    1    1 1970 115.4956 160.5703 61.36852 0.9996238    1
#> 2           1 40.38063    1    2 1970 118.4959 158.9896 43.77182 0.8433521    1
#> 3           1 40.38063    1    3 1970 115.8836 159.9262 44.74663 0.9441110    1
#> 4           1 40.38063    1    4 1970 110.9889 161.1282 48.20004 0.9568525    1
#> 5           1 40.38063    1    5 1970 120.9946 157.3778 50.02237 0.9759584    1
#> 6           1 40.38063    1    1 1972 114.2315 160.6120 56.29615 0.9983398    1