Calculates the components to predict all the dependent variables.
Arguments
- formula
an object of class MultivariateFormula (or one that can be coerced to that class): a symbolic description of the model to be fitted.
- data
a data frame to be modeled.
- family
a character vector with one entry per dependent variable; allowed values are "bernoulli", "binomial", "poisson" and "gaussian".
- K
number of components, default is one.
- size
describes the number of trials for the binomial dependent variables. A (number of statistical units * number of binomial dependent variables) matrix is expected.
- weights
weights on individuals (not available for now)
- offset
used for the Poisson dependent variables. A vector or a matrix of size (number of observations * number of Poisson dependent variables) is expected.
- subset
an optional vector specifying a subset of observations to be used in the fitting process.
- na.action
a function which indicates what should happen when the data contain NAs. The default is set to na.omit.
- crit
a list of two elements, maxit and tol, giving respectively the maximum number of iterations and the convergence tolerance for the Fisher scoring algorithm. The defaults are 50 and 10e-6 respectively.
- method
structural relevance criterion. Object of class "method.SCGLR" built by methodSR for Structural Relevance.
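The shapes expected for family, size, offset and crit can be sketched with base R alone. This is a minimal illustration on made-up dimensions (5 observations, two binomial responses and one Poisson response); the variable names and values are placeholders chosen for the example, not taken from SCGLR.

```r
# Made-up dimensions, for illustration only
n_obs <- 5

# One family entry per dependent variable
family <- c("binomial", "binomial", "poisson")

# size: (number of statistical units * number of binomial dependent variables)
size <- matrix(10, nrow = n_obs, ncol = sum(family == "binomial"))

# offset: (number of observations * number of Poisson dependent variables)
offset <- matrix(1, nrow = n_obs, ncol = sum(family == "poisson"))

# crit: the documented defaults, written out explicitly
crit <- list(maxit = 50, tol = 10e-6)
```

These objects would then be passed as the family, size, offset and crit arguments, together with a formula and a data frame describing the same three responses.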
Value
an object of the SCGLR class.
The function summary (i.e., summary.SCGLR) can be used to obtain or print a summary of the results.
An object of class "SCGLR" is a list containing the following components:
- u
matrix of size (number of regressors * number of components), contains the component-loadings, i.e. the coefficients of the regressors in the linear combination giving each component.
- comp
matrix of size (number of statistical units * number of components) having the components as column vectors.
- compr
matrix of size (number of statistical units * number of components) having the standardized components as column vectors.
- gamma
list with one element per dependent variable. Each element is a matrix of coefficients, standard errors, z-values and p-values.
- beta
matrix of size ((number of regressors + 1 (intercept)) * number of dependent variables), contains the coefficients of the regression on the original regressors X.
- lin.pred
data.frame of size (number of statistical units * number of dependent variables), the fitted linear predictor.
- xFactors
data.frame containing the nominal regressors.
- xNumeric
data.frame containing the quantitative regressors.
- inertia
matrix of size (number of components * 2), contains the percentage and cumulative percentage of the overall regressors' variance, captured by each component.
- logLik
vector of length (number of dependent variables), gives the log-likelihood of the model of each \(y_k\)'s GLM on the components.
- deviance.null
vector of length (number of dependent variables), gives the deviance of the null model of each \(y_k\)'s GLM on the components.
- deviance.residual
vector of length (number of dependent variables), gives the deviance of the model of each \(y_k\)'s GLM on the components.
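The dimensions listed above are tied together by three quantities: the number of statistical units, the number of components and the number of dependent variables. The sketch below reads them off an object and checks that the documented shapes agree. The helper scglr_shapes is hypothetical (it is not part of SCGLR), and the mock object is filled with zeros purely so the code runs without a fitted model.

```r
# Hypothetical helper, not part of SCGLR: it relies only on the component
# names and shapes documented above.
scglr_shapes <- function(fit) {
  shapes <- list(
    n_units      = nrow(fit$comp),
    n_components = ncol(fit$comp),
    n_responses  = length(fit$gamma)
  )
  # Cross-check the documented dimensions against each other
  stopifnot(
    ncol(fit$u) == shapes$n_components,
    nrow(fit$inertia) == shapes$n_components,
    ncol(fit$beta) == shapes$n_responses,
    ncol(fit$lin.pred) == shapes$n_responses,
    nrow(fit$lin.pred) == shapes$n_units
  )
  shapes
}

# Mock object standing in for a real fit: 5 units, 3 regressors,
# 2 components, 2 dependent variables. All values are placeholders,
# and the row count of each gamma matrix is arbitrary.
mock <- list(
  u        = matrix(0, 3, 2),
  comp     = matrix(0, 5, 2),
  gamma    = list(matrix(0, 3, 4), matrix(0, 3, 4)),
  beta     = matrix(0, 4, 2),
  lin.pred = as.data.frame(matrix(0, 5, 2)),
  inertia  = matrix(0, 2, 2)
)
shapes <- scglr_shapes(mock)
```

On a real fit, the same call would be scglr_shapes(genus.scglr), assuming the object follows the layout described in this section.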
References
Bry X., Trottier C., Verron T. and Mortier F. (2013) Supervised Component Generalized Linear Regression using a PLS-extension of the Fisher scoring algorithm. Journal of Multivariate Analysis, 119, 47-60.
Examples
if (FALSE) { # \dontrun{
library(SCGLR)
# load sample data
data(genus)
# get variable names from dataset
n <- names(genus)
ny <- n[grep("^gen",n)] # Y <- names that begin with "gen"
nx <- n[-grep("^gen",n)] # X <- remaining names
# remove "geology" and "surface" from nx,
# as surface is the offset and we want to use geology as an additional covariate
nx <- nx[!nx %in% c("geology","surface")]
# build multivariate formula
# we also add "lat*lon" as computed covariate
form <- multivariateFormula(ny,c(nx,"I(lat*lon)"),A=c("geology"))
# define family
fam <- rep("poisson",length(ny))
genus.scglr <- scglr(formula=form,data = genus,family=fam, K=4,
offset=genus$surface)
summary(genus.scglr)
} # }