Data Mining Algorithms In R/Packages/RWeka/Weka classifiers functions

Description

R interfaces to Weka regression and classification function learners.

Usage

LinearRegression(formula, data, subset, na.action, control = Weka_control(), options = NULL)

Logistic(formula, data, subset, na.action, control = Weka_control(), options = NULL)

SMO(formula, data, subset, na.action, control = Weka_control(), options = NULL)

Arguments

formula, a symbolic description of the model to be fit.

data, an optional data frame containing the variables in the model.

subset, an optional vector specifying a subset of observations to be used in the fitting process.

na.action, a function which indicates what should happen when the data contain NAs.

control, an object of class Weka_control giving options to be passed to the Weka learner.

options, a named list of further options, or NULL (default).
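As a minimal sketch of how these arguments combine, the call below fits a linear regression on part of mtcars, drops incomplete rows, and passes one option through to Weka. The subset condition and the S = 1 setting are illustrative assumptions: S = 1 is understood to switch off Weka's attribute selection, and WOW("LinearRegression") lists the options the installed Weka version actually accepts.

```r
library(RWeka)

# Fit only on cars with more than four cylinders (illustrative subset).
fit <- LinearRegression(mpg ~ wt + hp,
                        data = mtcars,
                        subset = cyl > 4,
                        na.action = na.omit,
                        # Assumed Weka option: -S 1 (no attribute selection).
                        control = Weka_control(S = 1))
print(fit)

# Show the options that the installed Weka learner accepts.
print(WOW("LinearRegression"))
```

Each name given to Weka_control becomes a Weka command-line flag, so S = 1 is passed on as "-S 1".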

Details

There is a predict method for predicting from the fitted models, and a summary method based on evaluate_Weka_classifier.

LinearRegression builds suitable linear regression models, using the Akaike criterion for model selection. Logistic builds multinomial logistic regression models based on ridge estimation. SMO implements John C. Platt’s sequential minimal optimization algorithm for training a support vector classifier using polynomial or RBF kernels; multi-class problems are solved using pairwise classification.
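The predict and summary methods can be tried out along the following lines. This is a sketch rather than a recommended workflow: the three rows reused as "new" data are an arbitrary choice, and the summary here evaluates the model on its own training data, which gives an optimistic picture.

```r
library(RWeka)

fit <- Logistic(Species ~ ., data = iris)

# One row from each species, used here as stand-in "new" data.
new_rows <- iris[c(1, 51, 101), ]

# Predicted classes (the default) and class membership probabilities.
pred <- predict(fit, newdata = new_rows)
prob <- predict(fit, newdata = new_rows, type = "probability")
print(pred)
print(round(prob, 3))

# Evaluation on the training data, computed by evaluate_Weka_classifier().
print(summary(fit))
```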

The model formulae should only use the ‘+’ and ‘-’ operators to indicate the variables to be included or excluded, respectively. The options argument allows further customization. Currently, the options model and instances (or partial matches for these) are used: if set to TRUE, the model frame or the corresponding Weka instances, respectively, are included in the fitted model object, possibly speeding up subsequent computations on the object.
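For illustration, the following sketch excludes two variables with ‘-’ and asks for the model frame and the Weka instances to be kept. The choice of cyl and gear is arbitrary, and the names under which the extra components are stored are not assumed here; names(fit) simply shows what the fitted object carries.

```r
library(RWeka)

# Use every variable except cyl and gear as predictors.
fit <- LinearRegression(mpg ~ . - cyl - gear,
                        data = mtcars,
                        # Keep the model frame and the Weka instances in the fit.
                        options = list(model = TRUE, instances = TRUE))
print(fit)

# Inspect which components the fitted object now carries.
print(names(fit))
```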

Value

A list inheriting from classes Weka_functions and Weka_classifiers with components including:

classifier, a reference (of class jobjRef) to a Java object obtained by applying the Weka buildClassifier method to build the specified model using the given control options.

predictions, a numeric vector or factor with the model predictions for the training instances (the results of calling the Weka classifyInstance method for the built classifier and each instance).

call, the matched call.
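The components listed above can be inspected directly on a fitted object. The snippet below is a small sketch using SMO with its default settings; the cross-tabulation at the end is an added illustration, not part of the returned value.

```r
library(RWeka)

fit <- SMO(Species ~ ., data = iris)

# Class vector of the fitted object and of the underlying Java reference.
print(class(fit))
print(class(fit$classifier))

# Predictions for the training instances and the matched call.
print(head(fit$predictions))
print(fit$call)

# Compare training predictions with the observed classes.
print(table(predicted = fit$predictions, actual = iris$Species))
```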

Example

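   library(RWeka)
   ## Linear regression; compare with stepwise selection via lm():
   LinearRegression(mpg ~ ., data = mtcars)
   step(lm(mpg ~ ., data = mtcars), trace = 0)
   ## Linear regression with a factor predictor:
   LinearRegression(weight ~ feed, data = chickwts)
   ## Logistic regression; compare with glm():
   STATUS <- factor(infert$case, labels = c("control", "case"))
   Logistic(STATUS ~ spontaneous + induced, data = infert)
   glm(STATUS ~ spontaneous + induced, data = infert, family = binomial())
   ## SVM with an RBF kernel, giving the full Weka kernel class name:
   SMO(Species ~ ., data = iris, control = Weka_control(K = list("weka.classifiers.functions.supportVector.RBFKernel", G = 2)))
   ## The base name of the kernel class also works:
   SMO(Species ~ ., data = iris, control = Weka_control(K = list("RBFKernel", G = 2)))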