Data Mining Algorithms In R/Packages/RWeka/Weka filters
Description
R interfaces to Weka filters.
Usage
Normalize(formula, data, subset, na.action, control = NULL)
Discretize(formula, data, subset, na.action, control = NULL)
Arguments
formula, a symbolic description of a model. Note that for unsupervised filters the response can be omitted.
data, an optional data frame containing the variables in the model.
subset, an optional vector specifying a subset of observations to be used in the fitting process.
na.action, a function which indicates what should happen when the data contain NAs.
control, an object of class Weka_control, or a character vector of control options, or NULL (default).
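As a minimal sketch of the control argument: the options a filter accepts can be listed with WOW() and then passed through Weka_control(). The sketch assumes RWeka and a working Java installation. The K switch is assumed here to select Kononenko's MDL criterion in Weka's supervised Discretize filter; confirm it against the WOW() output for the installed Weka version.

```r
library(RWeka)

# Load the weather data shipped with RWeka.
w <- read.arff(system.file("arff", "weather.arff", package = "RWeka"))

# List the command-line options the underlying Weka filter accepts.
WOW("Discretize")

# Pass an option via Weka_control(); a logical TRUE becomes the bare switch "-K"
# (assumed to select Kononenko's MDL criterion; check the WOW() listing).
d_k <- Discretize(play ~ ., data = w, control = Weka_control(K = TRUE))
str(d_k)
```

A character vector such as control = "-K" should be equivalent, since control also accepts a character vector of options.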
Details
Normalize implements an unsupervised filter that normalizes all instances of a dataset to have a given norm. Only numeric values are considered, and the class attribute is ignored.
Discretize implements a supervised instance filter that discretizes a range of numeric attributes in the dataset into nominal attributes. By default, discretization uses Fayyad & Irani’s MDL method.
Note that these methods ignore nominal attributes, i.e., variables of class factor.
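One way to check this behaviour is to compare column classes before and after filtering. The following is a sketch only, assuming RWeka with a working Java installation and the weather data bundled with the package; num_cols is a helper name introduced here for illustration.

```r
library(RWeka)

w <- read.arff(system.file("arff", "weather.arff", package = "RWeka"))

# Names of the numeric columns in the original data.
num_cols <- names(w)[sapply(w, is.numeric)]

# Normalize: numeric columns stay numeric but are rescaled.
m1 <- Normalize(~ ., data = w)
summary(m1[num_cols])

# Discretize: the same columns are expected to come back as factors
# (nominal attributes); columns that were factors already are left alone.
m2 <- Discretize(play ~ ., data = w)
sapply(m2[num_cols], class)
```

Columns are selected by name rather than by position because the column order of the result may differ from that of the input.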
Value
A data frame containing the filtered data.
Example
w <- read.arff(system.file("arff", "weather.arff", package = "RWeka"))
m1 <- Normalize(~ ., data = w)
m2 <- Discretize(play ~ ., data = w)