Data Mining Algorithms In R/Packages/RWeka/Evaluate Weka Classifier

Description

Compute model performance statistics for a fitted Weka classifier.

Usage

evaluate_Weka_classifier(object, newdata = NULL, cost = NULL, numFolds = 0, complexity = FALSE, class = FALSE, seed = NULL, ...)

Arguments

object, a Weka_classifier object.

newdata, an optional data frame in which to look for variables with which to evaluate. If omitted or NULL, the training instances are used.

cost, a square matrix of (mis)classification costs.

numFolds, the number of folds to use in cross-validation.

complexity, option to include entropy-based statistics.

class, option to include class statistics.

seed, optional seed for cross-validation.

..., further arguments passed to other methods (see Details).
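The numFolds and seed arguments together control cross-validation. As a rough illustration of why a seed matters, the base R sketch below assigns instances to folds at random. It is a conceptual sketch only, not Weka's own fold-assignment procedure (which may, for example, stratify by class), and the helper name assign_folds is hypothetical.

```r
# Hypothetical helper: randomly assign n instances to k folds of near-equal size.
# This illustrates the idea only; Weka performs its own internal fold assignment.
assign_folds <- function(n, k, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)        # fix the random stream for reproducibility
  sample(rep(seq_len(k), length.out = n))   # shuffle fold labels 1..k across n instances
}

folds_a <- assign_folds(14, 10, seed = 123)
folds_b <- assign_folds(14, 10, seed = 123)

identical(folds_a, folds_b)   # same seed, same assignment
table(folds_a)                # fold sizes differ by at most one
```

Without a fixed seed, repeated runs would generally split the data differently, so cross-validated statistics would vary from run to run.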

Details

The function computes and extracts a non-redundant set of performance statistics that is suitable for model interpretation. By default the statistics are computed on the training data.

The ... argument currently supports only the logical variable normalize, which tells Weka to normalize the cost matrix so that the cost of a correct classification is zero.
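To make the effect of normalization concrete, here is a minimal base R sketch. It assumes that normalizing means subtracting each column's diagonal entry (the cost of a correct classification) from every entry in that column; this is an assumption for illustration, and the helper normalize_cost is hypothetical, not part of RWeka.

```r
# Hypothetical helper: shift each column so that its diagonal entry becomes zero.
# Assumption: normalization subtracts the cost of the correct classification
# from every entry in the same column; consult the Weka documentation for specifics.
normalize_cost <- function(cost) {
  stopifnot(is.matrix(cost), nrow(cost) == ncol(cost))
  sweep(cost, 2, diag(cost), FUN = "-")   # subtract diag(cost)[j] from column j
}

cost <- matrix(c(1, 3, 2, 1), ncol = 2)   # matrix() fills column by column
normalized <- normalize_cost(cost)

diag(normalized)                          # all zeros after normalization
```

Note that matrix() fills its values column by column, so the order of the values passed to c() determines which off-diagonal cost lands in which cell.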

Value

An object of class Weka_classifier_evaluation, a list of the following components:

string, a character string: the concatenation of the string representations of the performance statistics.

details, a vector of base statistics, e.g., the percentage of instances correctly classified.

detailsComplexity, a vector of entropy-based statistics.

detailsClass, a matrix of class statistics, e.g., the true positive rate, for each level of the response variable.

confusionMatrix, a table: the cross-classification of true and predicted classes.
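To show what kind of information these components carry, the following base R sketch builds a confusion matrix from made-up labels and derives two of the statistics mentioned above. The data and variable names are purely illustrative; the sketch does not call RWeka and does not reproduce Weka's exact output format.

```r
# Made-up true and predicted labels for a two-class problem (illustrative data only).
lv <- c("yes", "no")
actual    <- factor(c("yes", "yes", "no", "no",  "yes", "no", "yes"), levels = lv)
predicted <- factor(c("yes", "no",  "no", "yes", "yes", "no", "yes"), levels = lv)

# Cross-classification of true (rows) and predicted (columns) classes.
cm <- table(actual = actual, predicted = predicted)

# Percentage of instances correctly classified: diagonal count over total count.
pct_correct <- 100 * sum(diag(cm)) / sum(cm)

# True positive rate per class: correct predictions over the actual class count.
tp_rate <- diag(cm) / rowSums(cm)

cm
pct_correct
tp_rate
```

The diagonal of the table counts correct predictions, so both statistics are simple functions of the confusion matrix.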

Example

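   library(RWeka)
   # Load the weather data shipped with RWeka and fit a J48 decision tree.
   w <- read.arff(system.file("arff", "weather.nominal.arff", package = "RWeka"))
   m <- J48(play ~ ., data = w)
   # 10-fold cross-validation with a cost matrix, entropy-based and per-class statistics.
   e <- evaluate_Weka_classifier(m, cost = matrix(c(0, 2, 1, 0), ncol = 2), numFolds = 10, complexity = TRUE, seed = 123, class = TRUE)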