Data Mining Algorithms In R/Packages/gausspred/train pred gau

From Wikibooks, open books for an open world

Description

Training with Markov chain sampling, predicting for test cases, and evaluating performance with cross-validation

Usage

training_gau(G, features, response, prior_y = rep(1, G),
             p_tau_nu = c(alpha = 2, w = 0.5), p_tau_mu = c(alpha = 2, w = 0.5),
             p_tau_x = c(alpha = 2, w = 0.5), nos_super = 100, nos_trans = 5,
             ini_taus = rep(1, 3), cor = 0, p = ncol(features), cutoff = 1,
             min_qf = exp(-10), nos_lambda = 1000, stepsize_log_tau = -2, no_steps = 10)

predict_gau(features, out_tr, pred_range, thin)

crossvalid_gau(no_fold, k, G, features, response, lmd_trf, prior_y = rep(1, G),
               p_tau_nu = c(alpha = 2, w = 0.5), p_tau_mu = c(alpha = 2, w = 0.5),
               p_tau_x = c(alpha = 2, w = 0.5), nos_super = 100, nos_trans = 5,
               ini_taus = rep(1, 3), cor = 0, min_qf = exp(-10), nos_lambda = 1000,
               stepsize_log_tau = -2, no_steps = 10, pred_range, thin = 1)

Arguments

Arguments of training_gau and crossvalid_gau:

G, the number of groups, i.e., the number of possible values of the response.

features, the features, with the rows for the cases.

response, the response values.

prior_y, a vector of length G, specifying the Dirichlet prior distribution for the probabilities of the response.

p_tau_nu, p_tau_mu, p_tau_x, vectors of 2 numbers, specifying the Gamma prior distributions for the inverses of the variances of the distributions of ν, µ, and the features, respectively; the first number is the shape, the second is the rate.

nos_super, nos_trans: nos_super super Markov chain transitions are run, each consisting of nos_trans Markov chain iterations. Only the last state of each super transition is saved; this avoids saving the Markov chain state at every iteration.

cor, taking value 0 or 1, indicating whether bias-correction is to be applied.

p, the total number of features before selection. This number must be supplied by the user rather than inferred from other arguments.

cutoff, the cutoff on the F-statistic used to select features. This number must be supplied by the user rather than inferred from other arguments.

min_qf, the minimum value of "f" used to truncate the infinite summation in calculating the correction factor.

nos_lambda, the number of random draws of λ used in approximating the correction factor.

stepsize_log_tau, the step size of the Gaussian proposal used in sampling log τ_µ when bias-correction is applied.

no_steps, the number of Metropolis sampling iterations for log τ_µ.
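The feature-selection arguments p and cutoff refer to a screening step that ranks features by a one-way ANOVA F-statistic. The sketch below illustrates that idea in base R; it is not the package's own order_features code, and the helper name rank_by_fstat is hypothetical.

```r
# Illustrative sketch of F-statistic feature ranking (NOT the package's
# exact order_features implementation; rank_by_fstat is a made-up helper).
rank_by_fstat <- function(X, y) {
  fstat <- apply(X, 2, function(x) {
    # one-way ANOVA F-statistic of feature x against the group labels y
    summary(aov(x ~ factor(y)))[[1]][["F value"]][1]
  })
  ord <- order(fstat, decreasing = TRUE)
  list(vars = ord, fstat = fstat[ord])
}

set.seed(1)
X <- matrix(rnorm(60 * 5), 60, 5)
X[, 2] <- X[, 2] + rep(c(0, 3), each = 30)  # make feature 2 strongly informative
y <- rep(1:2, each = 30)
sel <- rank_by_fstat(X, y)
sel$vars[1]   # the informative feature should rank first
```

Selecting the top k ranked features and taking the k-th ranked F value as cutoff mirrors how the Example section below derives vars and cutoff.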

Arguments only of predict_gau:

out_tr, output of Markov chain sampling returned by training_gau.

pred_range, the range of super Markov chain transitions used to predict the response values of test cases.

thin, only 1 of every thin samples is used in prediction, chosen evenly.
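How pred_range and thin interact can be made concrete with a small sketch. This assumes evenly spaced selection across the saved super transitions, which matches the description above but is illustrative rather than the package's internal code.

```r
# Sketch of even thinning over a range of saved super transitions
# (assumed behaviour of pred_range and thin; illustrative only).
pred_range <- c(50, 400)
thin <- 5
used <- seq(pred_range[1], pred_range[2], by = thin)
length(used)   # number of saved samples actually used in prediction
```

With thin = 1 every saved super transition in the range contributes to the predictive probabilities; larger values of thin reduce autocorrelation between the samples used, at the cost of using fewer of them.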


Arguments only of crossvalid_gau:

no_fold, the number of subsets into which the data are divided for the cross-validation assessment.

k, the number of features selected.

lmd_trf, the value of λ used to estimate the covariance matrix, which is used to transform the data via Choleski decomposition. The larger this number, the closer the estimated covariance matrix is to diagonal.
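The role of lmd_trf can be illustrated with one common shrinkage form: blend the sample covariance with its diagonal, then decorrelate the data using the Choleski factor. The exact shrinkage formula used inside crossvalid_gau may differ; the function shrink_and_whiten below is a hypothetical sketch of the idea only.

```r
# Hedged sketch of the idea behind lmd_trf: shrink the sample covariance
# toward its diagonal, then decorrelate the data with a Choleski factor.
# One common shrinkage form, for illustration; not the package's code.
shrink_and_whiten <- function(X, lambda) {
  S <- cov(X)
  # larger lambda -> estimated covariance closer to diagonal
  S_shrunk <- (1 - lambda) * S + lambda * diag(diag(S))
  R <- chol(S_shrunk)    # S_shrunk = t(R) %*% R, R upper triangular
  X %*% solve(R)         # transformed data with near-identity covariance
}

set.seed(2)
X <- matrix(rnorm(200 * 3), 200, 3)
X[, 2] <- X[, 2] + 0.8 * X[, 1]   # introduce correlation between features
Z <- shrink_and_whiten(X, lambda = 0.1)
round(cov(Z), 2)                  # close to the identity matrix
```

Shrinking toward the diagonal keeps the estimated matrix well-conditioned (and hence Choleski-decomposable) even when the number of features is large relative to the number of cases.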


Value

The function training_gau returns the following values:

mu, an array of three dimensions, storing Markov chain samples of µ: the first dimension indexes features, the second indexes groups, and the third indexes Markov chain super transitions.

nu, a matrix storing Markov chain samples of ν, with rows for features and columns for Markov chain iterations.

tau_x, a vector storing Markov chain samples of τ_x.

tau_mu, a vector storing Markov chain samples of τ_µ.

tau_nu, a vector storing Markov chain samples of τ_ν.

freq_y, the posterior mean of the probabilities for response.

Both predict_gau and crossvalid_gau return a matrix of the predictive probabilities, with rows for cases, columns for different groups (different values of response).
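Such a predictive-probability matrix can be reduced to hard class labels by taking the most probable group in each row. This is a generic post-processing step, not part of the package itself:

```r
# Turn a cases-by-groups predictive-probability matrix into class labels
# by picking the most probable group per row (generic step, not package code).
P <- matrix(c(0.7, 0.2, 0.1,
              0.1, 0.3, 0.6), nrow = 2, byrow = TRUE)  # 2 cases, 3 groups
pred_labels <- apply(P, 1, which.max)
pred_labels   # group 1 for case 1, group 3 for case 2
```

Keeping the full probability matrix, however, is what allows the loss-based evaluation with comp_loss and comp_amlp shown in the Example below.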

Example

n <- 200 + 400
p <- 400
G <- 6
p_tau_x <- c(4, 1)
p_tau_mu <- c(1.5, 0.01)
p_tau_nu <- c(1.5, 0.01)
tau_mu <- 100

data <- gen_bayesgau(n, p, G, tau_nu = 100, tau_mu, p_tau_x)

ix_tr <- 1:200

i_sel <- order_features(data$X[ix_tr, ], data$y[ix_tr])
vars <- i_sel$vars[1:10]
cutoff <- i_sel$fstat[10]

out_tr_cor <- training_gau(G = G, data$X[ix_tr, vars, drop = FALSE], data$y[ix_tr],
                           prior_y = rep(1, G), p_tau_nu, p_tau_mu, p_tau_x,
                           nos_super = 400, nos_trans = 1, ini_taus = rep(1, 3),
                           cor = 1, p = p, cutoff = cutoff, min_qf = exp(-10),
                           nos_lambda = 100, stepsize_log_tau = 0.5, no_steps = 5)

out_pred_cor <- predict_gau(data$X[-ix_tr, vars, drop = FALSE], out_tr_cor,
                            pred_range = c(50, 400), thin = 1)

Mlosser <- matrix(1, G, G)
diag(Mlosser) <- 0

Mloss <- matrix(exp(rnorm(G^2, 0, 2)), G, G)
diag(Mloss) <- 0

amlp_cor <- comp_amlp(out_pred_cor,data$y[-ix_tr])

er_cor <- comp_loss(out_pred_cor,data$y[-ix_tr],Mlosser)

l_cor <- comp_loss(out_pred_cor,data$y[-ix_tr],Mloss)