Avoid iterations while calculating average model accuracy

3

I am fitting a model in R. My current procedure is:

  • use the createFolds function to split the data set into k folds
  • loop through the folds, repeating the following on each iteration:
    • train the model on k-1 folds
    • predict the outcomes for the i-th fold
    • calculate prediction accuracy
  • average the accuracy
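
For reference, the manual procedure listed above can be sketched roughly as follows. This is only an illustration under assumptions: the built-in iris data and MASS::lda stand in for the actual data set and model, which the question does not specify, and the variable names are hypothetical.

```r
library(caret)  # provides createFolds()
library(MASS)   # provides lda(), used here only as a stand-in model

set.seed(42)    # make the fold assignment reproducible
data(iris)

# Each list element holds the row indices of one held-out fold
folds <- createFolds(iris$Species, k = 10)

# For every fold: train on the other k-1 folds, predict the held-out fold,
# and record the share of correct predictions
fold_accuracy <- sapply(folds, function(test_idx) {
  fit  <- lda(Species ~ ., data = iris[-test_idx, ])
  pred <- predict(fit, newdata = iris[test_idx, ])$class
  mean(pred == iris$Species[test_idx])
})

# Average the per-fold accuracies
mean_accuracy <- mean(fold_accuracy)
mean_accuracy
```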

Does R have a function that creates the folds itself, repeats the model fitting and prediction on each fold, and returns the average accuracy?

IharS

Posted 2014-08-06T09:03:20.857

Reputation: 4 894

Answers

4

Yes, you can do all of this with the caret package (http://caret.r-forge.r-project.org/training.html) in R. For example:

library(caret)

## `training` is assumed to be a data frame with a factor outcome column `Class`
fitControl <- trainControl(## 10-fold CV
                           method = "repeatedcv",
                           number = 10,
                           ## repeated ten times
                           repeats = 10)

gbmFit1 <- train(Class ~ ., data = training,
                 method = "gbm",
                 trControl = fitControl,
                 ## This last option is actually one
                 ## for gbm() that passes through
                 verbose = FALSE)
gbmFit1

which will give the output

Stochastic Gradient Boosting 

157 samples
 60 predictors
  2 classes: 'M', 'R' 

No pre-processing
Resampling: Cross-Validated (10 fold, repeated 10 times) 

Summary of sample sizes: 142, 142, 140, 142, 142, 141, ... 

Resampling results across tuning parameters:

  interaction.depth  n.trees  Accuracy  Kappa  Accuracy SD  Kappa SD
  1                  50       0.8       0.5    0.1          0.2     
  1                  100      0.8       0.6    0.1          0.2     
  1                  200      0.8       0.6    0.09         0.2     
  2                  50       0.8       0.6    0.1          0.2     
  2                  100      0.8       0.6    0.09         0.2     
  2                  200      0.8       0.6    0.1          0.2     
  3                  50       0.8       0.6    0.09         0.2     
  3                  100      0.8       0.6    0.09         0.2     
  3                  200      0.8       0.6    0.08         0.2     

Tuning parameter 'shrinkage' was held constant at a value of 0.1
Accuracy was used to select the optimal model using  the largest value.
The final values used for the model were n.trees = 150, interaction.depth = 3     
and shrinkage = 0.1.

caret offers many other options as well, so it should be able to suit your needs.
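
If the averaged accuracy is needed as a number rather than as printed output, it can be read from the object that train() returns. The following is a minimal sketch under assumptions: it uses the built-in iris data and method = "lda" (which has no tuning parameters) instead of the gbm example, and plain 10-fold CV instead of repeated CV, purely to keep it short.

```r
library(caret)

set.seed(42)
data(iris)

# Plain 10-fold cross-validation
ctrl <- trainControl(method = "cv", number = 10)

# method = "lda" is a stand-in model chosen for speed; it needs the MASS package
fit <- train(Species ~ ., data = iris,
             method = "lda",
             trControl = ctrl)

# One row per held-out fold for the selected model
per_fold <- fit$resample$Accuracy

# Accuracy averaged over the folds, as reported in the printed summary
avg_accuracy <- fit$results$Accuracy

per_fold
avg_accuracy
```

With a tuning grid, fit$results has one row per parameter combination, so the row matching fit$bestTune would be the one of interest.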

mike1886

Posted 2014-08-06T09:03:20.857

Reputation: 915

Thanks a lot. As I see it, it's all about learning the caret package. – IharS – 2014-08-06T13:14:47.953