Is the test time the phase when the model's accuracy is calculated with test data set?

3

When papers talk about the "test time", does this mean the phase when the model is passed with new data instances to derive the accuracy of the test data set? Or is "test time" the phase when the model is fully trained and launched for real-world input data?

MJimitater

Posted 2020-05-13T08:55:40.750

Reputation: 133

Answers

2

If it is not defined otherwise, testing is the phase where the model is passed with new data instances to derive the score of the test set. It should not be confused with validation set.

A validation dataset is a sample of data held back from training your model that is used to give an estimate of model skill while tuning model’s hyperparameters during training. There are a lot of methods of validation with k-fold cross-validation being one of the most popular.

In k-fold cross-validation, the original training set is randomly partitioned into k equal sized subsamples. Of the k subsamples, a single subsample is retained as the validation data for validating the model, and the remaining k − 1 subsamples are used as training data. The cross-validation process is then repeated k times, with each of the k subsamples used exactly once as the validation data for the each epoch. The k results can then be averaged to produce a single estimation. The advantage of this method over repeated random sub-sampling is that all observations are used for both training and validation, and each observation is used for validation exactly once.

The validation dataset is different from the test set that is also held back from the training of the model, but is instead used to give an unbiased estimate of the skill of the final tuned model when comparing or selecting between final models.

ddaedalus

Posted 2020-05-13T08:55:40.750

Reputation: 440