At which epoch should I stop training to avoid overfitting?



I'm working on an age estimation project, trying to classify a given face into a predefined age range. For that purpose I'm training a deep NN using the Keras library. The accuracy on the training and validation sets is shown in the graph below:


As you can see, the validation accuracy keeps rising, in smaller steps than the training accuracy. Should I stop training at epoch 280, where the training and validation accuracies have the same value, or should I continue training as long as the validation accuracy is rising, even though the training accuracy is also reaching overfitted values (e.g. 93%)?

Yiannis Ath

Posted 2018-05-29T09:33:48.877

Reputation: 103



As long as your validation accuracy increases, you should keep training. I would stop when the validation accuracy starts decreasing (this is known as early stopping). The general advice is to always keep the model that performs best on your validation set.

Although it is true that your model overfits a little after epoch 280, that is not necessarily a bad thing provided your validation accuracy is high. In general, most machine learning models will have higher training accuracy than validation accuracy, and this alone doesn't have to be bad.

In the general case, you expect your accuracy to behave in the following way.

[Image: typical training vs. validation accuracy curves, with the early-stopping point marked where validation accuracy begins to drop]

In your case, you're before the early-stopping epoch, so even though your training-set accuracy is higher than your validation-set accuracy, it is not necessarily an issue.

David Masip

Posted 2018-05-29T09:33:48.877

Reputation: 5 101


"Early stopping" is the concept that applies here. As Wikipedia says about early stopping:

In machine learning, early stopping is a form of regularization used to avoid overfitting when training a learner with an iterative method, such as gradient descent. Such methods update the learner so as to make it better fit the training data with each iteration. Up to a point, this improves the learner's performance on data outside of the training set. Past that point, however, improving the learner's fit to the training data comes at the expense of increased generalization error. Early stopping rules provide guidance as to how many iterations can be run before the learner begins to over-fit. Early stopping rules have been employed in many different machine learning methods, with varying amounts of theoretical foundation.

At epochs > 280 in your graph, the validation accuracy becomes lower than the training accuracy, and hence the model starts to overfit. To avoid overfitting, training further is not recommended. However, you may choose to train beyond the epoch where the training and validation accuracies match if the resulting validation accuracy is sufficient for the particular problem you are working on.

Divyanshu Shekhar

Posted 2018-05-29T09:33:48.877

Reputation: 499


Keep training until your validation accuracy saturates (or starts dropping). Since the accuracy increases slowly, try increasing your learning-rate parameter eta to force the network to converge faster to the optimum weights. Be aware, though, that if you increase it too much, training will become unstable.
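To see why eta matters, here is a minimal sketch (not from the answer above) of plain gradient descent on the toy function f(w) = w², whose minimum is at w = 0. The function name and the specific eta values are illustrative assumptions, but they show the trade-off: a larger eta converges faster, while a too-large eta diverges.

```python
# Illustrative sketch: gradient descent on f(w) = w^2 (minimum at w = 0)
# with different hypothetical learning rates eta.
def gradient_descent(eta, steps=50, w=1.0):
    for _ in range(steps):
        grad = 2 * w          # derivative of f(w) = w^2
        w = w - eta * grad    # gradient descent update
    return w

w_slow = gradient_descent(eta=0.01)  # small eta: still far from 0 after 50 steps
w_fast = gradient_descent(eta=0.4)   # larger eta: essentially at 0
w_bad  = gradient_descent(eta=1.1)   # too large: the iterates blow up

print(abs(w_slow), abs(w_fast), abs(w_bad))
```

The same trade-off applies when tuning the learning rate of a neural-network optimizer, just with a much less well-behaved loss surface.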


Posted 2018-05-29T09:33:48.877

Reputation: 3 275


If you're using keras or tensorflow.keras, early stopping is implemented by the EarlyStopping callback, and the waiting period is its patience parameter.

It equals the number of epochs with no validation accuracy improvement after which training ends. I usually set it to 2 or 3; 1 is usually too sensitive to noise.
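To make the mechanism concrete, here is a minimal sketch of the patience logic in plain Python rather than the actual Keras callback (the real EarlyStopping also offers options such as restore_best_weights). The function name and the accuracy history are hypothetical:

```python
# Sketch of the patience idea behind Keras's EarlyStopping
# (monitoring validation accuracy); not the actual callback code.
def early_stopping_epoch(val_accuracies, patience=3):
    """Return the 1-based epoch at which training would stop, i.e. after
    `patience` consecutive epochs without a new best validation accuracy;
    if that never happens, return the last epoch."""
    best = float('-inf')
    wait = 0
    for epoch, acc in enumerate(val_accuracies, start=1):
        if acc > best:
            best = acc
            wait = 0          # improvement: reset the counter
        else:
            wait += 1         # no improvement this epoch
            if wait >= patience:
                return epoch  # patience exhausted: stop here
    return len(val_accuracies)

# Hypothetical history: improves until 0.83, then plateaus.
history = [0.70, 0.75, 0.80, 0.83, 0.82, 0.83, 0.81]
print(early_stopping_epoch(history, patience=3))  # stops at epoch 7
```

With patience=1 the same history would stop at the first non-improving epoch (epoch 5), which is why a value of 1 tends to react to noise rather than a real plateau.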

Learning is a mess

Posted 2018-05-29T09:33:48.877

Reputation: 626


You should also look at training error vs. test error, rather than only training accuracy and test accuracy.

Ashok Kumar Jayaraman

Posted 2018-05-29T09:33:48.877

Reputation: 133

1 – isn't it the same? – Francesco Pegoraro – 2018-10-06T11:24:45.583