How to improve loss and avoid overfitting



I'm trying to build a 2-class image classifier using the architecture suggested in the first part of this blog.

My dataset has 1500 images of class1 and 500 images of class2. I created 4 copies of class2 to make the number of images the same in both classes.
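For comparison, an alternative to duplicating the minority class is to weight each class inversely to its frequency in the loss. This is not something done in the post; it is only a minimal sketch. The helper name `balanced_class_weights` is hypothetical, and the counts are the ones given above.

```python
def balanced_class_weights(counts):
    """Return weights inversely proportional to class frequency.

    counts: dict mapping class label -> number of samples.
    Uses the common heuristic n_samples / (n_classes * n_samples_in_class).
    """
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


# Counts from the question: 1500 images of class1 (label 0), 500 of class2 (label 1).
weights = balanced_class_weights({0: 1500, 1: 500})
print(weights)
```

In Keras, a dictionary of this shape can be passed as the `class_weight` argument of `model.fit`, so the original 2000 images can be used without copies.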

I also use ImageDataGenerator to augment images

datagen = ImageDataGenerator(

I have tried the AdaDelta, RMSProp, SGD, Adam, and AdaGrad optimizers, and tried adding/removing Conv2D and Dense layers. I also tried BatchNormalization and Dropout. The results come out almost the same:

For the first few epochs (about 20), training and validation errors keep decreasing until log loss reaches about 0.4 (the best I have got so far); after that, the model starts to overfit and validation loss keeps increasing.
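The pattern described here (validation loss falls, then rises) is what early stopping is meant to handle. The following is only a sketch of that logic in plain Python: the function name `best_epoch_with_patience` is hypothetical, and the loss values are made up to mimic the curve described, not taken from the actual run.

```python
def best_epoch_with_patience(val_losses, patience=5):
    """Return (stop_epoch, best_epoch), both 0-indexed.

    Training "stops" once the validation loss has not improved for
    `patience` consecutive epochs; best_epoch is where it was lowest.
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    # Never ran out of patience: stop at the last epoch.
    return len(val_losses) - 1, best_epoch


# Made-up validation losses: falling to about 0.4, then rising again.
history = [0.69, 0.60, 0.52, 0.45, 0.40, 0.42, 0.44, 0.47, 0.51, 0.55]
stop_epoch, best_epoch = best_epoch_with_patience(history, patience=3)
print(stop_epoch, best_epoch)
```

In Keras the same idea is available as the `EarlyStopping` callback (with `monitor`, `patience` and `restore_best_weights` arguments), so the weights from the best epoch can be kept instead of the overfitted ones.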

I know I can prevent overfitting by reducing the network complexity and adding dropout, but that reduces the training accuracy too.

Please suggest some tips to improve the accuracy and avoid overfitting.

Amit Khanna

Posted 2018-04-09T19:20:57.697

Reputation: 161

Have you tried tuning L1 and L2 regularization? – Gaius – 2018-04-09T20:07:50.550

Didn't try that. Should I try L1/L2 with or without dropout and batch normalization? – Amit Khanna – 2018-04-09T20:32:52.453



I am not familiar with the software you are using, but keep in mind: you EXPECT accuracy to drop if you reduce over-fitting. It is not a bad thing. Over-fitting is essentially "fake accuracy".

Some good general approaches to avoid over-fitting, though: use cross-validation, normalize your features, increase the size of the data-set, and don't just increase your data-set by copying data.
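Cross-validation does not reduce over-fitting by itself, but it gives a more reliable estimate of how a given configuration generalizes, which helps when choosing between them. A minimal sketch of k-fold index splitting follows; the function name `k_fold_indices` and the values `k=5` and `seed=0` are assumptions for illustration. The split should be done on the original images, before any copying or augmentation, so that copies of one image do not end up in both the training and validation folds.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs; each sample is validated exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at position i.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, val_idx


# 2000 = 1500 + 500 original images from the question.
splits = list(k_fold_indices(2000, k=5))
for train_idx, val_idx in splits:
    print(len(train_idx), len(val_idx))
```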

Julian Kurz

Reputation: 61

Just to be clear, you expect training accuracy to decrease when you reduce overfitting. Test accuracy will increase, which is, of course, the whole point. – Ray – 2018-04-11T22:25:59.700

Isn't cross-validation for assessing a model? How could it improve the performance of the model? – Scott – 2020-05-08T07:53:10.423


There are a few things you can do to reduce over-fitting.

  1. Use Dropout, increase its rate, and increase the number of training epochs
  2. Increase the dataset by using data augmentation
  3. Tweak your CNN model by adding more training parameters. Reduce the fully connected layers.
  4. Change the whole Model
  5. Use Transfer Learning (Pre-Trained Models)
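To make the first item more concrete, here is a sketch of what a dropout layer does to its inputs, written in plain Python rather than with any framework. The function name `dropout` and the example rate of 0.5 are assumptions; this is the usual "inverted dropout" formulation, not code from a particular library.

```python
import random


def dropout(activations, rate, rng, training=True):
    """Inverted dropout on a list of floats.

    During training each value is zeroed with probability `rate` and the
    survivors are scaled by 1 / (1 - rate); at inference nothing changes.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)
out = dropout([1.0] * 10, rate=0.5, rng=rng)
print(out)  # a mix of 0.0 (dropped) and 2.0 (kept and rescaled)
```

A higher rate drops more units per step, which is why training usually needs more epochs to converge, as the first item suggests.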

Syed Nauyan Rashid

Reputation: 461

Can you explain what adding more training parameters means? – Anshuman Kumar – 2020-04-23T07:52:37.330

Can you elaborate on 4) Change the whole Model? Isn't any model good enough to approximate a function, by the Universal Approximation theorem? – Prasanjit Rath – 2021-01-11T12:11:33.417


You can use dropout, which helps keep the model from over-training. When using a CNN, the data-set plays an important role: the more data there is, the more features the model can learn from it. You can divide your data-set into 3 parts: training, testing and validation.
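The three-way split could be sketched as follows. The function name `three_way_split` and the 70/15/15 proportions are assumptions chosen for illustration, not values from the question.

```python
import random


def three_way_split(items, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle items and split them into (train, val, test) lists."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_test = int(len(items) * test_frac)
    n_val = int(len(items) * val_frac)
    test = items[:n_test]
    val = items[n_test:n_test + n_val]
    train = items[n_test + n_val:]
    return train, val, test


# Stand-in for 2000 image file names or indices.
train, val, test = three_way_split(range(2000))
print(len(train), len(val), len(test))
```

The validation part is used to decide when to stop training and which settings to keep; the test part is only looked at once, at the end.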

Rohit Jere

Reputation: 61


Here are a few things you can try to reduce overfitting:

  1. Use batch normalization
  2. Add dropout layers
  3. Increase the dataset
  4. Use a batch size as large as possible (I think you are using 32; go with 64)
  5. To generate the image dataset, use flow from data
  6. Use L1 and L2 regularizers in the conv layers
  7. If the dataset is big, increase the number of layers in the neural network
  8. Use callbacks, e.g. tf.keras.callbacks.ReduceLROnPlateau

Hope this may help. Also plot the training history graph to get a better understanding of your model.
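To show what a reduce-on-plateau callback does, here is a simplified re-implementation of the idea in plain Python. It is not the Keras source code; the function name `reduce_lr_on_plateau`, the default values, and the loss values are all made up for illustration.

```python
def reduce_lr_on_plateau(val_losses, lr=0.01, factor=0.5, patience=2, min_lr=1e-6):
    """Return the learning rate used at each epoch.

    The rate is multiplied by `factor` whenever the validation loss has
    failed to improve for `patience` epochs, never going below `min_lr`.
    """
    best = float("inf")
    wait = 0
    schedule = []
    for loss in val_losses:
        schedule.append(lr)
        if loss < best:
            best, wait = loss, 0
        else:
            wait += 1
            if wait >= patience:
                lr = max(lr * factor, min_lr)
                wait = 0
    return schedule


# Made-up validation losses with a short plateau after the third epoch.
schedule = reduce_lr_on_plateau([0.60, 0.50, 0.45, 0.46, 0.47, 0.44, 0.45, 0.46])
print(schedule)
```

The actual `tf.keras.callbacks.ReduceLROnPlateau` takes similar `monitor`, `factor`, `patience` and `min_lr` arguments, plus a few more such as `cooldown`.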

Anant Gupta

Reputation: 13