Validation loss is not decreasing



I am trying to train an LSTM model. Is this model suffering from overfitting?

Here is the training and validation loss graph:



Posted 2018-12-27T08:23:06.633

Reputation: 421

All the other answers assume this is an overfitting problem. While that may well be true, it could also be a different problem: maybe your neural network is not learning at all. It's not possible to conclude from just one chart. Check the model outputs to see whether it has overfit; if it has not, consider this either a bug, an underfitting architecture, or a data problem, and work from that point onward. – Ammar Ameerdeen – 2020-06-16T03:17:13.800



The model is overfitting right from epoch 10: the validation loss is increasing while the training loss is decreasing.

Dealing with such a Model:

  1. Data Preprocessing: standardize and normalize the data.
  2. Model Complexity: check whether the model is too complex. Add dropout, or reduce the number of layers or the number of neurons in each layer.
  3. Learning Rate and Decay Rate: reduce the learning rate; a good starting value is usually between 0.0005 and 0.001. Also consider a decay rate of 1e-6.

There are many other ways to reduce overfitting as well; assuming you are using Keras, see this link.
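A rough sketch of points 2 and 3 in Keras (the layer sizes, dropout rates, and input shape below are illustrative assumptions, not tuned values; older Keras versions accepted a `decay=1e-6` optimizer argument directly, while current versions use a learning-rate schedule instead):

```python
import tensorflow as tf

# Illustrative LSTM with dropout (point 2); sizes and rates are assumptions.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(100, 8)),  # (timesteps, features) -- placeholder shape
    tf.keras.layers.LSTM(64, dropout=0.2, recurrent_dropout=0.2),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(1),
])

# Reduced learning rate with decay (point 3), expressed as a schedule.
lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=0.001, decay_steps=10000, decay_rate=0.96)
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=lr_schedule),
              loss="mse")
```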



Reputation: 455

How can we play with learning and decay rates in a Keras implementation of LSTM? – user145959 – 2019-03-08T16:33:35.447


Yes, this is an overfitting problem, since your curve shows a point of inflection. That is a sign of training for too many epochs. In this case, training could be stopped at the point of inflection, or the number of training examples could be increased.

Overfitting can also be caused by a model that is too deep for the amount of training data. In that case, you'll observe the training and validation losses diverging very early.
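In Keras, stopping at the point of inflection can be automated with the EarlyStopping callback (the patience value here is an illustrative choice, and x_train/x_val etc. are placeholder names):

```python
import tensorflow as tf

# Stop training once the validation loss stops improving, and roll back
# to the weights from the best epoch seen so far.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",          # watch the validation curve
    patience=5,                  # tolerate 5 epochs without improvement
    restore_best_weights=True)

# Passed to fit() alongside the rest of the training setup, e.g.:
#, y_train, validation_data=(x_val, y_val),
#           epochs=100, callbacks=[early_stop])
```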

Mohit Banerjee


Reputation: 301


Another possible cause of overfitting is improper data augmentation. If you're augmenting then make sure it's really doing what you expect.

I had a similar problem, and it turned out to be caused by a bug in my TensorFlow data pipeline, where I was augmenting before caching:

import tensorflow as tf

def get_dataset(inputfile, batchsize):
    # Load the data into a TensorFlow dataset.
    signals, labels = read_data_from_file(inputfile)
    dataset =, labels))

    # Augment the data by dynamically tweaking each training sample on the fly.
    dataset =
        map_func=lambda signals, labels: (
            tuple(tf.py_function(func=augment, inp=[signals], Tout=[tf.float32])),
            labels))

    # Oops! Should have called cache() before augmenting.
    dataset = dataset.cache()
    dataset = ...  # Shuffle, repeat, batch, etc.
    return dataset

training_data = get_dataset("training.txt", 32)
val_data = get_dataset("validation.txt", 32), validation_data=val_data, ...)

As a result, the training data was only being augmented for the first epoch, but the validation data was being augmented on every epoch. This caused the model to quickly overfit on the training data while the validation loss continually increased. Moving the augment call after cache() solved the problem.

Kevin D.


Reputation: 111