Validation loss is lower than the training loss



I am using an autoencoder for anomaly detection in warranty data.

Architecture 1: [image: architecture diagram]

The plot shows the training vs validation loss based on Architecture 1.

As the plot shows, the validation loss is lower than the training loss, which seems odd.

Based on the post "Validation loss lower than training loss", I understood that this happens because of the dropout layer in my model, so I re-ran the model without the dropout layer.
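The mechanism can be sketched with a toy NumPy experiment (illustration only; the identity "autoencoder" below is hypothetical, not the actual model). Dropout injects noise during training but is switched off at evaluation time, so the loss measured in train mode is inflated relative to the loss measured in eval mode on the very same model:

```python
import numpy as np

rng = np.random.default_rng(0)

def reconstruct(x, p_drop=0.5, training=True, rng=rng):
    """Toy 'autoencoder' pass: identity encoder/decoder with
    inverted dropout on the hidden code (illustration only)."""
    h = x.copy()                       # encoder: identity, for simplicity
    if training:
        mask = rng.random(h.shape) >= p_drop
        h = h * mask / (1.0 - p_drop)  # inverted dropout: scale kept units
    return h                           # decoder: identity

x = rng.normal(size=(10000, 32))

train_loss = np.mean((reconstruct(x, training=True) - x) ** 2)
eval_loss = np.mean((reconstruct(x, training=False) - x) ** 2)
print(train_loss, eval_loss)  # train-mode loss is higher: dropout noise
```

Since validation loss is computed with dropout disabled, it can legitimately come out lower than the dropout-noised training loss.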

Architecture 2:

[image: architecture diagram]

Based on the above architecture, I plotted the training vs validation loss. Now the validation loss is slightly higher than the training loss.

[image: plot of training vs validation loss]

Clearly, the earlier behavior was caused by the dropout layer.

Now my question is: is the model based on Architecture 1 correct? If not, what changes could I make to fix it?

Thank you! Any help is much appreciated.


Posted 2018-10-14T13:16:02.000




It is certainly correct in the sense that it is a legitimate neural network. The dropout layer injects noise during training that is not injected at evaluation time; its goal is to combat overfitting, so that the error on your test set is lower thanks to better generalization.

Applying a dropout layer directly on top of the input layer, however, throws away a lot of information, making it difficult to learn a good encoding. Dropout layers are usually applied after the first transformations, which forces the network to rely on multiple different transformations that encode similar information, reducing co-adaptation. If you put dropout over the input layer, you are throwing away information that is not accessible in any other way.
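The difference between dropping redundant hidden units and dropping raw inputs can be sketched with a toy NumPy experiment (illustration only; the copy-and-average "network" is a hypothetical stand-in for learned redundant features, not the poster's model):

```python
import numpy as np

rng = np.random.default_rng(1)

d, k, p = 16, 8, 0.5                 # features, redundant copies, drop rate
x = rng.normal(size=(5000, d))

def drop(a, p, rng):
    """Inverted dropout: zero units with prob p, rescale the rest."""
    return a * (rng.random(a.shape) >= p) / (1.0 - p)

# Dropout AFTER a redundant transformation: each feature is copied into
# k hidden units, each dropped independently; decoding averages them, so
# surviving copies still carry the signal.
h = np.repeat(x[:, :, None], k, axis=2)       # (n, d, k) redundant code
recon_hidden = drop(h, p, rng).mean(axis=2)

# Dropout ON the input: a dropped feature is gone before any copy exists,
# so redundancy downstream cannot recover it.
recon_input = np.repeat(drop(x, p, rng)[:, :, None], k, axis=2).mean(axis=2)

mse_hidden = np.mean((recon_hidden - x) ** 2)
mse_input = np.mean((recon_input - x) ** 2)
print(mse_hidden, mse_input)  # hidden-layer dropout hurts far less
```

The averaging over independent masks shrinks the dropout noise roughly by the redundancy factor k, whereas input dropout destroys information no later layer can reconstruct.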

Jan van der Vegt
