I just recently got my hands on TensorBoard. Can you tell me what features I should look for in the graphs (Accuracy and Validation Accuracy)? And please enlighten me about the concept of underfitting as well.

Overfitting is a scenario where your model performs well on training data but performs poorly on data not seen during training. This basically means that your model has memorized the training data instead of learning the relationships between features and labels.

If you are familiar with the bias/variance tradeoff, then you can think of overfitting as a situation where your model has high variance, memorizing the random noise in the training set.

Overfitting is easy to diagnose with the accuracy visualizations you have available. If "Accuracy" (measured against the training set) is very good and "Validation Accuracy" (measured against a validation set) is not as good, then your model is overfitting.

Underfitting is the counterpart of overfitting, wherein your model exhibits high bias. This situation can occur when your model is not sufficiently complex to capture the relationship between features and labels (or when your model is too strictly regularized).

Underfitting is a bit harder to diagnose. If Accuracy and Validation Accuracy are similar but are both poor, then you may be underfitting.
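To make the diagnostic rule concrete, here is a minimal sketch of it as a helper function. The thresholds (`gap_tol`, `low_acc`) are arbitrary values chosen for illustration; sensible cutoffs depend entirely on your task and baseline.

```python
def diagnose(train_acc, val_acc, gap_tol=0.10, low_acc=0.70):
    """Rough heuristic for reading an Accuracy / Validation Accuracy pair.

    gap_tol and low_acc are illustrative cutoffs, not universal constants.
    """
    if train_acc - val_acc > gap_tol:
        return "overfitting"    # great on training data, much worse on unseen data
    if train_acc < low_acc and val_acc < low_acc:
        return "underfitting"   # similar performance, but poor on both
    return "looks OK"

print(diagnose(0.99, 0.75))  # large train/val gap -> "overfitting"
print(diagnose(0.62, 0.60))  # both low, small gap -> "underfitting"
```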

**Edit 1:**

**Strategies to avoid overfitting/underfitting**

Recall that overfitting is caused by the model memorizing the training data instead of learning the more-general mapping from features to labels. This commonly occurs when training a model with so many parameters that it can fit nearly any dataset. As von Neumann so eloquently put it, "With four parameters I can fit an elephant, and with five I can make him wiggle his trunk."

You can combat overfitting by *reducing the complexity* of your model (i.e. reducing the number of trainable parameters). The specifics of how this is accomplished vary depending on the learning algorithm and the domain.

For neural networks, you can use fewer layers (shallower networks), fewer neurons per layer, sparser connections between the layers (as in convolutional nets), or regularization techniques like dropout.
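To give a feel for how dropout works under the hood, here is a minimal NumPy sketch of (inverted) dropout applied to one layer's activations during training. Real frameworks implement this for you; this is just the idea:

```python
import numpy as np

def dropout(activations, rate, rng):
    """Inverted dropout: zero each unit with probability `rate`, then
    scale the survivors by 1/(1-rate) so the expected activation is
    unchanged (and no scaling is needed at test time)."""
    mask = rng.random(activations.shape) >= rate
    return activations * mask / (1.0 - rate)

rng = np.random.default_rng(0)
a = np.ones((2, 4))               # toy activations from a hidden layer
print(dropout(a, rate=0.5, rng=rng))  # entries are either 0.0 or 2.0
```

By randomly silencing units, dropout prevents any single neuron from memorizing a specific training example, which is one way of reducing the effective complexity of the network.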

In the same vein, you can combat underfitting by *increasing the complexity* of your model. This has been the driving force behind the push for ever-deeper neural networks in recent years. With more layers, the network can learn more sophisticated relationships and perhaps perform well on difficult real-world tasks.
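The same effect is easy to see outside of neural nets. In this toy NumPy sketch (my own example, not from the question), a degree-1 polynomial underfits noiseless quadratic data, and increasing the model's capacity to degree 2 removes the systematic error:

```python
import numpy as np

x = np.linspace(-3, 3, 50)
y = x**2                      # truly quadratic relationship, no noise

for degree in (1, 2):
    coeffs = np.polyfit(x, y, degree)
    mse = np.mean((np.polyval(coeffs, x) - y) ** 2)
    print(f"degree {degree}: training MSE = {mse:.4f}")
```

The line cannot bend, so no amount of extra training data fixes it; only adding capacity (the quadratic term) does.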

Of course, you can't recklessly add layers to a network and expect great performance. Training deep neural networks is *hard*, for a number of statistical and technical reasons (one of which is avoiding overfitting).

So to answer your question directly: If your network is overfitting, adding more layers will almost certainly make the problem worse, since you're *increasing* model complexity. If your network is underfitting, adding more layers *can* help, but it's rarely so straightforward. You need to think carefully about how you expect the network to operate and what strategies you can employ to ensure that it doesn't begin to overfit.

**Edit 2:**

P.S. If you're new to the field of machine learning, then it may be helpful to experiment with more intuitive models than neural nets while learning about over/underfitting and the bias/variance tradeoff.

I would recommend decision trees (or model trees for regression problems). Tree-based models are easy to interpret, and playing around with parameters like the max depth and minimum impurity decrease of a decision tree might help you gain some intuition about the relationship of model complexity to overfitting.
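A quick sketch of that experiment with scikit-learn (the synthetic dataset, label noise, and depth values are arbitrary choices for illustration):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Noisy synthetic dataset, so a fully grown tree can memorize the noise
X, y = make_classification(n_samples=500, n_features=20, n_informative=5,
                           flip_y=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for depth in (2, 5, None):    # None = grow until the leaves are pure
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
    tree.fit(X_train, y_train)
    print(f"max_depth={depth}: "
          f"train={tree.score(X_train, y_train):.2f} "
          f"test={tree.score(X_test, y_test):.2f}")
```

The unrestricted tree reaches perfect training accuracy but does worse on the test set (overfitting), while a very shallow tree scores similarly, and modestly, on both (underfitting). Varying `max_depth` or `min_impurity_decrease` and watching those two numbers is exactly the train/validation comparison described above.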

Can you tell me possible fixes for overfitting? And also tell, is adding more layers gonna solve overfitting or underfitting, or will it make it much worse? – Nikhil.Nixel – 2019-06-05T14:59:51.647

Sure! I'll edit my post – zachdj – 2019-06-05T15:56:50.590

That's just nice data you gave me! – Nikhil.Nixel – 2019-06-07T13:44:06.183

You should clarify 'Underfitting is the opposite counterpart of overfitting wherein your model exhibits high variance and low bias.' as one might think underfitting is when the model has high variance, which is false – Suraj Motaparthy – 2020-02-24T01:24:10.053

@SurajMotaparthy Thank you for the correction. Good catch! I've fixed the answer – zachdj – 2020-02-24T14:27:10.927

The accuracy between training and test data is a good clue. Confusion matrices for binary classification. – M__ – 2019-06-05T14:55:27.847