Question on bias-variance tradeoff and means of optimization

8

1

How can one best optimize a model when confronted with high bias or high variance? Of course, you can tune the regularization parameter until the result is satisfactory, but I was wondering whether it is possible to do this without relying on regularization.

If b is an estimate of a model's bias and v an estimate of its variance, wouldn't it make sense to try to minimize b*v?

Zer0k

Posted 2018-04-12T20:19:53.473

Reputation: 155

Answers

10

There are many ways to reduce bias and variance, and despite the popular saying, it isn't always a tradeoff.

The two main reasons for high bias are insufficient model capacity and underfitting because training was stopped before it converged. For example, if you have a very complex problem to solve (e.g. image recognition) and you use a model of low capacity (e.g. linear regression), the model will have high bias because it is not able to grasp the complexity of the problem.
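To make the capacity point concrete, here is a minimal sketch on toy data. The sine-shaped target, the noise level, and the polynomial degrees are assumptions chosen for illustration, not something from the original problem. A straight line cannot follow the curve, so its training error stays high no matter how long it is fit, while a degree-5 polynomial gets close to the noise floor.

```python
import numpy as np
from numpy.polynomial import polynomial as P

rng = np.random.default_rng(0)

# Toy nonlinear problem (assumed for illustration): noisy samples of sin(2*pi*x).
x = np.linspace(0.0, 1.0, 200)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.1, size=x.size)

def train_mse(degree):
    """Fit a polynomial of the given degree and return its training MSE."""
    coeffs = P.polyfit(x, y, degree)
    pred = P.polyval(x, coeffs)
    return float(np.mean((y - pred) ** 2))

mse_linear = train_mse(1)  # low capacity: a straight line
mse_poly5 = train_mse(5)   # higher capacity: degree-5 polynomial

print(f"degree 1 training MSE: {mse_linear:.4f}")
print(f"degree 5 training MSE: {mse_poly5:.4f}")
```

A high error on the training set itself is the signature of bias here: the linear model is wrong in the same way on every sample of the data.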

The main reason for high variance is overfitting on the training set.
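For reference, the textbook decomposition of the expected squared error at a point $x$ is written out below. It assumes data generated as $y = f(x) + \varepsilon$ with $\mathbb{E}[\varepsilon] = 0$ and $\operatorname{Var}(\varepsilon) = \sigma^2$, where $\hat{f}$ is the model fitted on a random training set; these symbols are introduced here and are not part of the original question.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

In terms of the question's b and v, the error depends on the sum b² + v rather than on the product b*v. The product can be driven to zero by a model that always predicts the same constant (zero variance) even though its error is large.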

That being said, there are ways of reducing both the bias and the variance of an ML model. The easiest way of achieving this is getting more data (in some cases even synthetic data helps).
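A minimal sketch of the variance part of that claim, under assumed toy conditions (sine target, Gaussian noise, a degree-5 polynomial model; the helper `prediction_variance` is hypothetical). The same model is refit on many independently drawn training sets, and the spread of its prediction at one fixed point is measured for a small and a large training set.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def prediction_variance(n_examples, degree=5, n_repeats=200, seed=0):
    """Variance of the fitted model's prediction at x0 = 0.5 across
    independently drawn training sets of size n_examples."""
    rng = np.random.default_rng(seed)
    preds = []
    for _ in range(n_repeats):
        # Draw a fresh training set from the assumed toy distribution.
        x = rng.uniform(0.0, 1.0, size=n_examples)
        y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=n_examples)
        coeffs = P.polyfit(x, y, degree)
        preds.append(P.polyval(0.5, coeffs))
    return float(np.var(preds))

var_small = prediction_variance(20)   # few examples
var_large = prediction_variance(500)  # many examples

print(f"prediction variance with  20 examples: {var_small:.5f}")
print(f"prediction variance with 500 examples: {var_large:.5f}")
```

This only illustrates the variance side. Any effect on bias would come indirectly, because more data makes it affordable to use a higher-capacity model.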

What we tend to do in practice is:

  • First, we increase the capacity of the model in order to reduce the bias on the training set as much as possible. In other words, we want to make the model overfit (even reach a loss of 0 on the training set). This is done because we want to make sure the model has enough capacity to capture the structure of the data.

  • Then we try to reduce the variance. This is done through regularization (early stopping, norm penalties, dropout, etc.)
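The two steps above can be sketched on a toy regression problem. Everything specific here is an assumption made for illustration: the sine-shaped data, the degree-12 polynomial standing in for a "high-capacity model", and a ridge (L2 norm) penalty standing in for regularization in general. The helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    """Toy data (assumed for illustration): noisy samples of sin(pi * x)."""
    x = rng.uniform(-1.0, 1.0, size=n)
    y = np.sin(np.pi * x) + rng.normal(scale=0.2, size=n)
    return x, y

def features(x, degree):
    """Polynomial feature matrix with columns x^0 ... x^degree."""
    return np.vander(x, degree + 1, increasing=True)

def mse(X, y, w):
    return float(np.mean((X @ w - y) ** 2))

def ridge_fit(X, y, lam):
    """Closed-form ridge regression: solve (X^T X + lam * I) w = X^T y."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

x_train, y_train = make_data(15)
x_val, y_val = make_data(200)

DEGREE = 12  # 13 parameters for 15 training points: deliberately too flexible
X_train, X_val = features(x_train, DEGREE), features(x_val, DEGREE)

# Step 1: high capacity, no regularization. Training error is driven near zero.
w_unreg, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
train_unreg = mse(X_train, y_train, w_unreg)
val_unreg = mse(X_val, y_val, w_unreg)

# Step 2: add a norm penalty and pick its strength on held-out data.
results = {}
for lam in [1e-4, 1e-3, 1e-2, 1e-1, 1.0]:
    w = ridge_fit(X_train, y_train, lam)
    results[lam] = (mse(X_train, y_train, w), mse(X_val, y_val, w))
best_lam = min(results, key=lambda lam: results[lam][1])
train_ridge, val_ridge = results[best_lam]

print(f"unregularized: train MSE {train_unreg:.4f}, validation MSE {val_unreg:.4f}")
print(f"ridge lam={best_lam}: train MSE {train_ridge:.4f}, validation MSE {val_ridge:.4f}")
```

The regularized model gives up some training accuracy in exchange for a lower validation error, which is the trade the second step is after. Early stopping or dropout would play the same role for a neural network.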

Djib2011

Posted 2018-04-12T20:19:53.473

Reputation: 6 495

1Just to be clear, more data doesn't mean exclusively more examples, but could be more features for the current examples, right? – Zer0k – 2018-04-12T23:18:09.313

4Well, actually I meant more examples, but you are correct: if you could measure more (meaningful) features for the current examples, you would most certainly improve your model's performance. – Djib2011 – 2018-04-12T23:22:27.737