Is it valuable to normalize/rescale labels in neural network regression?


Have there been any papers, or does anyone have specific experience, on whether normalizing the labels in a regression problem is likely to improve the performance of a neural network? I have labels in the range (0, 1000) and I am applying squared loss in a ConvNet. I want to know whether it might be useful to normalize these to a (0, 1) range, or whether that is known not to matter.


Posted 2017-09-01T16:36:58.420

Reputation: 363



Yes, you should do this. With standard initialization schemes and normalized inputs, the expected value of the network's outputs is around 0 at the start of training. This means you will not be too far off from the start, which helps convergence. If your targets are around 1000, your mean squared error will be huge, which means your gradients will also be huge, and that can lead to numerical instability.
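As a minimal sketch of what this looks like in practice (the label values here are made up for illustration): min-max scale the labels to [0, 1] before training, using statistics computed from the training set only, and invert the transform on the model's predictions afterwards.

```python
import numpy as np

# Hypothetical labels in the original (0, 1000) range.
y_train = np.array([12.0, 250.0, 999.0, 430.0])

# Min-max scale to [0, 1]; keep the statistics so predictions
# can be mapped back to the original scale later.
y_min, y_max = y_train.min(), y_train.max()
y_scaled = (y_train - y_min) / (y_max - y_min)

# ... train the network on y_scaled ...

# Stand-in for model output; in practice this comes from the network.
y_pred_scaled = y_scaled.copy()

# Invert the scaling to report predictions in the original units.
y_pred = y_pred_scaled * (y_max - y_min) + y_min
```

The same thing can be done with `sklearn.preprocessing.MinMaxScaler` (`fit_transform` on training labels, `inverse_transform` on predictions), which also guards against accidentally fitting the scaler on test data.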

Jan van der Vegt

Posted 2017-09-01T16:36:58.420

Reputation: 8 538

I was perplexed about why I was getting NaNs on a regression model with labels in the range (100, 300). Scaling my labels by dividing them by 300 did the trick. – rodrigo-silveira – 2018-03-07T01:04:28.963