I'm training a neural network on an 'easy' dataset with ~15k examples. The network overfits pretty fast.
I've checked whether a lot of predictions have a probability around 0.5, but they don't:
Also, here is a plot of the percentage of correct predictions as a function of predicted probability. There is some pattern here, but the number of elements is too small to draw conclusions.
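A minimal sketch of how both checks (the distribution of predicted probabilities and the accuracy per probability bin) could be computed. It assumes binary classification, a NumPy array of predicted positive-class probabilities, and a 0.5 decision threshold. The function name `probability_diagnostics` and the choice of 10 bins are hypothetical, not taken from the original setup.

```python
import numpy as np

def probability_diagnostics(probs, labels, n_bins=10):
    """Per-bin prediction counts and accuracy (assumes binary labels in {0, 1})."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=int)

    # Equal-width bins over [0, 1].
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Bin index per prediction; clip so that probs == 1.0 lands in the last bin.
    bin_idx = np.clip(np.digitize(probs, edges) - 1, 0, n_bins - 1)

    # A prediction is correct if thresholding at 0.5 matches the label (assumed threshold).
    correct = (probs >= 0.5).astype(int) == labels

    counts = np.bincount(bin_idx, minlength=n_bins)
    hits = np.bincount(bin_idx, weights=correct, minlength=n_bins)

    # Accuracy is NaN for bins that contain no predictions.
    with np.errstate(invalid="ignore", divide="ignore"):
        accuracy = np.where(counts > 0, hits / counts, np.nan)
    return edges, counts, accuracy
```

Here `counts` corresponds to the first check (how many predictions fall near 0.5) and `accuracy` to the second. Reading the two together shows which bins hold too few samples for their accuracy to be meaningful.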
So, my question is: why does this happen, and what can I do about it?