Known bounds on the VC-dimension of a neural network range from $O(E)$ to $O(E^2)$, reaching $O(E^2V^2)$ in the worst case, where $E$ is the number of edges and $V$ is the number of nodes. The number of training samples needed for a strong generalization guarantee grows linearly with the VC-dimension.

This means that for a network with billions of edges, as in successful deep learning models, the training set would need billions of samples in the best case and quadrillions in the worst case. The largest training sets currently contain about a hundred billion samples. Since there is nowhere near enough training data, it is unlikely that deep learning models are generalizing; instead, they must be overfitting the training data. An overfit model will not perform well on data dissimilar to its training data, which is an undesirable property for machine learning.
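The arithmetic above can be sketched in a few lines. This is a back-of-the-envelope estimate only: it assumes the required sample count scales linearly with the VC-dimension with a constant factor of 1, and the network size used is purely illustrative.

```python
def vc_bounds(E, V):
    """Best-case O(E) and worst-case O(E^2 * V^2) VC-dimension estimates
    for a network with E edges and V nodes (constant factors assumed 1)."""
    return E, E**2 * V**2

# Hypothetical network: a billion edges, a million nodes.
E, V = 10**9, 10**6
best, worst = vc_bounds(E, V)
print(f"best case:  ~{best:.0e} samples")   # ~1e+09
print(f"worst case: ~{worst:.0e} samples")  # vastly more than any existing dataset
```

Even the best-case estimate is on the order of the largest datasets available, and the worst case exceeds them by many orders of magnitude.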

Given the inability of deep learning to generalize, according to VC-dimension analysis, why are deep learning results so hyped? Merely achieving high accuracy on some dataset does not mean much in itself. Is there something special about deep learning architectures that significantly reduces the VC-dimension?

If you do not think the VC-dimension analysis is relevant, please provide evidence or an explanation that deep learning is generalizing rather than overfitting. That is, does it have good recall AND precision, or just good recall? 100% recall is trivial to achieve, as is 100% precision. Getting both close to 100% is very difficult.
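To make the triviality claim concrete, here is a minimal sketch using toy labels (the data is made up for illustration): a classifier that predicts "positive" for everything gets perfect recall, while its precision collapses to the base rate.

```python
# Toy ground-truth labels: 4 positives, 4 negatives (illustrative only).
labels = [1, 0, 1, 1, 0, 0, 0, 1]

# Degenerate "always positive" classifier: trivially perfect recall.
preds = [1] * len(labels)

tp = sum(p == 1 and y == 1 for p, y in zip(preds, labels))
fp = sum(p == 1 and y == 0 for p, y in zip(preds, labels))
fn = sum(p == 0 and y == 1 for p, y in zip(preds, labels))

recall = tp / (tp + fn)     # 1.0 -- every positive is caught
precision = tp / (tp + fp)  # 0.5 -- half the positive predictions are wrong
print(recall, precision)    # prints: 1.0 0.5
```

The symmetric trick (predict positive only on a single example known to be positive) yields 100% precision with poor recall; only achieving both at once is evidence of genuine generalization.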

As a counterpoint, here is evidence that deep learning is overfitting: an overfit model is easy to fool, since it has incorporated deterministic and stochastic noise. See the following image for an example of overfitting.

Also, see the lower-ranked answers to this question to understand the problems with an overfit model despite good accuracy on test data.

Some have responded that regularization mitigates the problem of a large VC-dimension. See this question for further discussion.

Comments are not for extended discussion; this conversation has been moved to chat. – D.W. – 2017-05-15T03:37:15.1207

I don't think questions asking why something is "hyped" are good. The answer is "because people". People take interest in things for a plethora of reasons, including marketing. – luk32 – 2017-05-16T13:49:26.787