What is the "dropout" technique?



What purpose does the "dropout" method serve and how does it improve the overall performance of the neural network?


Posted 2016-08-02T16:08:23.377

Reputation: 9 163



Dropout means that every individual data point is only used to fit a random subset of the neurons. This is done to make the neural network more like an ensemble model.

That is, just as a random forest is averaging together the results of many individual decision trees, you can see a neural network trained using dropout as averaging together the results of many individual neural networks (with 'results' understood to mean activations at every layer, rather than just the output layer).
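The idea that each data point only trains a random subset of neurons can be sketched in a few lines of NumPy. This is an illustrative toy (the layer size, batch size, and drop probability are made-up values, not from any particular paper or library):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical hidden layer with 6 units and a batch of 4 data points.
# Each row of the mask is the random subset of neurons that one
# particular data point is allowed to train.
batch, units, p_drop = 4, 6, 0.5
mask = rng.random((batch, units)) >= p_drop

activations = np.ones((batch, units))
dropped = activations * mask   # each example fits only its own subset
```

Because every row of `mask` is sampled independently, every data point effectively passes through a different thinned network, which is what makes the ensemble analogy work.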

Matthew Graves

Posted 2016-08-02T16:08:23.377

Reputation: 3 957


The original paper [1] that proposed neural network dropout is titled: Dropout: A simple way to prevent neural networks from overfitting. That title pretty much explains in one sentence what dropout does. Dropout works by randomly selecting and removing neurons in a neural network during the training phase. Note that dropout is not applied during testing: the full network, with no neurons dropped, is used when predicting.
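The train/test distinction can be made concrete with a small sketch. Note one hedge: the snippet below uses the now-common "inverted" variant, which rescales the surviving activations at training time so the test-time forward pass needs no change (the original paper instead scales the weights by the keep probability at test time). The function name and values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(42)

def dropout(x, p_drop, training):
    """Inverted dropout: zero each unit with probability p_drop during
    training and scale survivors by 1/(1 - p_drop), so the expected
    activation matches the untouched test-time forward pass."""
    if not training:
        return x                       # no dropout when predicting
    keep = 1.0 - p_drop
    mask = rng.random(x.shape) < keep  # True = neuron survives
    return x * mask / keep

h = np.full((2, 5), 3.0)
h_train = dropout(h, 0.5, training=True)    # entries are 0.0 or 6.0
h_test = dropout(h, 0.5, training=False)    # identical to h
```

The rescaling is why the same network can be used unchanged at prediction time.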

This random removal/dropout of neurons prevents excessive co-adaptation of the neurons and, in so doing, reduces the likelihood of the network overfitting.

The random removal of neurons during training also means that at any point in time, only a portion of the original network is trained. The effect is that you end up training many different sub-networks, for example:

[Figure: dropout as an ensemble of sub-networks]

It is from this repeated training of sub-networks, as opposed to the entire network, that the notion of neural network dropout being a sort of ensemble technique comes. That is, training the sub-networks is similar to training numerous, relatively weak algorithms/models and combining them to form one algorithm that is more powerful than its individual parts.
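The ensemble view above can be checked numerically: averaging the predictions of many randomly sampled sub-networks recovers, in expectation, the prediction of the full network. The tiny network and its weights below are purely illustrative values chosen for the demonstration, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical tiny network with fixed, made-up weights.
x = np.array([[1.0, -2.0, 0.5]])
W1 = np.array([[1.0, 0.0, 1.0, 0.0],
               [0.0, 1.0, 0.0, 1.0],
               [1.0, 1.0, 0.0, 0.0]])
W2 = np.array([[1.0], [1.0], [2.0], [1.0]])

def sample_subnetwork(p_drop=0.5):
    """One forward pass with a freshly sampled dropout mask, i.e. the
    prediction of one randomly chosen sub-network."""
    h = np.maximum(x @ W1, 0.0)            # ReLU hidden layer
    mask = rng.random(h.shape) >= p_drop   # pick a random sub-network
    h = h * mask / (1.0 - p_drop)          # inverted-dropout scaling
    return (h @ W2).item()

# The full (undropped) network's prediction...
full = (np.maximum(x @ W1, 0.0) @ W2).item()
# ...is approximated by the average over many sampled sub-networks.
avg = np.mean([sample_subnetwork() for _ in range(10000)])
```

With enough samples, `avg` converges to `full`, which is the sense in which the dropout-trained network behaves like an average over an ensemble of thinned networks.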


[1]: Srivastava, Nitish, et al. "Dropout: A simple way to prevent neural networks from overfitting." The Journal of Machine Learning Research 15.1 (2014): 1929-1958.

Tshilidzi Mudau

Posted 2016-08-02T16:08:23.377

Reputation: 744

"Dropout works by randomly selecting and removing neurons in a neural network". Really, only the fully-connected part of a neural network. – Monica Heddneck – 2019-01-05T22:36:39.773


I'll try to answer your questions using Geoffrey Hinton's ideas from the dropout paper and his Coursera class.

What purpose does the "dropout" method serve?

Deep neural nets with a large number of parameters are very powerful machine learning systems. However, overfitting is a serious problem in such networks. Large networks are also slow to use, making it difficult to deal with overfitting by combining the predictions of many different large neural nets at test time. Dropout is a technique for addressing this problem.

So it's a regularization technique that addresses the problem of overfitting (high variance).

How does it improve the overall performance?
By generalizing better and not falling into the trap of overfitting.

Iman Mirzadeh

Posted 2016-08-02T16:08:23.377

Reputation: 121


There are some great answers here. The simplest explanation I can give for dropout is that it randomly excludes some neurons and their connections from the network during training, to stop neurons from "co-adapting" too much. It has the effect of forcing each neuron to learn more generally useful features, and it is excellent at preventing overfitting in large neural networks.


Posted 2016-08-02T16:08:23.377

Reputation: 131