What is the purpose of the noise injection in the generator network of a GAN?


I do not understand why, with enough training, the generator could not simply memorize all images from the training set as a mapping from the latent space. That would seem to be the optimal case in training, since it replicates the distribution exactly and the discriminator output would always be 0.5. Most blog posts I have seen do not mention noise at all; a few include it in their diagrams or note its presence, but none explain exactly what it is for.

Is this noise injected to avoid exact reproduction of the training data? If not, what is the purpose of this injection, and how is exact reproduction avoided?


Posted 2019-07-24T11:52:05.077

Reputation: 1 194

where is noise "injected"? or do you mean how noise is used as an input – mshlis – 2019-07-24T12:15:56.040

@mshlis In the generator networks I have come across, the input vector was always the latent vector z; whenever noise was injected, it was usually added before the convolution operations – ashenoy – 2019-07-24T12:18:54.127

generally z is a noise vector (sometimes augmented by some feature vector) – mshlis – 2019-07-24T12:19:55.930



When constructing a GAN, your goal is to model a distribution, so you need a way to sample from that distribution. That is the noise's purpose: it gives you something to sample. Generally, it is drawn from a distribution that is computationally easy to sample from (such as a Gaussian).

You are modeling the generator $G(X)$ where $X \sim N(\mu, \sigma^2)$. This means $G(X)$ is a random variable itself. The forward pass of the network transforms the $X$ samples into our $G(X)$ samples, allowing us to formulate a loss function (by approximating the expectation with the mean of the drawn samples) and train the model.
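As a minimal sketch of this sampling view (the toy two-layer generator with fixed random weights is hypothetical; in a real GAN its weights would be learned adversarially):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy generator: a 2-layer MLP with fixed random weights,
# standing in for a trained G. Dimensions are illustrative only.
W1 = rng.normal(size=(16, 100)) * 0.1
W2 = rng.normal(size=(784, 16)) * 0.1

def generator(z):
    """Transform latent samples z ~ N(0, I) into 'image' samples G(z)."""
    h = np.tanh(W1 @ z)       # nonlinearity makes G a flexible map
    return np.tanh(W2 @ h)    # outputs in (-1, 1), like normalized pixels

# Sampling G(X): draw X from an easy distribution, push it through G.
z = rng.normal(size=(100, 64))   # 64 latent draws, each 100-dimensional
fake_images = generator(z)       # 64 generated samples, each 784-dimensional
print(fake_images.shape)
```

A loss such as the discriminator score would then be averaged over `fake_images`, which is exactly the "mean of drawn samples" approximation of the expectation.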

Takeaway: the injected noise is just a parametrization of our generator in another space, and the training goal is to learn the ideal transformation (we use neural networks because they are differentiable and are effective function approximators).

Also, regarding your point about why the generator does not learn the training data exactly: generally $G(X)$ is continuous, so if it has two images in its codomain, there also exists a path in pixel space from one to the other containing an uncountable (or, if quantized, a countably large) number of images that do not exist in the training set. These images would be reflected in the loss, so in the min-max game of the optimization it is difficult for the generator to land exactly on the training set.
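The continuity argument can be made concrete by interpolating in latent space (the single-layer generator below is a hypothetical stand-in for any continuous $G$):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical continuous generator: any differentiable map will do.
W = rng.normal(size=(784, 100)) * 0.1

def G(z):
    return np.tanh(W @ z)  # continuous in z

# Two latent points whose images G(z0), G(z1) play the role of two
# training images in the generator's codomain.
z0, z1 = rng.normal(size=100), rng.normal(size=100)

# Because G is continuous, interpolating in latent space traces a path
# in pixel space: every intermediate point is also an output of G,
# even though none of them need match any training image.
path = [G((1 - t) * z0 + t * z1) for t in np.linspace(0.0, 1.0, 11)]

print(len(path), path[0].shape)
```

Every element of `path` is a valid generator output, so a generator that covers the training images necessarily also produces these in-between images, and the discriminator can penalize them.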


Posted 2019-07-24T11:52:05.077

Reputation: 1 845