From Ian Goodfellow's Deep Learning Book:

If an autoencoder succeeds in simply learning to set `g(f(x)) = x` everywhere, then it is not especially useful. Instead, autoencoders are designed to be unable to learn to copy perfectly.

I don't understand this part. `g` is the decoder, and `f` is the encoder. Why is it undesirable for the encoder and decoder to perfectly represent the input data `x`?

Another way to frame this question: why do autoencoders require regularization? I understand that in predictive machine learning, we regularize the model so that it can generalize beyond the training data.

However, with a sufficiently massive training set (as is common in deep learning), there should not be a need for regularization. To me, it seems desirable to learn `g(f(x)) = x` everywhere, and I don't understand why the author says otherwise.
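
To make the condition concrete, here is a minimal sketch (my own illustration, not from the book) of an encoder/decoder pair that satisfies `g(f(x)) = x` everywhere. The functions `f`, `g`, and `reconstruction_error` are hypothetical stand-ins with no learned parameters, and the inputs are made-up values. It only shows what "copying perfectly" means: the reconstruction error is zero for any input at all.

```python
import random


def f(x):
    """Hypothetical encoder: the code is simply a copy of the input."""
    return list(x)


def g(h):
    """Hypothetical decoder: returns the code unchanged."""
    return list(h)


def reconstruction_error(x):
    """Squared error between x and its reconstruction g(f(x))."""
    x_hat = g(f(x))
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


random.seed(0)
structured = [0.0, 1.0, 2.0, 3.0]                     # stand-in for "real" data
noise = [random.gauss(0.0, 1.0) for _ in range(4)]    # arbitrary, unstructured input

errors = [reconstruction_error(structured), reconstruction_error(noise)]
print(errors)
```

Both errors come out as zero, which is the `g(f(x)) = x` everywhere case that the quote calls "not especially useful".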

