From Ian Goodfellow's Deep Learning Book:
"If an autoencoder succeeds in simply learning to set g(f(x)) = x everywhere, then it is not especially useful. Instead, autoencoders are designed to be unable to learn to copy perfectly."
I don't understand this part. Here g is the decoder and f is the encoder. Why is it undesirable for the encoder and decoder together to perfectly reconstruct the input data?
Another way to frame this question: why do autoencoders require regularization? I understand that in predictive machine learning, we regularize the model so that it can generalize beyond the training data.
However, with a sufficiently massive training set (as is common in Deep Learning), there should not be a need for regularization. To me, it seems desirable to learn
g(f(x)) = x everywhere, and I don't understand why the author says otherwise.
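To make my confusion concrete, here is a toy numpy sketch (my own example, not from the book) of the two situations I have in mind: a linear autoencoder whose latent space is as large as the input, which can copy perfectly, versus one with a bottleneck, which cannot.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))  # toy data: 100 points, 4 features

# Case 1: latent dim == input dim. Any invertible encoder matrix
# admits an exact decoder, so g(f(x)) = x everywhere.
W_e = rng.normal(size=(4, 4))       # encoder: f(x) = x @ W_e
W_d = np.linalg.inv(W_e)            # decoder: g(h) = h @ W_d
recon = (X @ W_e) @ W_d
print(np.allclose(recon, X))        # perfect copy, nothing learned about the data

# Case 2: bottleneck (latent dim 2 < 4). Even the best linear
# decoder cannot reconstruct full-rank data exactly.
W = rng.normal(size=(4, 2))
H = X @ W                                       # encode into 2 dims
W_out, *_ = np.linalg.lstsq(H, X, rcond=None)   # optimal linear decoder
err = np.mean((H @ W_out - X) ** 2)
print(err)                                      # strictly positive reconstruction error
```

As I understand it, Case 1 is exactly the "not especially useful" copy the book warns about, yet it achieves the lowest possible reconstruction loss, which is why I don't see the motivation for ruling it out.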