Understanding autoencoder loss function


I've never understood how to calculate an autoencoder loss function, because the prediction has many dimensions and I always thought a loss function had to output a single number / scalar estimate for a given record. However, on GitHub I recently came across a repository that has implemented an example autoencoder in TensorFlow, and the squared error implementation is a lot simpler than I thought:

cost = tf.reduce_mean(tf.pow(y_true - y_pred, 2))
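To see what that one line computes without needing a TensorFlow install, here is an equivalent NumPy sketch with hypothetical data (the values are made up for illustration). With no axis argument, the mean runs over every entry of the error tensor, so the result is a single scalar:

```python
import numpy as np

# Hypothetical batch: 4 records, 3 dimensions each (made-up values).
y_true = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0],
                   [7.0, 8.0, 9.0],
                   [1.0, 0.0, 1.0]])
y_pred = y_true + 0.5  # every prediction is off by exactly 0.5

# NumPy equivalent of tf.reduce_mean(tf.pow(y_true - y_pred, 2)):
# square the element-wise errors, then average over ALL entries.
cost = np.mean((y_true - y_pred) ** 2)
print(cost)  # 0.25, since every squared error is 0.5 ** 2
```

Because every element's squared error is 0.25 here, the overall mean is also 0.25, and it is a plain float, not a tensor of per-dimension errors.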

The TensorFlow documentation on reduce_mean says, among other things:

If axis has no entries, all dimensions are reduced, and a tensor with a single element is returned.

Can I conclude from all of this that the squared error of an autoencoder prediction is just the average across all of the record's dimensions?

Ryan Zotti

Posted 2016-12-06T22:43:19.713

Reputation: 3 849



Yes, you are correct: for a single example, the squared error of an autoencoder prediction is the average of the squared errors across all of that example's dimensions. Similarly, the squared error for a whole batch of examples is the average of the per-example errors.
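A quick sketch of that equivalence, using NumPy with random placeholder data (any batch shape works, since every example contributes the same number of entries): averaging per-example means gives the same number as one mean over the whole error tensor, which is exactly what the reduce_mean call does.

```python
import numpy as np

rng = np.random.default_rng(0)
y_true = rng.normal(size=(8, 5))  # batch of 8 examples, 5 dimensions each
y_pred = rng.normal(size=(8, 5))

# Per-example squared error: average over that example's dimensions.
per_example = np.mean((y_true - y_pred) ** 2, axis=1)

# Batch cost: average of the per-example errors...
batch_cost = np.mean(per_example)

# ...which equals a single mean over every entry, i.e. what
# tf.reduce_mean(tf.pow(y_true - y_pred, 2)) returns for the batch.
assert np.isclose(batch_cost, np.mean((y_true - y_pred) ** 2))
```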

