Why are autoencoders for dimension reduction symmetrical?



I'm not an expert in autoencoders or neural networks by any means, so forgive me if this is a silly question.

For the purpose of dimension reduction or visualizing clusters in high-dimensional data, we can use an autoencoder to create a (lossy) 2-dimensional representation by inspecting the output of the network layer with 2 nodes. For example, with the following architecture, we would inspect the output of the third layer

$[X] \rightarrow N_1=100 \rightarrow N_2=25 \rightarrow (N_3=2) \rightarrow N_4=25 \rightarrow N_5=100 \rightarrow [X]$

where $X$ is the input data and $N_l$ is the number of nodes in the $l$th layer.

Now, my question is: why do we want a symmetrical architecture? Doesn't mirroring the deep 'compression' phase mean we may have a similarly complex 'decompression' phase, so that the 2-node output is not forced to be very intuitive? In other words, wouldn't a simpler decoding phase force the output of the layer with 2 nodes to be simpler too?

My thinking here is that the less complex the decompression phase, the simpler (more linear?) the 2D representation has to be. A more complex decompression phase would allow a more complex 2D representation.
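For concreteness, the architecture in the question can be sketched as a plain NumPy forward pass with random, untrained weights. The input dimensionality of 784 and the tanh activations are my own assumptions for illustration, not part of the question:

```python
import numpy as np

def init_layers(sizes, rng):
    # one (weights, bias) pair per fully connected layer
    return [(rng.standard_normal((m, n)) * 0.1, np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(x, layers):
    # keep every layer's output so we can inspect the bottleneck
    activations = [x]
    for W, b in layers:
        x = np.tanh(x @ W + b)   # tanh everywhere, just for simplicity
        activations.append(x)
    return activations

rng = np.random.default_rng(0)
input_dim = 784                  # hypothetical input dimensionality
# input -> 100 -> 25 -> 2 (bottleneck) -> 25 -> 100 -> output
sizes = [input_dim, 100, 25, 2, 25, 100, input_dim]
layers = init_layers(sizes, rng)

X = rng.standard_normal((5, input_dim))   # 5 example rows
acts = forward(X, layers)
codes = acts[3]                  # output of the 2-node third layer
print(codes.shape)               # each row is a 2-D point you could plot
```

After training the whole network with a reconstruction loss, `acts[3]` is the 2-D representation you would plot.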


Posted 2017-10-13T05:25:23.793




There is no specific constraint on the symmetry of an autoencoder.

In the early days, people tended to enforce such symmetry to the maximum: not only were the layers symmetrical, but the weights of corresponding layers in the encoder and decoder were also shared. This is not a requirement, but it enables certain loss functions (e.g. RBM score matching) and can act as regularization, since you effectively halve the number of parameters to optimize. Nowadays, however, I think no one imposes encoder-decoder weight sharing.
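As a minimal sketch of what weight sharing means for a single-layer autoencoder (the sizes and the tanh activation here are hypothetical, chosen just for illustration): the decoder reuses the transpose of the encoder's weight matrix, so only the biases are separate parameters:

```python
import numpy as np

rng = np.random.default_rng(0)
d, h = 20, 5                            # input and code dimensions (made up)
W = rng.standard_normal((d, h)) * 0.1   # single weight matrix, shared
b_enc = np.zeros(h)
b_dec = np.zeros(d)

x = rng.standard_normal((3, d))
code = np.tanh(x @ W + b_enc)           # encoder uses W
recon = code @ W.T + b_dec              # decoder reuses W transposed

print(code.shape, recon.shape)
```

With tying, the weight count drops from `2 * d * h` to `d * h`; only the two bias vectors remain separate.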

Regarding architectural symmetry, it is common to find the same number of layers, the same types of layers, and the same layer sizes in the encoder and decoder, but there is no need for that.
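A minimal NumPy sketch of a non-mirrored architecture (the layer sizes are made up for illustration): a three-layer encoder paired with a shallower two-layer decoder still maps input to code to reconstruction, so a reconstruction loss can be computed as usual:

```python
import numpy as np

def make_mlp(sizes, rng):
    # one (weights, bias) pair per fully connected layer
    return [(rng.standard_normal((m, n)) * 0.1, np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(x, layers):
    for W, b in layers:
        x = np.tanh(x @ W + b)
    return x

rng = np.random.default_rng(1)
d = 784                                   # hypothetical input dimensionality
encoder = make_mlp([d, 100, 25, 2], rng)  # deep encoder, as in the question
decoder = make_mlp([2, 50, d], rng)       # shallower decoder: not a mirror

X = rng.standard_normal((8, d))
codes = forward(X, encoder)               # 2-D codes for plotting
recon = forward(codes, decoder)           # reconstruction for the loss
print(codes.shape, recon.shape)
```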

For instance, in convolutional autoencoders, it used to be very common to find convolutional layers in the encoder and deconvolutional layers in the decoder, but now you typically see upsampling layers in the decoder instead, because they have fewer artifact problems.




Your question is certainly valid; however, I have found that any question of the form "should I do X or Y in deep learning?" has only one answer.

Try them both

Deep learning is a very empirical field, and if a non-symmetric autoencoder works for your domain, then use it (and publish a paper).

Ankit Suri



I ran some extensive experiments to address the question asked. My experiments indicated that the encoding path (the left leg of the network) should have fewer but wider layers. I usually take half as many layers, but double the number of nodes, for the encoding path. I have no explanation for this; these configurations just often led to faster convergence.
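To make that configuration concrete, here is a small parameter-count comparison between the symmetric architecture from the question and a "half as many layers, twice as wide" encoder paired with the original decoder. The exact widths are my guess at what this answer describes, not something it specifies:

```python
def n_params(sizes):
    # weights (m * n) plus biases (n) for each fully connected layer
    return sum(m * n + n for m, n in zip(sizes[:-1], sizes[1:]))

d = 784  # hypothetical input dimensionality
symmetric = [d, 100, 25, 2, 25, 100, d]

# encoder with half as many layers but doubled width, decoder unchanged
wide_encoder = [d, 200, 2]
decoder = [2, 25, 100, d]
asymmetric = wide_encoder + decoder[1:]

print(n_params(symmetric), n_params(asymmetric))
```

Note that the wider encoder is not necessarily cheaper in parameters; the answer's claim is about convergence speed, not model size.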

