Difference: Replicator Neural Network vs. Autoencoder



I'm currently studying papers on outlier detection using RNNs (Replicator Neural Networks) and wonder what the particular difference to autoencoders is. Many seem to treat RNNs as the holy grail of outlier/anomaly detection; however, the idea seems to be pretty old too, as autoencoders have been around for a long while.


Posted 2016-06-15T18:59:58.047

Reputation: 265

Hi. I was just about to delete it, as I've read here: http://meta.stackexchange.com/a/254090 that Data Science is the right forum for this question. Sorry for the delay.

– Nex – 2016-06-15T19:10:44.763

OK. I only noticed because I'd never heard of Replicator NNs and searched - the Cross Validated question came up. I would agree that Data Science is a better place for this question. – Neil Slater – 2016-06-15T19:17:08.590



Both types of network try to reconstruct the input after feeding it through some kind of compression / decompression mechanism. For outlier detection, the reconstruction error between input and output is measured; outliers are expected to have a higher reconstruction error.
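A minimal sketch of this scoring scheme (the `reconstruct` stand-in and the planted outlier are illustrative assumptions, not from either paper):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(100, 3))  # inliers scattered around the origin
X[0] = [10.0, 10.0, 10.0]                # one planted outlier

def reconstruct(X):
    # Stand-in for a trained autoencoder / replicator network: it
    # "reconstructs" every point as the inlier mean (roughly zero).
    return np.zeros_like(X)

# Per-sample reconstruction error (mean squared error per row).
errors = np.mean((X - reconstruct(X)) ** 2, axis=1)
outlier_idx = int(np.argmax(errors))     # the planted outlier scores highest
```

With a real trained network the reconstruction would be good for inliers and poor for outliers, so the same argmax / thresholding step applies.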

The main difference seems to be the way the input is compressed:

Plain autoencoders squeeze the input through a hidden layer that has fewer neurons than the input/output layers; that way the network has to learn a compressed representation of the data.
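As a toy illustration of the bottleneck idea (not from the papers; the architecture, data, and learning rate are assumptions): a linear autoencoder with a one-neuron hidden layer, trained by plain gradient descent on 2-D data that lies near a line, so a single hidden unit can capture it.

```python
import numpy as np

rng = np.random.default_rng(1)
t = rng.normal(size=(200, 1))
X = np.hstack([t, 2 * t]) + 0.01 * rng.normal(size=(200, 2))  # ~1-D data in 2-D

W_enc = 0.1 * rng.normal(size=(2, 1))  # input -> 1-neuron bottleneck
W_dec = 0.1 * rng.normal(size=(1, 2))  # bottleneck -> output

lr = 0.01
for _ in range(500):
    H = X @ W_enc                  # compressed code, shape (200, 1)
    X_hat = H @ W_dec              # reconstruction, shape (200, 2)
    G = 2 * (X_hat - X) / len(X)   # gradient of the squared error w.r.t. X_hat
    grad_dec = H.T @ G
    grad_enc = X.T @ (G @ W_dec.T)
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

mse = float(np.mean((X @ W_enc @ W_dec - X) ** 2))  # small after training
```

Because the data is essentially one-dimensional, the one-neuron bottleneck suffices; points far from the learned subspace would get a large reconstruction error.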

Replicator neural networks squeeze the data through a hidden layer that uses a staircase-like activation function. This activation makes the network compress the data by assigning it to a certain number of clusters (depending on the number of neurons and the number of steps).

[Figure: staircase-like activation function]

From Replicator Neural Networks for Outlier Modeling in Segmental Speech Recognition:

RNNs were originally introduced in the field of data compression [5]. Hawkins et al. proposed it for outlier modeling [4]. In both papers a 5-layer structure is recommended, with a linear output layer and a special staircase-like activation function in the middle layer (see Fig. 2). The role of this activation function is to quantize the vector of middle hidden layer outputs into grid points and so arrange the data points into a number of clusters.
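The quoted quantizing activation can be written as a sum of steep tanh steps; here is a sketch following that parameterisation (the steepness `a` and the step count are illustrative assumptions):

```python
import numpy as np

def staircase(theta, n_steps=4, a=100.0):
    """Staircase-like activation: maps inputs in [0, 1] onto roughly
    n_steps discrete output levels, quantizing the hidden activations
    into grid points (sum-of-tanh form, as described in the quote)."""
    theta = np.asarray(theta, dtype=float)
    j = np.arange(1, n_steps)
    # each steep tanh contributes one "step" at theta = j / n_steps
    return 0.5 + np.sum(np.tanh(a * (theta[..., None] - j / n_steps)),
                        axis=-1) / (2 * (n_steps - 1))

# Away from the step boundaries, outputs snap to the grid 0, 1/3, 2/3, 1:
print(staircase([0.1, 0.4, 0.6, 0.9]))  # ≈ [0.0, 0.333, 0.667, 1.0]
```

Because the middle-layer outputs can only take (approximately) `n_steps` values per neuron, nearby inputs collapse onto the same grid point, which is the clustering effect the answer describes.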


Posted 2016-06-15T18:59:58.047

Reputation: 1 536