I am using a stock auto-encoder anomaly detector from Deeplearning4j.
I was getting unexpected results from my own variant of the auto-encoder, which looks for anomalies in my own (non-image) data. To investigate, I added 20 extra images to the MNIST test data, 10 all-white and 10 all-black, to see the effect. See the output picture below:
What was unexpected was that the black images (all-zero input values) have zero reconstruction error (the left column in the "best" picture), despite the auto-encoder being trained only on stereotypical digits (no all-black images), while the white images (all-one input values) have the highest reconstruction error (the left column in the "worst" picture). Given the generally accepted definition of an anomaly detector, I would have expected both all-black and all-white images to have high reconstruction error, since neither appeared in the training set and both should therefore be anomalous.
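To make the all-black case concrete, here is a minimal NumPy sketch (not DL4J's actual implementation, and the layer sizes and activations are my own assumptions) showing that in a network whose layers have no bias terms, or whose biases happen to be zero, the zero vector is a fixed point: every layer maps zero to zero, so an all-black input reconstructs perfectly no matter what the weights are.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tiny auto-encoder, 784 -> 32 -> 784, with a ReLU
# encoder, a linear decoder, and (crucially) no bias terms.
W_enc = rng.normal(0.0, 0.1, size=(32, 784))
W_dec = rng.normal(0.0, 0.1, size=(784, 32))

def reconstruct(x):
    h = np.maximum(0.0, W_enc @ x)  # ReLU encoding
    return W_dec @ h                # linear decoding

def mse(x):
    return float(np.mean((reconstruct(x) - x) ** 2))

black = np.zeros(784)  # all-zero "black" image
white = np.ones(784)   # all-one "white" image

# The zero vector stays zero through every bias-free layer, so the
# all-black image has exactly zero reconstruction error for ANY
# weights, trained or not; the all-white image does not.
print(mse(black))  # 0.0
print(mse(white))  # some positive value
```

This is only one possible mechanism; even with biases, the mostly-black MNIST background may push the network toward reproducing zeros accurately, which would have a similar effect.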
This is consistent with the unexpected output from my own variant of the auto-encoder.
I suspect there is a data-science reason why auto-encoders behave this way. Can someone shed light on it, or point me to references that explain it?