Which type of autoencoder gives the best results for text?


I did a couple of examples of autoencoders for images and they worked fine. Now I want to build an autoencoder for text that takes a sentence as input and returns the same sentence. But when I try to use the same autoencoders as the ones I used for the images, I get bad results.

I guess the reason for this is that my text is sparse and I have a large vocabulary of 500K words.

  1. Do you have a link to a working example of an autoencoder for text in Keras?

  2. I saw that most papers use cross-entropy as the loss function. How exactly does cross-entropy calculate the loss? Does it make sense to use cross-entropy even for a character-by-character autoencoder?


Posted 2018-03-25T20:43:33.720

Reputation: 99



A working example of a Variational Autoencoder for Text Generation in Keras can be found here.

Cross-entropy loss, aka log loss, measures the performance of a classification model whose output is a probability value between 0 and 1. Cross-entropy loss increases as the predicted probability diverges from the actual label. In a character-by-character autoencoder, each character in the vocabulary is a class label.
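To make the calculation concrete, here is a minimal NumPy sketch of categorical cross-entropy for a single predicted character. The 4-character vocabulary and the predicted probabilities are illustrative assumptions, not from the question:

```python
import numpy as np

# Hypothetical 4-character vocabulary: ['a', 'b', 'c', 'd'].
# The true character is 'b', one-hot encoded.
y_true = np.array([0.0, 1.0, 0.0, 0.0])

# The model's predicted probability distribution over the vocabulary,
# e.g. the softmax output of the decoder at this time step.
y_pred = np.array([0.1, 0.7, 0.1, 0.1])

# Cross-entropy: -sum(y_true * log(y_pred)).
# Only the term for the true class survives, so this reduces to
# -log(probability assigned to the correct character).
loss = -np.sum(y_true * np.log(y_pred))
print(round(loss, 4))  # -log(0.7) ≈ 0.3567
```

The loss is small when the model puts high probability on the correct character and grows without bound as that probability approaches zero.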

Cross-entropy works when the input and output are the same size, which is the case in a character-by-character autoencoder. Oftentimes in text analysis the input and output sequences have different lengths, so a second term, an encoder loss, is added to the objective function.
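One common way to get equal-sized input and output at the character level is to one-hot encode each sentence into a (sequence_length, vocab_size) matrix and use the same matrix as the reconstruction target. A minimal sketch, where the vocabulary, padding scheme, and fixed length are all illustrative assumptions:

```python
import numpy as np

# Hypothetical character vocabulary; a real one would be built from the corpus.
chars = sorted(set("abcdefghijklmnopqrstuvwxyz "))
char_to_idx = {c: i for i, c in enumerate(chars)}
max_len = 16  # fixed sequence length; shorter sentences are padded with spaces

def one_hot_encode(sentence):
    """Encode a sentence as a (max_len, vocab_size) one-hot matrix."""
    padded = sentence.ljust(max_len)[:max_len]
    x = np.zeros((max_len, len(chars)))
    for t, c in enumerate(padded):
        x[t, char_to_idx[c]] = 1.0
    return x

x = one_hot_encode("hello world")
# For an autoencoder the target equals the input, so the shapes match
# and categorical cross-entropy can be applied at every time step.
y = x.copy()
print(x.shape)  # (16, 27): 16 time steps, 27 characters in the vocabulary
```

Because input and target have identical shapes, a per-time-step softmax output plus categorical cross-entropy fits this setup directly.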

Brian Spiering

Posted 2018-03-25T20:43:33.720

Reputation: 10 864

Hi, so I tried the code you suggested. I can generate text, but I don't understand how the autoencoder works. What I want to do is give some text as input and get the same text as output. But in this case, when I do pred = vae.predict(test, batch_size=500), pred contains only 1s, which doesn't make sense. Am I doing something wrong? – sspp – 2018-03-29T19:07:52.877

The Keras blog has a post on understanding autoencoders - https://blog.keras.io/building-autoencoders-in-keras.html

– Brian Spiering – 2018-03-29T20:32:52.870

Yes, I already saw it, but the problem is that I have text as input. I tried some of the methods and got very bad results for text. That is why I asked whether I'm using the right loss function: in the blog post they use binary cross-entropy because the images are black and white. – sspp – 2018-03-29T20:40:36.167