I am reading this paper "Sequence to Sequence Learning with Neural Networks" http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf

Under "2. The Model" it says:

The LSTM computes this conditional probability by first obtaining the fixed-dimensional representation v of the input sequence (x_1, ..., x_T) given by the last hidden state of the LSTM, and then computing the probability of y_1, ..., y_T′ with a standard LSTM-LM formulation whose initial hidden state is set to the representation v of x_1, ..., x_T:

p(y_1, ..., y_T′ | x_1, ..., x_T) = ∏_{t=1}^{T′} p(y_t | v, y_1, ..., y_{t−1})

I know what an LSTM is, but what's an LSTM-LM? I've tried Googling it but can't find any good leads.
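To make my reading of the passage concrete, here is a toy numpy sketch of what I understand it to describe (all names, dimensions, and the random-weight setup are my own assumptions, not the paper's code): an encoder LSTM produces its last hidden state v, and a second LSTM is run as a language model whose initial hidden state is v, factorizing p(y_1, ..., y_T′ | x) one token at a time.

```python
# Toy sketch of the seq2seq idea (assumed names/shapes, not the paper's code):
# the encoder LSTM's last hidden state v initializes the decoder "LSTM-LM",
# which then scores the target sequence autoregressively.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class LSTMCell:
    """A plain LSTM cell with small random weights (illustration only)."""
    def __init__(self, in_dim, hid_dim):
        # One stacked matrix for the input/forget/output/candidate gates.
        self.W = rng.normal(scale=0.1, size=(4 * hid_dim, in_dim + hid_dim))
        self.b = np.zeros(4 * hid_dim)

    def step(self, x, h, c):
        z = self.W @ np.concatenate([x, h]) + self.b
        i, f, o, g = np.split(z, 4)
        i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
        c_new = f * c + i * g
        h_new = o * np.tanh(c_new)
        return h_new, c_new

hid, vocab, emb = 8, 5, 4
encoder, decoder = LSTMCell(emb, hid), LSTMCell(emb, hid)
W_out = rng.normal(scale=0.1, size=(vocab, hid))  # projection to vocab logits
E = rng.normal(scale=0.1, size=(vocab, emb))      # toy token embeddings

# --- Encode: run the encoder over the source tokens; keep only (h, c). ---
src = [0, 3, 1]
h = c = np.zeros(hid)
for tok in src:
    h, c = encoder.step(E[tok], h, c)
v = h  # fixed-dimensional representation of the whole input sequence

# --- Decode: the LSTM-LM's initial hidden state is v (here I also carry
# over the encoder's cell state c, which is one of my open questions). ---
tgt = [2, 4]
h_dec, c_dec = v, c
log_p = 0.0
prev = 0  # assumed start-of-sequence token id
for tok in tgt:
    h_dec, c_dec = decoder.step(E[prev], h_dec, c_dec)
    logits = W_out @ h_dec
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()            # softmax over the vocabulary
    log_p += np.log(probs[tok])     # accumulate log p(y_t | v, y_<t)
    prev = tok

print(float(log_p))  # log p(y_1, ..., y_T' | x_1, ..., x_T) under random weights
```

Is this decoder loop, with its softmax at every step, what "LSTM-LM formulation" refers to?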

But this sentence is still puzzling to me. If I put it into an equation, it makes […] with c the last hidden state of the encoder. Then the first hidden state represents the information provided by the encoder, but the next ones represent the probability distribution of the target sequence's elements: something of a radically different nature. Also, the cell state initialisation is not given, and Figure 1 leads one to believe that the LSTM provid…

– Charles Englebert – 2018-09-18T12:40:01.363