When to use Stateful LSTM?



I'm trying to use an LSTM on time-series data to generate future sequences that look like the original sequences in terms of values and progression direction. My approach is:

  • train an RNN to predict a value from the 25 past values, then use the model to recursively generate future predictions by appending each predicted value to the original sequence and shifting out the oldest values ...

Playing with LSTM cells, I found that the model is not able to learn to generate sequences that look like the original data. It only predicts the next value, then starts converging to an 'equilibrium' or static value which is the same whatever the input sequence is.

I'm wondering if a stateful LSTM would help the model learn better from past values and predict something close to what it has already seen. My goal here is to generate sequences that look like something the model has seen already.

Please let me know if I'm missing something, or if you have been in a similar situation and found a good approach to generating time-series sequences that look like what the model learned in the past.
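For reference, the recursive generation loop I describe above looks roughly like this (a minimal NumPy sketch; `predict_next` is a dummy stand-in where a trained model's `predict` call would go):

```python
import numpy as np

WINDOW = 25  # number of past values the model sees


def predict_next(window):
    # Dummy stand-in for a trained model's prediction.
    return window.mean()


def generate(seed, n_steps):
    """Recursively extend `seed` by feeding predictions back in."""
    history = list(seed)
    for _ in range(n_steps):
        window = np.array(history[-WINDOW:])  # last WINDOW values, shifting out the oldest
        history.append(predict_next(window))
    return history[len(seed):]


preds = generate(np.linspace(0.0, 1.0, 25), n_steps=5)
```

With a smoothing stand-in like the mean, this loop converges to a flat value quickly, which is essentially the behaviour I observe with the trained LSTM.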


Posted 2018-01-15T20:51:20.910

Reputation: 358

I don't have an answer but would also like to know this... My results also seem to converge to an equilibrium when predicting multiple steps into the future. Did you succeed in using a stateful LSTM for this? Ps. Can't comment due to new account w/ no rep – repoleved – 2018-03-31T23:48:09.050

I still have the issue with stateful LSTM – Hastu – 2018-04-02T01:11:31.990

I'm in the same boat. I have read that Mixture Density Networks solve this problem... – duhaime – 2018-10-19T22:09:20.613



As for the stateful LSTM and how to understand it, refer to here. Quoting an answer from there:

"I’m given a big sequence (e.g. Time Series) and I split it into smaller sequences to construct my input matrix X. Is it possible that the LSTM may find dependencies between the sequences?

No it’s not possible unless you go for the stateful LSTM. Most of the problems can be solved with stateless LSTM so if you go for the stateful mode, make sure you really need it. In stateless mode, long term memory does not mean that the LSTM will remember the content of the previous batches."

Therefore, stateful mode is useful when you want to carry the network's hidden state over from one batch to the next instead of resetting it.
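To make the stateful/stateless distinction concrete, here is a toy LSTM cell in plain NumPy (random, untrained weights, purely illustrative; this is not the Keras API). The only difference between the two modes is whether the state is zeroed between runs:

```python
import numpy as np

rng = np.random.default_rng(0)


class TinyLSTMCell:
    """Minimal LSTM cell to illustrate state handling (not a trained model)."""

    def __init__(self, n_in, n_hidden):
        # One weight matrix for all four gates, random for illustration.
        self.W = rng.normal(size=(4 * n_hidden, n_in + n_hidden))
        self.h = np.zeros(n_hidden)  # hidden state
        self.c = np.zeros(n_hidden)  # cell state

    def reset_states(self):
        # What stateless training effectively does between batches.
        self.h = np.zeros_like(self.h)
        self.c = np.zeros_like(self.c)

    def step(self, x):
        sig = lambda v: 1.0 / (1.0 + np.exp(-v))
        z = self.W @ np.concatenate([x, self.h])
        i, f, o, g = np.split(z, 4)  # input, forget, output gates + candidate
        self.c = sig(f) * self.c + sig(i) * np.tanh(g)
        self.h = sig(o) * np.tanh(self.c)
        return self.h


cell = TinyLSTMCell(n_in=1, n_hidden=4)
x = np.array([0.3])

# Stateful: the state from the first step persists into the second.
cell.step(x)
h_carried = cell.step(x).copy()

# Stateless: state is reset between "batches", so each step starts from zero.
cell.reset_states()
cell.step(x)
cell.reset_states()
h_reset = cell.step(x).copy()
```

`h_carried` and `h_reset` differ because only the stateful run lets the second step see the first step's state, which is exactly what carrying state across batches buys you.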

"My goal here is to generate sequences that looks like something that the model have seen already."

Then maybe a sequence-to-sequence (seq2seq) LSTM encoder-decoder is exactly what you need.
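A seq2seq model predicts a whole output window at once rather than one step recursively, so the data has to be arranged as (input window, target window) pairs. A minimal sketch of that slicing (function name and shapes are illustrative, not from any library):

```python
import numpy as np


def make_seq2seq_pairs(series, in_len, out_len):
    """Slice a 1-D series into (encoder_input, decoder_target) windows."""
    enc, dec = [], []
    for start in range(len(series) - in_len - out_len + 1):
        enc.append(series[start : start + in_len])            # what the encoder reads
        dec.append(series[start + in_len : start + in_len + out_len])  # what the decoder emits
    return np.array(enc), np.array(dec)


series = np.arange(10, dtype=float)
enc, dec = make_seq2seq_pairs(series, in_len=4, out_len=2)
# enc[0] -> [0, 1, 2, 3], dec[0] -> [4, 5]
```

Because the decoder is trained to emit the whole continuation, it cannot collapse to repeating a single equilibrium value the way a one-step recursive loop can.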



Reputation: 3,275


See this post: https://stackoverflow.com/questions/47594861/predicting-a-multiple-time-step-forward-of-a-time-series-using-lstm

You should be training on an X shifted by one step to produce your Y, then loading the resulting weights.h5 and predicting as shown there.



Reputation: 149