Recommended model for univariate or multivariate multistep ahead time series forecasting



I have a dataset consisting of recurring and non-recurring expense transactions from bank accounts, as well as other features describing the bank account and each transaction. I aggregate these transactions per account by summing all transactions per day, so that I have a total expense amount per day. In other words, for each customer I have a time series for each account spanning several years. Each time series consists of daily expense amounts (and other features about the day and the account).
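For concreteness, the daily aggregation described above might look roughly like the sketch below. The column names `account_id`, `timestamp` and `amount` are assumptions for illustration (the actual schema is not given), and days with no transactions are assumed to count as an expense of 0.

```python
import pandas as pd

def daily_expenses(tx: pd.DataFrame) -> pd.DataFrame:
    """Sum transactions per account per calendar day.

    Returns a wide frame: one row per day, one column per account.
    Column names are hypothetical; adapt them to the real schema.
    """
    tx = tx.copy()
    # Drop the time-of-day part so all transactions on a day share one key.
    tx["date"] = pd.to_datetime(tx["timestamp"]).dt.normalize()
    daily = tx.groupby(["account_id", "date"])["amount"].sum().unstack("account_id")
    # Make the index a gap-free daily range; days without spending become 0.
    full_range = pd.date_range(daily.index.min(), daily.index.max(), freq="D")
    daily = daily.reindex(full_range).fillna(0.0)
    daily.index.name = "date"
    return daily

# Tiny made-up example: two accounts, three days.
tx = pd.DataFrame({
    "account_id": ["A", "A", "A", "B"],
    "timestamp": ["2018-01-01 10:00", "2018-01-01 15:00", "2018-01-03 09:00", "2018-01-02 12:00"],
    "amount": [5.0, 7.0, 2.0, 4.0],
})
daily = daily_expenses(tx)
```

Note that this aligns all accounts on a shared date range, so an account opened later gets zeros before its first transaction; trimming each series to its own start date may be preferable.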

I now need to forecast the expense amount for an account for each of the next 7 days. My initial attempt was to feed a univariate sequence of daily expense amounts into an LSTM. I predict the value at t+1, then use it as the most recent input to predict t+2. I repeat this 7 times to get a 7-day forecast.
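The recursive scheme just described can be sketched independently of the model. In the snippet below, `predict_one_step` stands in for the trained LSTM; `mean_model` is a hypothetical placeholder used only so the example runs, and the window length of 3 is arbitrary.

```python
import numpy as np

def recursive_forecast(predict_one_step, history, window, horizon=7):
    """Roll a one-step-ahead model forward `horizon` times.

    Each prediction is appended to the input buffer and treated as if it
    were an observed value when predicting the following day.
    """
    buf = list(np.asarray(history, dtype=float)[-window:])
    preds = []
    for _ in range(horizon):
        x = np.array(buf[-window:])          # most recent `window` values
        y_hat = float(predict_one_step(x))   # forecast for the next day
        preds.append(y_hat)
        buf.append(y_hat)                    # feed the prediction back in
    return np.array(preds)

# Placeholder for a trained one-step model (hypothetical): window mean.
def mean_model(window_values):
    return window_values.mean()

forecast = recursive_forecast(mean_model, history=[10, 12, 8, 11, 9, 13, 10],
                              window=3, horizon=7)
```

One known drawback of this approach is that errors compound: from the second step onward the model consumes its own predictions rather than real observations.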

The results seem ok for the most part, but not great. I would like to explore more complex modelling to see if I can improve on this initial benchmark model.

This is the best experiment I have come across:

Would an encoder-decoder seq2seq LSTM suit this problem? What about a CNN-LSTM or ConvLSTM? I would also like to use some of the additional input features to enrich the data used to predict the following 7 daily expense values. This would then be a multivariate, multistep forecast.
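Whichever architecture is chosen, a multivariate, multistep setup usually starts by slicing the series into input windows of shape (samples, timesteps, features) and targets of shape (samples, 7). Below is a minimal sketch of that windowing step; the 30-day input length, the `day_of_week` feature and the synthetic data are assumptions for illustration only.

```python
import numpy as np

def make_windows(features, target, n_in=30, n_out=7):
    """Slice a multivariate series into (input window, next n_out targets) pairs.

    features: array of shape (days, n_features)
    target:   array of shape (days,), the daily expense amount
    Returns X of shape (samples, n_in, n_features) and y of shape (samples, n_out).
    """
    features = np.asarray(features, dtype=float)
    target = np.asarray(target, dtype=float)
    X, y = [], []
    last_start = len(target) - n_in - n_out
    for start in range(last_start + 1):
        X.append(features[start:start + n_in])                # past n_in days, all features
        y.append(target[start + n_in:start + n_in + n_out])   # following n_out amounts
    return np.array(X), np.array(y)

# Synthetic stand-in data (not real expenses): 100 days, 2 features.
days = 100
rng = np.random.default_rng(0)
amount = rng.gamma(2.0, 20.0, size=days)
day_of_week = np.arange(days) % 7
features = np.column_stack([amount, day_of_week])

X, y = make_windows(features, amount, n_in=30, n_out=7)
```

With targets shaped this way, a model can emit all 7 values in one pass (a direct multistep forecast) instead of feeding its own predictions back in.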


Posted 2018-12-19T11:59:44.973

Reputation: 189

I'd recommend Conv2D-LSTM, which I think suits this kind of time-series problem best, since you want a time-distributed prediction (a 7-day sequential forecast), which this kind of structure handles by its nature. Moreover, you can make your LSTM bidirectional, which allows it to learn the sequence in both the forward and backward directions, although it can be computationally expensive combined with Conv2D or ConvD. – Ugur MULUK – 2018-12-19T13:42:38.957

I am not too familiar with bidirectional LSTMs, but my understanding was that they use both previous and following data from the sequence to best approximate the current value. If I am using my model in inference today, and I am attempting to forecast the expense amounts for the coming 7 days, would a bidirectional LSTM not need access to the expense data following the next 7 days? (which of course is not possible as these days are in the future) – KOB – 2018-12-19T13:48:47.327

Let's look at what a classical one-layer LSTM does in forward propagation: it takes the first part of the sequence, makes a prediction, and feeds the prediction forward. The second one takes that prediction and the second part of the sequence, makes a prediction, and so on until the last recurrent neuron. Good part: neurons are fed the previous neurons' predictions. Bad part: the first neuron does not get any prediction. In a bidirectional LSTM, you do the same thing from the last recurrent neuron to the first; you do not actually need to know the future. – Ugur MULUK – 2018-12-19T14:03:54.993
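To illustrate the point about not needing future data: in a bidirectional recurrent layer, both directions read the same observed input window, one from oldest to newest and the other from newest to oldest. The sketch below uses a plain tanh RNN with random, untrained weights instead of a real LSTM (an assumption made to keep it short); only the data flow is meant to be representative.

```python
import numpy as np

def rnn_pass(window, W_in, W_h):
    """Run a plain tanh RNN over the window and return the final hidden state."""
    h = np.zeros(W_h.shape[0])
    for x_t in window:
        h = np.tanh(W_in @ x_t + W_h @ h)
    return h

def bidirectional_encode(window, fwd, bwd):
    """Encode an observed window in both directions and concatenate the states."""
    h_fwd = rnn_pass(window, *fwd)        # oldest -> newest observed day
    h_bwd = rnn_pass(window[::-1], *bwd)  # newest -> oldest observed day
    return np.concatenate([h_fwd, h_bwd])

rng = np.random.default_rng(0)
n_features, hidden = 1, 4
# Random untrained weights, for illustration only.
fwd = (rng.normal(size=(hidden, n_features)), rng.normal(size=(hidden, hidden)))
bwd = (rng.normal(size=(hidden, n_features)), rng.normal(size=(hidden, hidden)))

window = rng.normal(size=(30, n_features))  # 30 already-observed days
encoding = bidirectional_encode(window, fwd, bwd)
```

Nothing beyond the last observed day enters either pass; the 7 future values would be produced from `encoding` by a separate output layer or decoder.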

Hi @UgurMULUK I am still having trouble understanding and visualizing how the architecture of the Conv2D-LSTM would work with my data. Perhaps we could speak in private? – KOB – 2018-12-20T09:06:33.730

Have a look at and, if that is not enough, you can message me privately; I can send an example from my code and try to explain it. – Ugur MULUK – 2018-12-20T12:54:38.180

@KOB, did you finally understand the architecture of ConvLSTM? I have data for time series classification. Can I also use it in my model, just like an LSTM? Or is ConvLSTM only for 2D data? I will be using Conv1D in my model as well. – Faaiz Qadri – 2020-08-20T08:28:28.587

No answers