I have a dataset consisting of recurring and non-recurring expense transactions from bank accounts, as well as other features describing the bank account and each transaction. I aggregate these transactions per account by summing all transactions per day, so that I have a total expense amount per day. In other words, for each customer I have a time series for each account spanning several years. Each time series consists of daily expense amounts (plus other features about that day and the account).
I now need to forecast the expense amount for an account for each of the next 7 days. My initial attempt was to feed a sequence of the single daily-expense-amount feature into an LSTM. I predict the value at t+1, then feed that prediction back in as the most recent input to predict t+2, and repeat this 7 times to obtain a 7-day forecast.
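For context, the recursive strategy described above can be sketched as follows. This is a minimal illustration, not my actual code: `model` here is a stand-in for the trained one-step-ahead LSTM, and the window length and toy model are arbitrary assumptions.

```python
import numpy as np

def recursive_forecast(model, history, horizon=7):
    """Roll a one-step-ahead model forward `horizon` steps by feeding
    each prediction back in as the newest observation."""
    window = list(history)  # most recent daily expense amounts
    forecasts = []
    for _ in range(horizon):
        # Shape expected by a Keras-style LSTM: (batch, timesteps, features)
        x = np.asarray(window).reshape(1, len(window), 1)
        y_hat = float(model(x))          # one-step-ahead prediction
        forecasts.append(y_hat)
        window = window[1:] + [y_hat]    # slide the window forward one day
    return forecasts

# Stand-in for a trained LSTM: just predicts the mean of the window.
toy_model = lambda x: x.mean()

preds = recursive_forecast(toy_model, [10.0, 12.0, 14.0], horizon=7)
print(len(preds))  # 7
```

One known drawback of this strategy is error accumulation: each predicted value becomes an input for the next step, so mistakes compound over the 7-day horizon.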
The results seem OK for the most part, but not great. I would like to explore more complex modelling to see whether I can improve on this initial benchmark model.
This is the best experiment I have come across:
Would an encoder-decoder seq2seq LSTM suit this problem? What about a CNN-LSTM or a ConvLSTM? I would also like to use some of the additional input features of the data to enrich the inputs used to predict the following 7 days of expense values. This would now be a multivariate, multi-step forecast.
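For the multivariate, multi-step setup, the main preprocessing change is that each training sample becomes a window of past days with several features per day, paired with the next 7 expense amounts as a vector target (which an encoder-decoder seq2seq model can then predict in one shot instead of recursively). A rough sketch of that windowing, where the 28-day input window, the 3 extra features, and the array sizes are all illustrative assumptions:

```python
import numpy as np

def make_windows(series, features, n_in=28, n_out=7):
    """Slice a daily expense series (plus per-day features) into supervised
    pairs: X has shape (samples, n_in, 1 + n_features) and y has shape
    (samples, n_out), a layout usable by an encoder-decoder model."""
    data = np.column_stack([series, features])  # expense amount + extra features
    X, y = [], []
    for i in range(len(series) - n_in - n_out + 1):
        X.append(data[i : i + n_in])                  # 28 days of multivariate input
        y.append(series[i + n_in : i + n_in + n_out]) # next 7 expense amounts
    return np.array(X), np.array(y)

rng = np.random.default_rng(0)
series = rng.random(100)         # daily expense amounts (dummy data)
features = rng.random((100, 3))  # e.g. day-of-week, account attributes (dummy)
X, y = make_windows(series, features)
print(X.shape, y.shape)  # (66, 28, 4) (66, 7)
```

Predicting the 7-day vector directly avoids the error accumulation of the recursive approach, at the cost of a more complex model.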