I'm using Keras for multi-step-ahead forecasting of a univariate time series of floats. Judging by the results I get, the approach works well. There is, however, an important detail in the training process that baffles me:
Keras seems to require the length of the input sequences (X matrix) to equal the forecasting horizon (y matrix). For example, it needs input sequences of length 20 in order to forecast the next 20 time steps. My goal is to forecast as many time steps as I specify, given only the last 20 time steps. With the code below, this is not possible:
    import numpy as np
    from keras.layers.core import Dense, Activation, Dropout, TimeDistributedDense
    from keras.layers.recurrent import LSTM
    from keras.models import Sequential
    from keras.optimizers import RMSprop

    np.random.seed(1234)

    # Layer sizes: input features, first LSTM, second LSTM, output features
    layers = [1, 20, 40, 1]

    model = Sequential()
    model.add(LSTM(
        input_dim=layers[0],
        output_dim=layers[1],
        return_sequences=True))
    model.add(Dropout(0.3))
    model.add(LSTM(
        layers[2],
        return_sequences=True))
    model.add(Dropout(0.3))
    # Applies the same dense layer to every time step of the LSTM output
    model.add(TimeDistributedDense(
        output_dim=layers[3]))
    model.add(Activation("linear"))

    rms = RMSprop(lr=0.001)
    model.compile(loss="mse", optimizer=rms)

    # X_train and y_train are prepared elsewhere (not shown)
    model.fit(
        X_train, y_train,
        batch_size=512, nb_epoch=100, validation_split=0.1)
I get a dimension error whenever the sequence lengths of the sequences in X_train and in y_train differ from one another.
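To make the mismatch concrete, here is a minimal NumPy-only sketch of the shapes involved. It is not my real pipeline: `make_windows`, the sine-wave `series`, and the `W` matrix standing in for a per-time-step output layer are all hypothetical placeholders, and I'm assuming the usual (samples, timesteps, features) layout.

```python
import numpy as np

def make_windows(series, n_in, n_out):
    """Slice a 1-D series into (input window, target window) pairs.

    Hypothetical helper for illustration only.
    """
    n_samples = len(series) - n_in - n_out + 1
    X = np.stack([series[i:i + n_in] for i in range(n_samples)])
    y = np.stack([series[i + n_in:i + n_in + n_out] for i in range(n_samples)])
    # Add a trailing feature axis: (samples, timesteps, features)
    return X[..., np.newaxis], y[..., np.newaxis]

# Placeholder data, not the real series
series = np.sin(np.linspace(0.0, 20.0, 200))

# Case 1: horizon equals input length (this is what trains without error)
X_same, y_same = make_windows(series, n_in=20, n_out=20)

# Case 2: horizon differs from input length (this is what I actually want)
X_diff, y_diff = make_windows(series, n_in=20, n_out=5)

# Stand-in for a layer that emits one output per input time step
W = np.ones((1, 1))
pred = X_diff @ W

print("equal lengths:    ", X_same.shape, y_same.shape)
print("different lengths:", X_diff.shape, y_diff.shape)
print("per-step output matches target shape:", pred.shape == y_diff.shape)
```

In this sketch, anything that produces one output per input time step has 20 time steps in its output, so it can only be compared against a target that also has 20 time steps. That looks like the same kind of mismatch I hit when the lengths differ.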
What am I doing wrong and how can I fix it? Why are my results still pretty good?
The question is related to this.