For sequence-to-sequence models, many of the tutorials I have read state that the decoder target length should be the same as the encoder input length (https://blog.keras.io/building-autoencoders-in-keras.html, https://blog.keras.io/a-ten-minute-introduction-to-sequence-to-sequence-learning-in-keras.html). Often a separate model is then built for inference. How important is it for the accuracy of the model that the decoder target length actually matches the encoder input length?
In my setup I have many variable-length sequences. Any sequence that has missing values, or is simply shorter than the longest sequence in the dataset, is padded to the length of the longest sequence, and the padded positions are filled with a mask value. With the Masking layer in Keras, these masked points are ignored in the loss and in the weight updates.
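To make the padding and masking concrete, here is a minimal plain-Python sketch of the effect I am relying on. It is not how Keras implements masking internally. The names `MASK_VALUE`, `pad_sequences_to_longest` and `masked_mse` are made up for illustration, and the sentinel `-1.0` is an assumption (it must never occur in the real data).

```python
MASK_VALUE = -1.0  # assumed sentinel; must not occur in the real data


def pad_sequences_to_longest(sequences, mask_value=MASK_VALUE):
    """Right-pad every sequence to the length of the longest one."""
    max_len = max(len(seq) for seq in sequences)
    return [list(seq) + [mask_value] * (max_len - len(seq)) for seq in sequences]


def masked_mse(y_true, y_pred, mask_value=MASK_VALUE):
    """Mean squared error over timesteps whose target is not the mask value."""
    errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred) if t != mask_value]
    return sum(errors) / len(errors) if errors else 0.0


sequences = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0]]
padded = pad_sequences_to_longest(sequences)
# padded == [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, -1.0, -1.0]]

# The prediction has the full padded length, but only the first two
# timesteps of the short series contribute to the loss.
prediction = [5.0, 7.0, 9.0, 9.0]
loss = masked_mse(padded[1], prediction)
# loss == 0.5
```

In the real model the equivalent step would be a `Masking(mask_value=...)` layer in front of the recurrent layers, assuming the mask actually propagates through to the loss in the architecture used.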
However, this means that if I make the decoder target length the same as the encoder input length, the decoder will try to reproduce a shorter, masked series at the full padded input length, which could be much greater than the actual length of that series. I am not sure whether this would cause a problem.
I therefore wonder whether it would be better to set the decoder target length to the mode of the lengths of the series I have. Or, in fact, if I know that I always want to predict H steps into the future, why not train the initial model with a decoder target length of H instead of the encoder input length?
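For reference, this is roughly what I mean by a decoder target length of H that is independent of the encoder input length. It is only a sketch of how the training pairs would be built, in plain Python. `make_horizon_pairs`, `encoder_len` and `horizon` are hypothetical names, and it says nothing about whether this setup trains better.

```python
def make_horizon_pairs(series, encoder_len, horizon):
    """Slide over one series and pair each encoder window of length
    `encoder_len` with the `horizon` values that directly follow it."""
    pairs = []
    last_start = len(series) - encoder_len - horizon
    # If the series is too short, the range is empty and no pairs are made.
    for start in range(last_start + 1):
        split = start + encoder_len
        encoder_input = series[start:split]
        decoder_target = series[split:split + horizon]
        pairs.append((encoder_input, decoder_target))
    return pairs


series = [10, 11, 12, 13, 14, 15, 16]
pairs = make_horizon_pairs(series, encoder_len=4, horizon=2)
# pairs == [([10, 11, 12, 13], [14, 15]), ([11, 12, 13, 14], [15, 16])]
```

Here every decoder target has length 2 while every encoder input has length 4, so the two lengths are decoupled by construction.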