Multiple-output vs single-output NNs


I'm trying to build a 5-input, 5-output model using an LSTM, where the outputs are the same features as the inputs, predicted at future time steps.

My question is: is it better to build 5 models, each taking the same 5 inputs but predicting just 1 of the 5 sequences, or is that equivalent to building 1 model that predicts all 5 sequences? In other words, will the accuracy per predicted sequence be higher with 5 separate models, or will it be the same as with 1 model with 5 outputs?

The reason for my confusion is that, in the multiple-output model, the hidden layer is shared by all outputs; so how would the algorithm go about optimizing the weights so as to minimize the error for all output sequences at once?


Posted 2018-09-18T21:33:31.160




In general, you cannot know which is better until you try both on plenty of data and compare them with a sound statistical test. I know you won't like this answer, but this is how neural networks work. Hope this helps.


Posted 2018-09-18T21:33:31.160



I assume your situation is like this: you receive the first input and try to predict the second. Then you receive the true second input and try to predict the third, using both the first and the second, and so on.

I think training 5 separate neural networks is needlessly complicated, since you would have to do everything five times: build 5 training and test sets, train 5 networks, and then evaluate 5 networks.

About using an RNN: each input you feed to the network changes its hidden state, i.e. after you enter the second input the hidden state differs from the previous state, so the output will be different. Regarding the optimization, frameworks such as Theano and PyTorch include RNN implementations and compute the gradients for you, so you don't have to worry about it.
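To make the optimization point a bit more concrete, here is a minimal sketch in plain Python (no framework). It assumes a toy model with a single shared hidden weight `w` feeding five linear outputs, and a loss that is simply the sum of the squared errors of the five outputs; all names and numbers are made up for illustration. Under those assumptions, the gradient on the shared weight is the sum of the gradients contributed by each output, which is how one set of hidden weights ends up being tuned for all outputs at once.

```python
# Toy model (illustrative assumption): one shared hidden unit, five linear outputs.

def forward(w, v, x):
    h = w * x                      # shared hidden activation
    return [v_k * h for v_k in v]  # one prediction per output


def total_loss(w, v, x, targets):
    # Single scalar loss: sum of squared errors over all five outputs.
    preds = forward(w, v, x)
    return sum((p - t) ** 2 for p, t in zip(preds, targets))


def per_output_grads_w(w, v, x, targets):
    # d/dw of (v_k * w * x - t_k)^2  =  2 * (v_k * w * x - t_k) * v_k * x
    preds = forward(w, v, x)
    return [2.0 * (p - t) * v_k * x for p, t, v_k in zip(preds, targets, v)]


w = 0.5                                # shared hidden weight
v = [1.0, -2.0, 0.5, 3.0, 1.5]         # one output weight per predicted feature
x = 2.0                                # a single scalar input, for simplicity
targets = [1.0, 0.0, 2.0, -1.0, 0.5]   # made-up targets

grads = per_output_grads_w(w, v, x, targets)
shared_grad = sum(grads)  # gradient of the total loss with respect to w

# Sanity check with a central finite difference on the total loss.
eps = 1e-6
numeric_grad = (total_loss(w + eps, v, x, targets)
                - total_loss(w - eps, v, x, targets)) / (2 * eps)

print("per-output contributions:", grads)
print("summed gradient on shared weight:", shared_grad)
print("finite-difference estimate:", numeric_grad)
```

A real LSTM has many more weights, but the same principle applies when the framework backpropagates a single scalar loss. Outputs with larger errors pull harder on the shared weights, which is one reason it can help to scale the features comparably.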

Finally, since your input has a maximum length, you could try training a regular feed-forward NN whose input size is five times the size of a single input, and pad it (with zeros, or with another value that cannot occur in the real inputs) whenever fewer than five elements are available.
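As a rough sketch of that padding idea, assuming at most 5 time steps of 5 features each and zero as the padding value (the helper `pad_and_flatten` and the constants are hypothetical names, and padding at the end rather than the front is an arbitrary choice):

```python
MAX_STEPS = 5     # assumed maximum number of time steps seen so far
N_FEATURES = 5    # assumed number of features per time step
PAD_VALUE = 0.0   # assumed not to occur in the real inputs


def pad_and_flatten(history, max_steps=MAX_STEPS,
                    n_features=N_FEATURES, pad_value=PAD_VALUE):
    """Turn a variable-length history into one fixed-size flat input vector."""
    if len(history) > max_steps:
        raise ValueError("history is longer than max_steps")
    for step in history:
        if len(step) != n_features:
            raise ValueError("every step must have n_features values")

    # Concatenate the observed steps, then fill the remainder with padding.
    flat = [value for step in history for value in step]
    n_missing = (max_steps - len(history)) * n_features
    return flat + [pad_value] * n_missing


# Two observed steps out of a possible five.
history = [
    [0.1, 0.2, 0.3, 0.4, 0.5],
    [0.6, 0.7, 0.8, 0.9, 1.0],
]
vector = pad_and_flatten(history)
print(len(vector))
print(vector)
```

The resulting vector always has `MAX_STEPS * N_FEATURES` entries, so it can be fed to a network with a fixed input layer regardless of how many steps have been observed.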

Hope this helps. As the other answer said, you cannot know in advance which solution will work better, since that depends heavily on your data, but some solutions are faster and easier to implement.

Sebastian Amenabar

Posted 2018-09-18T21:33:31.160
