
There are two parts to this question. Suppose we are looking at sales $S$ of a product across more than $1000$ stores where it sells. For each of these $1000$ stores we have 24 months of recorded data.

- We want to be able to predict $S_t \leftarrow f(S_{t-1})$. We could build an RNN for each store's time series, compute the test RMSE, and then average the results (after taking care of normalization, etc.). But the problem is that there are very few samples per time series. If we were to segment stores into groups (say, by Dynamic Time Warping), could we then create an analogue of text sentiment mining: just as two sentences in text are separated by a period, here two time series would be separated by a special symbol. In that case, we would train one RNN model on

$$Train_1 | Train_2 |\ldots|Train_t$$

data and predict on

$$Test_1 | Test_2 |\ldots|Test_t$$
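The concatenation idea above could be sketched as follows. This is a minimal NumPy example, not a full RNN pipeline: the sentinel value `SEP = -1.0` and the window length are assumptions (any value outside the normalized sales range would do), and windows that span a separator are skipped so the model never predicts across store boundaries.

```python
import numpy as np

SEP = -1.0  # assumed sentinel separating two stores' (normalized) series


def concat_with_separator(series_list):
    """Join per-store series into one long sequence, inserting SEP between them."""
    parts = []
    for i, s in enumerate(series_list):
        if i > 0:
            parts.append(np.array([SEP]))
        parts.append(np.asarray(s, dtype=float))
    return np.concatenate(parts)


def make_windows(seq, window=3):
    """Slide a fixed window over the sequence to build (input, target) pairs.

    Windows containing SEP are dropped, so no training example crosses
    the boundary between two stores.
    """
    X, y = [], []
    for i in range(len(seq) - window):
        chunk = seq[i:i + window + 1]
        if SEP in chunk:
            continue
        X.append(chunk[:-1])
        y.append(chunk[-1])
    return np.array(X), np.array(y)


# Two toy store series (already normalized)
train_seq = concat_with_separator([[0.1, 0.2, 0.3, 0.4],
                                   [0.5, 0.6, 0.7, 0.8]])
X, y = make_windows(train_seq, window=3)
print(X.shape, y.shape)  # (2, 3) (2,)
```

The resulting `X` (shaped for a Keras recurrent layer after adding a feature axis) contains only within-store windows; the same function applied to `Test_1 | Test_2 | \ldots` yields the evaluation set.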

- After this, we would like to set it up as a panel data problem where $S_t\leftarrow f(x_{t1},x_{t2},\ldots,x_{tn})$. In that case, should I build a separate neural network for each $t$ and then connect the hidden layers from $t \rightarrow t+1 \rightarrow t+2 \ldots$?
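A common alternative to one network per time step is to stack the panel into a single 3-D array of shape `(stores, timesteps, features)` and let one recurrent layer share its weights across all $t$, which is effectively the hidden-layer chaining described above. A minimal sketch of the reshaping, with made-up random data standing in for the real covariates $x_{t1},\ldots,x_{tn}$:

```python
import numpy as np

n_stores, n_months, n_features = 1000, 24, 5  # 5 covariates per month is an assumption

rng = np.random.default_rng(0)
X = rng.normal(size=(n_stores, n_months, n_features))  # covariates x_{t1}..x_{tn}
S = rng.normal(size=(n_stores, n_months))              # sales target S_t

# Many-to-one setup: use the covariates from months 1..23 of each store
# to predict that store's sales in month 24.
X_train, y_train = X[:, :-1, :], S[:, -1]
print(X_train.shape, y_train.shape)  # (1000, 23, 5) (1000,)
```

An array in this layout can be fed directly to a recurrent layer (e.g. Keras expects `input_shape=(timesteps, features)`), so all 1000 stores train one shared model instead of 24 separate networks.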

How should I implement these through packages like Keras/Theano/MXNet, etc.? Any help would be great!

Links can become broken in the future. You should avoid giving answers that are just a link. It is better to give a summary of the answer and then add the source. – Tasos – 2017-11-22T15:41:00.347