Forecasting multiple (a few hundred) univariate time series with inflated zeros



Hello Practitioners,

I am a newbie seeking help to gain experience in data science.

Let's take a scenario where a big company wants to forecast sales of a specific product across different stores in different geographic locations. As an analyst, I am given the task of forecasting a few hundred sales series for the next 3 months. Since we are forecasting sales across different geographic locations, the nature of the series will not be the same for all of them, so there would be hundreds of models to check. What are the suggested approaches for this scenario with your experience in this field? Also, how important is it to know the nature of each series in this scenario?


Posted 2018-03-23T08:57:42.450

Reputation: 81

An LSTM can outperform other solutions, but if you have the time, it is always good to try others, such as ARIMA (autoregressive integrated moving average), for comparison. If you think the underlying time series is seasonal, plain ARIMA may not work for you unless you use a seasonal variant. An LSTM is more flexible and can work well with seasonal and trending datasets. – Donald S – 2020-06-14T13:05:26.417



  1. If you are an R user, I can suggest the auto.arima function from the forecast library; if you are a Python user, follow this link. All you need is a simple for loop, which lets you build the best ARIMA model for each geographic location:

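       library(forecast)  # provides auto.arima
       models <- list()   # one fitted model per series
       for (i in seq_along(set_of_all_time_series))
            models[[i]] <- auto.arima(set_of_all_time_series[[i]])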
  2. You can cluster your time series by correlation (make sure that your time series are stationary, to avoid spurious correlation). If this reduces the number of time series (which depends on the correlation threshold), you can take any one member from each cluster, build any model (not only ARIMA) on it, and apply that model to every member of the cluster.

  3. Construct a VAR (vector autoregression) model.
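
Point 2 could be sketched roughly as follows, in plain Python for illustration. The first-differencing step, the 0.8 threshold, the greedy grouping rule, and the toy store data are all assumptions of this sketch, not something prescribed above; a real analysis would test stationarity properly and might prefer hierarchical clustering.

```python
from math import sqrt


def diff(series):
    """First differences, a common (not guaranteed) way to remove a trend."""
    return [b - a for a, b in zip(series, series[1:])]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def cluster_by_correlation(series_by_name, threshold=0.8):
    """Greedy grouping: a series joins the first cluster whose representative
    (its first member) correlates with it at or above `threshold`;
    otherwise it starts a new cluster."""
    diffs = {name: diff(s) for name, s in series_by_name.items()}
    clusters = []  # each cluster is a list of names; element 0 is the representative
    for name, d in diffs.items():
        for cluster in clusters:
            if pearson(diffs[cluster[0]], d) >= threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters


# Toy data, made up for illustration: two stores move together, one does not.
stores = {
    "store_a": [10, 12, 11, 15, 14, 18, 17, 21],
    "store_b": [20, 24, 22, 30, 28, 36, 34, 42],  # exactly 2 * store_a
    "store_c": [5, 4, 6, 3, 7, 2, 8, 1],
}
clusters = cluster_by_correlation(stores)
print(clusters)
```

You would then fit one model on the first member of each cluster and reuse it for the rest. Note that correlated series can still differ in scale (store_b sells twice as much as store_a here), so the forecasts may need rescaling per store.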


Posted 2018-03-23T08:57:42.450

Reputation: 11


What are the suggested approaches for this scenario with your experience in this field?

Another very popular approach (apart from @user112358's suggestion) is to use neural networks, particularly LSTM-RNNs, because of their inherent "memory" capabilities. Recurrent neural networks are a very good candidate when dealing with time series, such as forecasting product sales, because their recurrent state lets them model the temporal dynamics of a system directly.

A very informative tutorial on how to rapidly prototype such an algorithm can be found here, targeting the Keras API using Python. I highly recommend that you check it out, because it personally helped me a lot. It is also applied to a shampoo sales dataset, which is close to what you are looking for as a case study.
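
Before an LSTM can be trained, the series has to be reframed as a supervised learning problem. A minimal sketch of that step, assuming a simple sliding window (the helper name `make_windows`, the window length of 3, and the sales values are made up for illustration and are not taken from the tutorial):

```python
def make_windows(series, n_lags):
    """Turn a series into (inputs, target) pairs: n_lags past values -> next value."""
    X, y = [], []
    for t in range(n_lags, len(series)):
        X.append(series[t - n_lags:t])  # the n_lags observations before time t
        y.append(series[t])             # the value to predict
    return X, y


sales = [3, 0, 0, 5, 2, 0, 4]  # made-up monthly sales with some zeros
X, y = make_windows(sales, n_lags=3)
for inputs, target in zip(X, y):
    print(inputs, "->", target)
```

Keras LSTM layers expect input shaped as (samples, timesteps, features), so each window here would become a block of 3 timesteps with 1 feature. With a few hundred series you can either train one network per store or pool the windows from all stores into one training set; which works better is something to check on held-out data.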


Posted 2018-03-23T08:57:42.450

Reputation: 3 275