Deploying an LSTM Model



I have trained and validated my LSTM and would now like to deploy it. I know that a Keras Sequential model can be saved and loaded (I am working with Keras, as you can guess), so I have written code that uses these functions.

However, I would like to know whether I should retrain my model on all the available data (training + test) or keep the model trained only on the training set, as I did during my study. Many tutorials explain how to train a model, but few are clear about how to deploy one.

I would like to know what the common practice is, given that I am doing time series forecasting, which is a specific kind of problem.


Posted 2019-08-06T12:32:17.430

Reputation: 97



I have spent the past 16 years on Wall Street. Everything we do is based on time series data. You really should re-train the model as new data becomes available, or predicted results and actual results will quickly start to diverge. Also, you will probably use all data for training (you may not have enough data to set some aside for testing purposes). Check out the link below for more insight into how all of this works.

3 facts about time series forecasting that surprise experienced machine learning practitioners

In terms of deploying your model, it should be pretty straightforward. I did this just 1 day ago. I followed the instructions from the link below. Try that, and see how you get along.

Deploying scikit-learn Models at Scale | towardsdatascience


Posted 2019-08-06T12:32:17.430

Reputation: 341


Personally, I would deploy it as is, i.e. without retraining on all the data.

Once you train on new data, then you have a new model. You have no idea how this model is going to react to unseen test data coming in from the big wide world. You can’t validate or test it, because you’ve used all that data up in training.

Who knows what might happen!

As you receive more data and get feedback on the model you can create new train/test sets, fine tune the model, then redeploy it as required.
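A minimal sketch of what keeping a hold-out set for newly arrived data could look like. For a time series the split has to be chronological, so the held-out points are the most recent ones. The helper name `chronological_split` and the monthly values are made up for illustration; the actual fine-tuning call (for example with Keras) is left out.

```python
from typing import List, Sequence, Tuple


def chronological_split(
    series: Sequence[float], test_size: int
) -> Tuple[List[float], List[float]]:
    """Split a time-ordered series so the test set holds the most recent points."""
    if not 0 < test_size < len(series):
        raise ValueError("test_size must be between 1 and len(series) - 1")
    cut = len(series) - test_size
    return list(series[:cut]), list(series[cut:])


# Hypothetical batch of six new monthly observations, oldest first.
new_batch = [136.0, 119.0, 104.0, 118.0, 115.0, 126.0]

# Fine-tune on the older part of the new batch, evaluate on the newest part.
fine_tune_part, holdout_part = chronological_split(new_batch, test_size=2)
```

The point of the split is that the fine-tuned model can still be checked on data it has never seen before it is redeployed.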


Posted 2019-08-06T12:32:17.430

Reputation: 176

Glad I could help! I’ve edited the answer slightly to highlight how I would update the model later on. Don’t recreate a new train/test set from scratch, just use new data to help fine tune the model. – dijksterhuis – 2019-08-06T13:40:10.637

Oh, thanks. Now, a quick related question: can I try to automate this process with callback functions? For instance, let's assume that I deployed a given model for a monthly-sampled time series. If it is still working after 3 months in the production environment, can I retrain it automatically with a script using callback functions (early stopping, model checkpoints and so on), so that I do not need to put as much work into it as I did before its deployment? This would assume a static architecture (layers, dropout), but does it make sense to do so? Thanks – kakarotto – 2019-08-06T13:57:41.553

I would strongly advise that you make sure any changes are verified and looked at by a human before they are deployed, as it’s always good to double check that the model has not gone haywire. But yes, automation sounds like a sensible way to reduce your ongoing workload. Checkpoints are a good bet for keeping a static version for each month, for example. Then you can always roll back to last month's version if there are any issues. – dijksterhuis – 2019-08-06T14:10:04.023

Yes, that makes me aware of the need to document properly what is done, so that someone other than me can check and double check things even if some processes are automated. Thanks again :) I hope people will find your answer and upvote it. – kakarotto – 2019-08-06T15:31:17.787
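The workflow discussed in these comments (one saved version per month, a human check before deployment, rollback if something goes wrong) could be sketched roughly as follows. Everything here is hypothetical: the `ModelRegistry` class, the version labels, the error values and the 5% tolerance are illustrative only, and the actual saving of model files (e.g. via Keras checkpoints) is not shown.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ModelRegistry:
    """Tracks a hold-out error per monthly version and which version is live."""
    scores: Dict[str, float] = field(default_factory=dict)  # version -> hold-out error
    live_history: List[str] = field(default_factory=list)   # promoted versions, oldest first

    def register(self, version: str, holdout_error: float) -> None:
        self.scores[version] = holdout_error

    def promote(self, version: str, approved_by_human: bool,
                tolerance: float = 0.05) -> bool:
        """Promote only with human sign-off and an error not clearly worse than the live one."""
        if not approved_by_human:
            return False
        if self.live_history:
            live_error = self.scores[self.live_history[-1]]
            if self.scores[version] > live_error * (1 + tolerance):
                return False
        self.live_history.append(version)
        return True

    def rollback(self) -> Optional[str]:
        """Revert to the previously promoted version if one exists; return the live version."""
        if len(self.live_history) > 1:
            self.live_history.pop()
        return self.live_history[-1] if self.live_history else None


registry = ModelRegistry()
registry.register("2019-08", holdout_error=10.0)
first = registry.promote("2019-08", approved_by_human=True)

registry.register("2019-09", holdout_error=14.0)   # clearly worse: stays out of production
second = registry.promote("2019-09", approved_by_human=True)

registry.register("2019-10", holdout_error=9.5)
third = registry.promote("2019-10", approved_by_human=True)

live_after_rollback = registry.rollback()           # back to the August version
```

The retraining itself can run unattended; the gate only decides whether the retrained version replaces the one in production.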


A) Regarding which data to train on: always train your model on the training data and keep some data aside to evaluate performance on unseen data (the test data in this case).

B) Regarding deployment of a time series model, you need a pipeline in which you:
1) deploy your packaged model file (a pickle file, PMML, ONNX, an H2O packaged file, etc.) on your preferred platform (cloud or local machine);
2) get an API endpoint for the deployed instance, to which you can send prediction requests;
3) monitor the responses and compute error metrics (RMSE, MAPE, etc.) from the actual and predicted values, so that you can keep track of model performance;
4) retrain the model whenever new data becomes available. This matters especially for time series, because the future is unpredictable and the model has to keep up with trend and seasonality. Make sure the deployed model is always trained on the most recent data available.

The above is the deployment pipeline you need for a time series model in order to build a better machine learning system.
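The monitoring step can be sketched with plain Python, assuming you log each prediction and later match it with the actual value. The sample numbers and the `RMSE_ALERT_THRESHOLD` value are hypothetical; a sensible threshold depends on your data.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error, in the same unit as the series."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent; undefined if an actual value is 0."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical logged values for three periods.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 380.0]

current_rmse = rmse(actual, predicted)
current_mape = mape(actual, predicted)

# Assumed alert level: flag the model for retraining when the error drifts above it.
RMSE_ALERT_THRESHOLD = 12.0
needs_retraining = current_rmse > RMSE_ALERT_THRESHOLD
```

Tracking these numbers over time shows when predictions start to diverge from actual results, which is the signal to retrain.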

Sandeep Dharmavarapu

Posted 2019-08-06T12:32:17.430

Reputation: 11