Predicting a time series with few historical samples, based on similar series

4

3

I'm trying to build a model with Keras to predict the time series of a sensor, based on its type and historic data of sensors of the same type.

The figure below shows 3 time series generated from 3 sensors of the same type. The green dashed line is the new sensor's data, and the vertical line marks where the data for the new sensor ends.

[Figure: time series from sensors of the same type, with the new sensor shown as a green dashed line]

I tried an LSTM network that returns the hidden-state output for each input time step, with the value at each timestamp as the target. I then tried to predict the new time series by giving the model a few points of the new sensor's history, with no luck :(

So I'm guessing I'm on the wrong path. What are the options for predicting a time series from just a few historical samples, based on the history of other time series of the same type?

Any help / reference / video would be appreciated.

Shlomi Schwartz

Posted 2020-04-28T15:41:36.640

Reputation: 41

You can predict growth based on time since start and other parameters. Then you could apply growths to every point in time and get a curve. – keiv.fly – 2020-04-28T16:24:47.160

What is the size of your dataset? – Leevo – 2020-05-06T09:38:19.647

What do you mean by "With no luck"? Did you get bad accuracy? How bad? Did your model overfit? Did you fail to implement the model? – noe – 2020-05-06T13:07:55.080

Answers

2

I'd suggest statistical forecasting techniques such as ARIMA or exponential smoothing (ES), but those models usually do not generalize well across time series, so you'd need one model per series.
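To make the "one model per series" point concrete, here is a minimal sketch of simple exponential smoothing written in plain Python. It is only an illustration: the sensor names, values, and the smoothing factor `alpha=0.5` are made up, and a real project would more likely use a library implementation and tune `alpha` per series.

```python
def simple_exp_smoothing(values, alpha):
    """Return the smoothed level after each observation."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    level = values[0]          # initialise the level with the first observation
    levels = [level]
    for x in values[1:]:
        # new level = weighted average of the new observation and the old level
        level = alpha * x + (1.0 - alpha) * level
        levels.append(level)
    return levels


def forecast_flat(values, alpha, horizon):
    """Simple ES has no trend term, so every future step equals the last level."""
    last_level = simple_exp_smoothing(values, alpha)[-1]
    return [last_level] * horizon


# Hypothetical data: each sensor gets its own, independently fitted model.
sensors = {
    "sensor_a": [10.0, 12.0, 11.0, 13.0],
    "sensor_b": [5.0, 5.5, 6.5, 6.0],
}
forecasts = {
    name: forecast_flat(vals, alpha=0.5, horizon=3)
    for name, vals in sensors.items()
}
print(forecasts)
```

Nothing learned from `sensor_a` is reused for `sensor_b` here, which is why this family of models struggles when a new sensor has only a handful of points.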

A good starting point for using LSTMs for forecasting is here: https://www.tensorflow.org/tutorials/structured_data/time_series. But if you don't have enough data, NNs would likely produce poor test results.

For your case, I'd suggest trying a regression approach: restructure your time series into a regression-style feature table. After that you can use regression models from sklearn, starting with linear ones and moving to more complex ones. Because you have little data, explore less complex models first to limit overfitting. For the features you could, for example, create lag features (the value of your signal two timesteps back, the mean/std of the past 2 timesteps, the maximum over the last 6 timesteps, and so on). Look up any Kaggle competition on time series forecasting if you want reference code or specific ideas on feature extraction.
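As a rough sketch of the lag-feature idea, the snippet below builds the features mentioned above with NumPy, pools the rows from several sensors of the same type, and fits an ordinary least-squares model. The synthetic data, the helper names, and the choice of `np.linalg.lstsq` are all assumptions for illustration; an sklearn linear regressor could be used in the same place.

```python
import numpy as np


def window_features(window):
    """Features from the most recent values (oldest first, newest last)."""
    return [
        window[-2],          # value two timesteps back
        window[-2:].mean(),  # mean of the past 2 timesteps
        window[-2:].std(),   # std of the past 2 timesteps
        window.max(),        # maximum over the whole window
    ]


def make_lag_features(series, n_lags=6):
    """Turn one series into (X, y): one row per predictable timestep."""
    series = np.asarray(series, dtype=float)
    rows, targets = [], []
    for t in range(n_lags, len(series)):
        rows.append(window_features(series[t - n_lags:t]))
        targets.append(series[t])
    return np.array(rows), np.array(targets)


def fit_linear(X, y):
    """Ordinary least squares with an intercept column."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, X):
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef


# Made-up history: three older sensors of the same type (trend plus noise).
rng = np.random.default_rng(0)
history = [np.arange(50) * 0.5 + rng.normal(0, 0.3, 50) for _ in range(3)]

# Pool the rows of all older sensors into one training table.
parts = [make_lag_features(s) for s in history]
X_train = np.vstack([X for X, _ in parts])
y_train = np.concatenate([y for _, y in parts])
coef = fit_linear(X_train, y_train)

# The new sensor has only a few points; predict its next value.
new_sensor = np.arange(8) * 0.5 + rng.normal(0, 0.3, 8)
x_next = np.array([window_features(new_sensor[-6:])])
print(predict_linear(coef, x_next))
```

Because the model is trained on windows rather than on whole series, a new sensor only needs as many points as the longest lag (6 here) before it can be scored.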

Sid

Posted 2020-04-28T15:41:36.640

Reputation: 577

0

Accurate time series predictions might not be possible for this data.

The data appears to be non-stationary (i.e., its distributional properties change) at the vertical line. Training on data before that time point might not yield accurate predictions for time points after it.

Additionally, the yellow, blue, and red lines are extremely noisy after the vertical bar (possibly even being a random walk).
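Both concerns can be checked informally before any modelling. The sketch below is an assumption-laden illustration, not a formal test: it compares mean and standard deviation on either side of a split point, and uses lag-1 autocorrelation to see whether a series behaves like a random walk (levels highly autocorrelated, first differences close to uncorrelated). The data is synthetic and the helper names are hypothetical.

```python
import numpy as np


def split_summary(series, split_index):
    """Mean and std before and after a split point (e.g. the vertical line)."""
    series = np.asarray(series, dtype=float)
    before, after = series[:split_index], series[split_index:]
    return {
        "mean_before": before.mean(),
        "mean_after": after.mean(),
        "std_before": before.std(),
        "std_after": after.std(),
    }


def lag1_autocorr(x):
    """Sample autocorrelation at lag 1 (0.0 for a constant series)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    if denom == 0:
        return 0.0
    return float(np.dot(x[:-1], x[1:]) / denom)


# Synthetic random walk: cumulative sum of independent noise.
rng = np.random.default_rng(0)
walk = np.cumsum(rng.normal(0, 1, 1000))

print(split_summary(walk, 500))
print("levels:", lag1_autocorr(walk))                  # close to 1 for a random walk
print("differences:", lag1_autocorr(np.diff(walk)))    # close to 0 for a random walk
```

If the differences of a real sensor series look like uncorrelated noise, the best forecast is roughly "the last observed value", and no amount of model complexity is likely to improve on that. A formal unit-root test such as the augmented Dickey-Fuller test would be the more rigorous follow-up.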

Brian Spiering

Posted 2020-04-28T15:41:36.640

Reputation: 10 864