Classify multivariate time series



I have a dataset of time series, each with 8 time points and about 40 dimensions (so each time series is 8 by 40). The corresponding output (the class label) is either 0 or 1.

What would be the best approach to designing a classifier for time series with multiple dimensions?

My initial strategy was to extract features from those time series: mean, standard deviation, and maximum variation for each dimension. I used the resulting dataset to train a random forest. I am aware of how naive this is, and after obtaining poor results I am now looking for a better model.
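For reference, the per-dimension summary features described above could be computed roughly as follows. This is a minimal sketch with NumPy on synthetic data; the function name `summary_features` is hypothetical, and "maximum variation" is interpreted as the largest absolute step-to-step change, which is an assumption since the post does not define it.

```python
import numpy as np

def summary_features(series):
    """Map (n_samples, n_steps, n_dims) to (n_samples, 3 * n_dims)."""
    mean = series.mean(axis=1)
    std = series.std(axis=1)
    # Assumption: "maximum variation" = largest absolute change
    # between two consecutive time points, per dimension.
    max_var = np.abs(np.diff(series, axis=1)).max(axis=1)
    return np.concatenate([mean, std, max_var], axis=1)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8, 40))  # synthetic stand-in for the real data
features = summary_features(X)
print(features.shape)  # (100, 120)
```

Each series is reduced from 320 raw values to 120 summary values, so any information about the ordering of the 8 time points beyond the step-to-step changes is lost.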

My leads are the following: classify the series for each dimension (using a KNN algorithm and DWT), reduce the dimensionality with PCA, and use a final classifier on the resulting multidimensional categories. Being relatively new to ML, I don't know whether I am completely off track.


Posted 2017-05-09T08:33:11.113

Reputation: 93

What you are doing is a pretty good approach. How many samples do you have in your dataset? – Kasra Manshaei – 2017-05-09T09:11:18.457

I have about 500 000 time series (recall that each time series is 8 timestamps * 40 dimensions) – AugBar – 2017-05-09T09:17:17.947

Have you tried just using the 320 features raw? 320 features is not a lot for 500,000 samples – Jan van der Vegt – 2017-05-09T09:41:51.850

@Jan van der Vegt: I have tried that method using a neural network, but the results were not very convincing. I used the raw data without any pre-processing. What operations should I apply beforehand to my 320 raw features before feeding them to the classifier? – AugBar – 2017-05-09T10:50:59.167

In the case of a neural network, normalizing your input is important; depending on the range of your features, that might matter. But I would just feed the raw features into an RF and see how well that works. It requires less tuning to see if you can get anything out of it easily – Jan van der Vegt – 2017-05-09T11:05:46.040



You're on the right track. Look at calculating a few more features, in both the time and frequency domains. As long as the number of samples is much larger than the number of features, you aren't likely to overfit. Is there any literature on a similar problem? If so, that always provides a great starting point.
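As an illustration of frequency-domain features, here is a minimal sketch using NumPy's real FFT. The function name `spectral_features` and the choice of features (total spectral power and dominant non-zero frequency bin per dimension) are assumptions made for the example, not something prescribed above.

```python
import numpy as np

def spectral_features(series):
    """Map (n_samples, n_steps, n_dims) to (n_samples, 2 * n_dims)."""
    # Remove the per-dimension mean so the zero-frequency bin does not dominate.
    centered = series - series.mean(axis=1, keepdims=True)
    # Power spectrum along the time axis: n_steps // 2 + 1 frequency bins.
    power = np.abs(np.fft.rfft(centered, axis=1)) ** 2
    total_power = power.sum(axis=1)
    # Index of the strongest non-zero frequency bin, per dimension.
    dominant_bin = power[:, 1:, :].argmax(axis=1) + 1
    return np.concatenate([total_power, dominant_bin], axis=1)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8, 40))  # synthetic stand-in for the real data
freq_features = spectral_features(X)
print(freq_features.shape)  # (100, 80)
```

With only 8 time points there are just 5 frequency bins per dimension, so the spectrum is coarse; it may still separate slow trends from fast oscillations.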

Try a boosted tree classifier, like xgboost or LightGBM. Their hyperparameters tend to be easier to tune, and they give good results with default parameters. Both random forest and boosted tree classifiers can return feature importances, so you can see which features are relevant to the problem. You can also try removing features to check for any covariance between them.
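One way to look for covarying features before removing any is to list strongly correlated pairs. The sketch below uses plain NumPy on synthetic data; the function name `correlated_pairs` and the 0.95 threshold are arbitrary assumptions for the example.

```python
import numpy as np

def correlated_pairs(features, threshold=0.95):
    """Return (i, j) column pairs whose absolute Pearson correlation exceeds threshold."""
    # rowvar=False: columns are features, rows are samples.
    # Note: a constant column would produce NaN correlations and is never reported.
    corr = np.corrcoef(features, rowvar=False)
    rows, cols = np.triu_indices(corr.shape[0], k=1)  # each pair once, no diagonal
    mask = np.abs(corr[rows, cols]) > threshold
    return list(zip(rows[mask].tolist(), cols[mask].tolist()))

rng = np.random.default_rng(0)
F = rng.normal(size=(1000, 5))
# Make column 4 a near-duplicate of column 0.
F[:, 4] = 2.0 * F[:, 0] + 0.01 * rng.normal(size=1000)
pairs = correlated_pairs(F)
print(pairs)  # [(0, 4)]
```

Dropping one feature from each reported pair and retraining shows whether the model was relying on redundant information.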

Most importantly though, if your results are unexpectedly poor, ensure your problem is properly defined. Manually check through your results to make sure there aren't any bugs in your pipeline.



Reputation: 326


If you're in Python, there are a couple of packages that can automatically extract hundreds or thousands of features from your time series, correlate them with your labels, choose the most significant ones, and train models for you.

Doctor J


Reputation: 183


You can add more features to your dataset as below.

  1. You can try the nolds package if your data comes from a highly nonlinear process.

  2. Add max, min, mean, skew, kurtosis and, if possible, some rolling statistics.
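The statistics in the second item could be computed along these lines. This is a sketch with NumPy only; the function names `moment_features` and `rolling_mean` and the window size of 3 are assumptions, and with only 8 points per series the skew and kurtosis estimates will be noisy.

```python
import numpy as np

def moment_features(series):
    """Map (n_samples, n_steps, n_dims) to (n_samples, 5 * n_dims):
    max, min, mean, skew, excess kurtosis per dimension."""
    mean = series.mean(axis=1, keepdims=True)
    std = series.std(axis=1, keepdims=True)
    std = np.where(std == 0, 1.0, std)  # avoid division by zero on flat series
    z = (series - mean) / std
    skew = (z ** 3).mean(axis=1)
    kurt = (z ** 4).mean(axis=1) - 3.0  # excess kurtosis (0 for a normal distribution)
    return np.concatenate(
        [series.max(axis=1), series.min(axis=1), mean[:, 0, :], skew, kurt], axis=1
    )

def rolling_mean(series, window=3):
    """Rolling mean along the time axis: (n, n_steps - window + 1, n_dims)."""
    # sliding_window_view appends the window as the last axis (NumPy >= 1.20).
    windows = np.lib.stride_tricks.sliding_window_view(series, window, axis=1)
    return windows.mean(axis=-1)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8, 40))  # synthetic stand-in for the real data
print(moment_features(X).shape)  # (100, 200)
print(rolling_mean(X).shape)     # (100, 6, 40)
```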

I am working on something similar, and I asked a related question.

Anurag Upadhyaya


Reputation: 183


I agree with Jan van der Vegt: scaling the inputs (e.g., to [-1, 1]) or standardizing them to N(0, 1), chosen with the activation function in mind, can be very important with neural networks. I would check the dissertation of Pichaid Varoonchotikul, “Flood forecasting using artificial neural networks”, for the ins and outs of ANNs; it has some very interesting caveats. In any case, I usually try first without either, and when the results are unsatisfactory I run trials with one or both. I am not sure it will help, but I would also check the R package TSclust and its related docs. The authors are very kind and will help you find specific models for this. They are experts in time series analysis! Good luck!
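The two input scalings mentioned above, to the range [-1, 1] and to zero mean and unit variance, can be sketched as follows for the flattened 320-feature representation. The helper names are hypothetical; the one point that matters is that the scaling parameters are fitted on the training split only and then reused on held-out data.

```python
import numpy as np

def fit_standardizer(train):
    """Per-feature mean and standard deviation from the training data."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    return mean, np.where(std == 0, 1.0, std)  # guard against constant features

def standardize(x, mean, std):
    return (x - mean) / std

def fit_minmax(train):
    """Per-feature minimum and span from the training data."""
    lo, hi = train.min(axis=0), train.max(axis=0)
    return lo, np.where(hi == lo, 1.0, hi - lo)

def minmax_scale(x, lo, span):
    """Map the training range of each feature to [-1, 1]."""
    return 2.0 * (x - lo) / span - 1.0

rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=3.0, size=(200, 8, 40))  # synthetic stand-in
X_flat = X.reshape(len(X), -1)  # 8 * 40 = 320 raw features per series
train, test = X_flat[:150], X_flat[150:]

mean, std = fit_standardizer(train)
train_std = standardize(train, mean, std)
test_std = standardize(test, mean, std)  # reuse the training statistics

lo, span = fit_minmax(train)
train_mm = minmax_scale(train, lo, span)
```

Held-out values can fall slightly outside [-1, 1] under the min-max variant, because the range comes from the training split alone.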

Rafa M. Mas


Reputation: 17