How to classify and cluster this time series data

4

5

I already posted a question a few months ago about the project I'm starting to work on. That post can be seen here: Human activity recognition using smartphone data set problem

Now I know the project is based on multivariate time series analysis, and the tasks are to classify and cluster the data. I have gathered some materials (e-books, tutorials, etc.) on this, but I still can't see a detailed picture of how I should even start. Here's a tutorial that looks like it might be helpful, but my data looks different and I'm not sure whether it can be applied to my work.

http://little-book-of-r-for-multivariate-analysis.readthedocs.org/en/latest/src/multivariateanalysis.html#scatterplots-of-the-principal-components

So basically, my questions are:

How can I start on some very basic analysis? How should I read the data so that it has any meaning for me? Any tips and advice will be much appreciated! Note: I'm just a beginner in data science.

Jakubee

Posted 2014-09-28T12:51:43.823

Reputation: 401

Question was closed 2021-02-12T14:22:38.760

1 Can you repost the important details in this question? e.g. what's the data look like, where'd it come from, is it labeled, etc. – gallamine – 2014-10-01T14:43:41.097

Answers

4

I have shared a number of resources on time series classification and clustering in one of my recent answers here on Data Science StackExchange: https://datascience.stackexchange.com/a/3764/2452. Hopefully, you will find them relevant to this question and useful.

Aleksandr Blekh

Posted 2014-09-28T12:51:43.823

Reputation: 6 438

1 You are a blessing in life. Really beautifully written. I'm working on something similar and this really helped me a lot. – Hardik Gupta – 2018-02-15T07:01:47.897

2 @Hardikgupta Thank you for the kind words. I'm happy that my answers are helpful. – Aleksandr Blekh – 2018-02-15T07:27:13.897

2

How I can start on some very basic analysis?

Take your labeled data and compute a histogram of the values for each labeled set. Plot these and check visually whether there are any differences between them. Also compute the mean and variance of each labeled set and see whether those differ.
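A minimal sketch of that first step in Python, in case it helps. The data here is synthetic and the label names ("sitting", "walking") are made up for illustration; `summarize_by_label` is a hypothetical helper, not part of any library. It uses shared bin edges so the histograms of different labels can be compared directly.

```python
import numpy as np

def summarize_by_label(values, labels, bins=10):
    """Return shared bin edges plus per-label mean, variance and histogram counts."""
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels)
    # Shared bin edges, so histograms of different labels are directly comparable.
    edges = np.histogram_bin_edges(values, bins=bins)
    summary = {}
    for label in np.unique(labels):
        subset = values[labels == label]
        counts, _ = np.histogram(subset, bins=edges)
        summary[str(label)] = {
            "mean": float(subset.mean()),
            "variance": float(subset.var(ddof=1)),  # sample variance
            "hist": counts,
        }
    return edges, summary

# Synthetic stand-in for one sensor channel (an assumption, not the real data set).
rng = np.random.default_rng(0)
values = np.concatenate([rng.normal(0.0, 0.1, 500), rng.normal(0.0, 1.0, 500)])
labels = np.array(["sitting"] * 500 + ["walking"] * 500)

edges, summary = summarize_by_label(values, labels)
for label, stats in summary.items():
    print(label, "mean:", round(stats["mean"], 3), "variance:", round(stats["variance"], 3))
    print("  histogram counts:", stats["hist"])
```

The counts can be passed to any plotting tool (for example a bar chart per label) for the visual comparison. With real multivariate data you would presumably repeat this for each channel.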

If it's time series data, take small (overlapping) windows of time, compute various metrics for each window (min, max, variance, mean, for instance), and use those as input features for a classifier.
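The windowing idea could be sketched roughly like this. Everything specific here is an assumption: the signals are synthetic, the window length (100 samples) and step (50 samples, i.e. 50% overlap) are arbitrary, and the hand-written nearest-centroid classifier is only a stand-in for whatever classifier you actually choose.

```python
import numpy as np

def window_features(signal, window, step):
    """Slide a window over a 1-D signal; return [min, max, mean, variance] per window."""
    signal = np.asarray(signal, dtype=float)
    starts = range(0, len(signal) - window + 1, step)
    return np.array([
        [w.min(), w.max(), w.mean(), w.var()]
        for w in (signal[s:s + window] for s in starts)
    ])

def fit_centroids(X, y):
    """Nearest-centroid 'training': one mean feature vector per class."""
    classes = np.unique(y)
    return classes, np.array([X[y == c].mean(axis=0) for c in classes])

def predict(X, classes, centroids):
    """Assign each row of X to the class with the closest centroid."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]

def make_dataset(rng, n=1000, window=100, step=50):
    """Synthetic recordings: class 0 is low-variance noise, class 1 is high-variance noise."""
    still = rng.normal(0.0, 0.05, n)
    moving = rng.normal(0.0, 1.0, n)
    X_still = window_features(still, window, step)
    X_moving = window_features(moving, window, step)
    X = np.vstack([X_still, X_moving])
    y = np.array([0] * len(X_still) + [1] * len(X_moving))
    return X, y

rng = np.random.default_rng(0)
# Separate recordings for training and testing, so overlapping windows
# never appear on both sides of the split.
X_train, y_train = make_dataset(rng)
X_test, y_test = make_dataset(rng)

classes, centroids = fit_centroids(X_train, y_train)
accuracy = float((predict(X_test, classes, centroids) == y_test).mean())
print("feature matrix shape:", X_train.shape)
print("test accuracy:", accuracy)
```

One thing to watch with overlapping windows: neighbouring windows share samples, so splitting windows from the same recording randomly into train and test sets tends to overstate accuracy. Splitting by recording or by time avoids that.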

gallamine

Posted 2014-09-28T12:51:43.823

Reputation: 408