I am interested in clustering $N$ time series of $T$ 'values' each. These values are distributions, which can be represented by their cumulative distribution functions (cdfs), their probability density functions (pdfs), or more convenient forms such as square-root pdfs, which yield a simple spherical geometry.
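To make the square-root representation concrete, here is a minimal sketch, assuming each distribution has been discretized into a histogram over a common set of bins (that discretization, and the function names `sqrt_pdf` and `geodesic_distance`, are illustrative assumptions, not part of any library). The square root of a discrete pdf has unit Euclidean norm, so it lies on the unit sphere, and the angle $\arccos \langle \sqrt{p}, \sqrt{q} \rangle$ is a natural distance there.

```python
import numpy as np


def sqrt_pdf(p):
    """Map a discrete pdf (nonnegative bin weights) to a point on the unit sphere."""
    p = np.asarray(p, dtype=float)
    p = p / p.sum()          # normalize so the weights sum to 1
    return np.sqrt(p)        # sum(sqrt(p)**2) == 1, i.e. unit Euclidean norm


def geodesic_distance(p, q):
    """Angle arccos<sqrt(p), sqrt(q)> between two discrete pdfs on the sphere."""
    inner = np.dot(sqrt_pdf(p), sqrt_pdf(q))
    # Clip to guard against rounding pushing the inner product slightly above 1.
    return float(np.arccos(np.clip(inner, -1.0, 1.0)))


if __name__ == "__main__":
    # Two histograms with disjoint support are at the maximal angle pi/2.
    print(geodesic_distance([1.0, 0.0], [0.0, 1.0]))
    print(geodesic_distance([0.5, 0.5], [0.5, 0.5]))
```

This only compares two distributions at a single time step; it says nothing yet about dynamics.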

For comparing given distributions, there is an extensive literature on statistical distances (KL, Hellinger, Wasserstein, and so on). For comparing given time series of distributions, however, I am not sure whether there is any literature at all.

Such distances should somehow take into account information about the dynamics, besides the proximity of the distributions at each time $t$. Ideally, I would like to have a kind of information factorization similar to this result.

I am wondering whether such distances already exist and whether this kind of problem has already been formulated in the literature.

-- edit for further details and in answer to comments:

Thanks for your answer, but dynamic time warping (DTW) does not suit my needs. This dynamic programming technique only captures a rough similarity of shapes by allowing non-linear time distortion. It does not account for all the information in these time series: for example, what about the distribution of distortions? Do the distributions of a given time series vary smoothly through time or violently? DTW is not always the solution. For instance, when working with random walks, it does not make sense to use DTW since there are no time patterns! In this case, the only information is "correlation" and "distribution" (cf. Sklar's theorem in copula theory, and the paper cited above).
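As a rough illustration of the "smoothly or violently" question, here is a minimal sketch that measures how far a series of distributions moves between consecutive time steps. It assumes each series is stored as a $T \times K$ array of histograms over $K$ common bins and uses the angle between square-root pdfs as the step length. The function name `step_lengths` and the array layout are assumptions made for this example, and this is only a crude descriptive summary, not the distance I am asking for.

```python
import numpy as np


def step_lengths(series):
    """Angles between consecutive distributions of one time series.

    series: array of shape (T, K); row t is a histogram over K common bins.
    Returns an array of T - 1 values in [0, pi/2].
    """
    series = np.asarray(series, dtype=float)
    # Normalize each row to a pdf, then take square roots (points on the unit sphere).
    roots = np.sqrt(series / series.sum(axis=1, keepdims=True))
    # Inner product between each time step and the next one.
    inner = np.sum(roots[:-1] * roots[1:], axis=1)
    return np.arccos(np.clip(inner, -1.0, 1.0))


if __name__ == "__main__":
    constant = np.tile([0.2, 0.3, 0.5], (4, 1))      # never moves
    jumpy = np.array([[1.0, 0.0], [0.0, 1.0]] * 2)   # flips support at each step
    print(step_lengths(constant))
    print(step_lengths(jumpy))
```

The mean and spread of these step lengths give one possible summary of how smoothly a series evolves, which is exactly the kind of information DTW on the raw shapes does not expose.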

-- edit 2 Here are the papers that are somehow related to my question:

- Predicting the Future Behavior of a Time-Varying Probability Distribution
- Clustering on the unit hypersphere using von Mises-Fisher distributions
- Unsupervised clustering of multidimensional distributions using earth mover distance
- Hilbert space embeddings of conditional distributions with applications to dynamical systems

Check my answer on DTW clustering of time series. – Aleksandr Blekh – 2015-06-12T07:59:24.580

Basically, I have a time series of time series. Let's assume that at time $t$ I use DTW (though I would rather use $\phi = \arccos \langle p,q \rangle$): how do I extend it to the whole time series? This is really my point. – mic – 2015-06-12T08:36:28.710

You're welcome. My advice does not imply that I think DTW is a universal solution. I just thought that the papers referenced in my linked answer potentially might contain some ideas useful to your case. I don't have an answer for your "time series of time series" case. As for analyzing distortions, you could consider applying time series anomaly detection and analysis approaches. – Aleksandr Blekh – 2015-06-12T09:05:50.133

Have you tried mutual information? – Alexandru Daia – 2015-06-12T06:05:08.283

For measuring dependency between variables, I prefer using copulae, [though mutual information and copula are very much the same]{http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=6077935}! Yet, in my case, dependency is not the only information I care about in this kind of time series. In fact, I wish I could obtain [a result similar to this one]{http://arxiv.org/pdf/1506.00976v1.pdf}. – mic – 2015-06-12T06:33:56.677

Here's an idea from way out of left field: a time series of pdfs can be thought of as a solution to a Fokker-Planck-type PDE (yes/no/maybe?). Would it be feasible to fit such a PDE to your samples and then cluster the PDEs' coefficients? – alexandre iolov – 2015-11-19T08:24:35.733