Multidimensional Dynamic Time Warping Implementation in Python - confirm?


I believe I've implemented MDTW in Python below, but I'm not sure whether I did it correctly. The results seem intuitive. Can someone look at this code and tell me if you see anything wrong?

A lot of the papers I've read on the subject go over my head, but the idea here is simple: whereas one-dimensional dynamic time warping compares two vectors by taking the squared error between individual values, this version takes the sum of squared errors across all columns of matrices s1 and s2 at each pair of time points inside a given window w.

I use pandas' Panel, which is essentially a 3D version of a DataFrame.

Also, once this works, I want to cluster some multidimensional time series. Thoughts on which clustering algorithm to run? k-means? Hierarchical? I'll start by building a dendrogram at least.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

def MDTWDistance(s1, s2, window=10, num_columns=1):
    DTW = {}  # accumulated-cost matrix, stored as a dict keyed by (i, j)

    w = max(window, abs(len(s1)-len(s2)))

    # Initialize the border of the cost matrix to infinity
    for i in range(-1, len(s1)):
        for j in range(-1, len(s2)):
            DTW[(i, j)] = float('inf')
    DTW[(-1, -1)] = 0

    for i in range(len(s1)):
        for j in range(max(0, i-w), min(len(s2), i+w)):
            dist = mdist(s1.loc[i], s2.loc[j], num_columns)
            DTW[(i, j)] = dist + min(DTW[(i-1, j)], DTW[(i, j-1)], DTW[(i-1, j-1)])

    return np.sqrt(DTW[len(s1)-1, len(s2)-1])

def mdist(a, b, num_col):
    # Squared Euclidean distance between the two column vectors
    dist = 0
    for col in range(num_col):
        dist = dist + (a[col]-b[col])**2
    return dist
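As an aside, the per-column loop in `mdist` can be vectorized with NumPy. This is a sketch of an equivalent (the same sum of squared differences, no square root), assuming the rows are numeric; `mdist_vec` is a hypothetical name, not part of the original code:

```python
import numpy as np

def mdist_vec(a, b):
    # Sum of squared differences across all columns -- equivalent to
    # mdist(a, b, num_col) when a and b each have num_col entries.
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.sum((a - b) ** 2))

print(mdist_vec([1.0, 2.0], [3.0, 4.0]))  # (1-3)^2 + (2-4)^2 = 8.0
```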




# ts1..ts6 are the example series plotted below (definitions omitted here)
timeSeries = pd.Panel({0: pd.DataFrame(np.transpose([ts1, ts2])),
                       1: pd.DataFrame(np.transpose([ts3, ts4])),
                       2: pd.DataFrame(np.transpose([ts5, ts6]))})
print("0 and 1:", MDTWDistance(timeSeries[0], timeSeries[1], window=10, num_columns=2))
print("0 and 2:", MDTWDistance(timeSeries[0], timeSeries[2], window=10, num_columns=2))
print("1 and 2:", MDTWDistance(timeSeries[1], timeSeries[2], window=10, num_columns=2))

The output is as follows:

0 and 1: 89.354619036
0 and 2: 58.8268328591
1 and 2: 133.434513377

With the graph:

[plot of the three example multidimensional time series]

EDIT: I computed the distances using the dtw package in R, with the following code:

import rpy2.robjects.numpy2ri
from rpy2.robjects.packages import importr
rpy2.robjects.numpy2ri.activate()  # enable automatic numpy <-> R conversion

# Set up our R namespaces
R = rpy2.robjects.r
DTW = importr('dtw')

# Generate our data
idx = np.linspace(0, 2*np.pi, 100)
template = np.cos(idx)
query = np.sin(idx) + np.array(R.runif(100))/10

for i, j in ([0, 1], [0, 2], [1, 2]):
    print("From", i, "to", j,
          R.dtw(np.transpose(np.array(timeSeries[i])),
                np.transpose(np.array(timeSeries[j])),
                keep=True).rx('distance')[0][0])

which output:

From 0 to 1 186.623310713
From 0 to 2 119.769089068
From 1 to 2 272.849560995

So these numbers are about double the distances I get... I'm not far off, but there's something wrong with what I'm doing. Halp.


Posted 2017-05-26T16:41:32.213


Did you manage to find what was causing the difference between your method and the R implementation? I just finished implementing my own multivariate DTW distance and got results very close to yours (89.378 for 0 and 1, 59.01 for 0 and 2 and 133.43 for 1 and 2). I guess our results are still usable for time series comparison since they seem to be homothetic to the R implementation, but this still bugs me. Last thing: how fast is your method? I will need to use mine a large number of times, and it runs in about 60 ms for a window of 10. Not sure if this will scale. Thanks for your post! – Toussaint Behaghel – 2017-09-15T15:09:01.883

I never figured it out, sadly. Mine is about as fast as yours but can certainly be made more efficient. I was going to use this on large data sets but ended up just flattening the time series to do analysis. Not particularly exciting. The R package could be wrong, too. Please let me know if you figure yours out. – pennydreams – 2017-09-23T02:49:45.357



Could the difference be caused by the step pattern you are using? dtw in R defaults to the symmetric2 step pattern. Toni Giorgino's paper explains that step pattern as follows:

symmetric2 is normalizable, symmetric, with no local slope constraints. Since one diagonal step costs as much as the two equivalent steps along the sides, it can be normalized dividing by $N+M$ (query+reference lengths).

I thought this might be the problem after reading Comparing Dynamic Time Warping in R and Python.

If your problem is the same, try adding step=symmetric1 to your R.dtw arguments.
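To see why symmetric2 roughly doubles the accumulated distance when the warping path is mostly diagonal, here is a minimal pure-NumPy sketch of the two step patterns on scalar series with absolute-difference local cost (the boundary handling is simplified relative to the R package, so treat the exact values as illustrative):

```python
import numpy as np

def dtw_cost(x, y, pattern="symmetric1"):
    # Accumulated-cost DTW with two step patterns:
    #   symmetric1: every step adds the local cost d once
    #   symmetric2: the diagonal step adds 2*d (hence normalizable by N+M)
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            if pattern == "symmetric1":
                D[i, j] = d + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
            else:  # symmetric2
                D[i, j] = min(D[i - 1, j] + d,
                              D[i, j - 1] + d,
                              D[i - 1, j - 1] + 2 * d)
    return D[n, m]

x, y = [0, 1, 2, 3], [0, 1, 2, 4]
print(dtw_cost(x, y, "symmetric1"))  # 1.0: one diagonal mismatch of size 1
print(dtw_cost(x, y, "symmetric2"))  # 2.0: the same mismatch counted twice
```

On a near-diagonal alignment like this one, symmetric2 is exactly twice symmetric1, which matches the roughly 2× gap between the Python and R results above.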



Thanks for sharing the link to the blog post. I added step_pattern = R.symmetric1 in my R.dtw arguments as outlined in the post, but it threw an error:

RRuntimeError: Error in globalCostMatrix(lm, step.matrix = dir, window.function = wfun, : step.matrix is no stepMatrix object

Not sure why. I'll have to come back to it later – pennydreams – 2017-12-02T16:42:21.717


The code shown here is a dynamic-programming implementation of DTW used to measure similarity between time series. There is a faster approximate variant, FastDTW, which reduces time and memory complexity to $O(n)$.

You will see a significant improvement over the original DTW, although the measured distance is directly affected by the local distance measure used (usually the Euclidean distance, as shown here). Other measures may give better results depending on the nature of the time series.
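FastDTW itself (the fastdtw package on PyPI) gets its speedup from multiresolution coarsening, which is more involved. As a simpler, exact illustration of reducing memory (not time), the classic recursion only ever needs the previous row of the cost matrix, so it can run in $O(m)$ memory instead of $O(nm)$; this sketch uses symmetric1 steps with absolute-difference cost:

```python
import numpy as np

def dtw_two_rows(x, y):
    # Exact DTW (symmetric1 steps, absolute-difference local cost) that
    # keeps only two rows of the accumulated-cost matrix:
    # O(len(y)) memory instead of O(len(x) * len(y)).
    n, m = len(x), len(y)
    prev = np.full(m + 1, np.inf)
    prev[0] = 0.0
    for i in range(1, n + 1):
        curr = np.full(m + 1, np.inf)
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            curr[j] = d + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]

print(dtw_two_rows([0, 1, 2, 3], [0, 1, 2, 3]))  # 0.0: identical series
```

The time complexity is unchanged; only the memory footprint shrinks, which already helps when comparing many long series pairwise.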

Miguel Miron
