Function to calculate and plot sample variogram

7

I'm trying to create a function that calculates and plots the sample variogram for a time series. It should take two arguments: data, a time series, essentially a list of reals; k, an integer greater than 0. It should produce a ListLinePlot with $k$ on the horizontal axis and $\hat{G_k}$ on the vertical axis.

The plotting and such isn't really the problem. The problem is getting the mathematical formulas, which are below, into Mathematica formulas.

$\hat{G_k}$ is essentially a ratio of variance at lag $k$ over the variance at lag $1$:

$\hat{G_k}=\frac{s^2_k}{s^2_1},k=1,2,...$

$s^2_k=\frac{\sum_{t=1}^{n-k}(d^k_t-\bar{d}^k)^2}{n-k-1}$

$\bar{d}^k=(n-k)^{-1}\sum{}d^k_t$

$d^k_t=z_{t+k}-z_t$

$z_t$ is the time series of the length $n$.

I've been beating my head against this for a while now. Any help would be appreciated.

Edit: A variogram is statistical tool to assess whether a time series is stationary or not. For a stationary it starts out at $1$ for $k=1$ and then increases until it starts to hover around a certain value. The theory and formulas I have here is from Time Series Analysis and Forecasting by Example. You might be able to use the Search Inside at Amazon to have a look at it.

Edit2: Here is a demoplot based on dummydata as to how the end result should look for a stationary time series. Mathematica graphics

Mr Alpha

Posted 2012-02-28T18:02:10.190

Reputation: 3 046

1What's a variogram? Can you provide a link? – rcollyer – 2012-02-28T19:02:52.733

rcollyer: It's a version of a correlogram and can be directly computed with ListConvolve, ListCorrelate, et al. – whuber – 2012-02-28T19:39:55.843

As ListPlot can do the same thing as ListLinePlot, I removed the version-8 tag. – rcollyer – 2012-02-28T19:42:45.603

@rcollyer ListLinePlot was "New in 6" -- not a problem even for me ;-) – Mr.Wizard – 2012-02-28T20:38:11.433

@Mr.Wizard I swore it was new in 8. – rcollyer – 2012-02-28T22:33:38.060

@rcollyer That's OK, even I have trouble keeping versions straight sometimes... – Brett Champion – 2012-02-29T04:29:39.957

@BrettChampion that's actually comforting to know. Either way, the v.8 tag did not belong. – rcollyer – 2012-02-29T04:40:26.900

This question is practically a duplicate of http://mathematica.stackexchange.com/questions/2197/function-for-autocorrelation, which has much simpler answers.

– whuber – 2012-03-06T00:08:10.900

You have some programs to calculate variogram of a time eerie or an image, in Fractal Geography by A. Dauphiné, 2012, Wiley.

– Dauphine – 2012-02-29T07:04:55.020

Answers

4

Lag differences are not the same as second-differencing etc, so Differences is not the right approach.

test2 = Array[f, 10]

In[23]:= Differences[test2, 3]

Out[23]= {-f[1] + 3 f[2] - 3 f[3] + f[4], -f[2] + 3 f[3] - 3 f[4] + f[5], -f[3] + 3 f[4] - 3 f[5] + f[6], -f[4] + 3 f[5] - 3 f[6] + f[7], -f[5] + 3 f[6] - 3 f[7] + f[8], -f[6] + 3 f[7] - 3 f[8] + f[9], -f[7] + 3 f[8] - 3 f[9] + f[10]}

Building on kguler's answer, here is an alternative way of getting lag-differences. I have written it up as a separate function because it has more general utility than this specific application.

lagdif[l_List, k_Integer?Positive] /; k < Length[l] := 
  Drop[l, k] - Drop[l, -k]

We can then use a similar approach:

dif[list_] := Table[lagdif[list, k], {k, Length[list] - 1}];
dbar[list_] := Mean /@ dif[list];
var[list_] := Variance /@ (Most@(dif[list]));
variogram[list_] := ((Rest@#)/First@#) &@var[list];

Testing with I(1) data (i.e. non-stationary with a single unit root):

$y_t = y_{t-1} + \epsilon_t$

data = Accumulate[RandomReal[{-1, 1}, 20]];

GraphicsRow[{ListLinePlot[data], ListLinePlot[variogram[data]]}]

enter image description here

This makes me wonder how powerful the variogram is for data with shocks that have bounded support.

Verbeia

Posted 2012-02-28T18:02:10.190

Reputation: 33 191

4

Does the following direct implementation of your expressions work for you?

  ClearAll[data, dif, dbar, var, variogram];
  dif[list_] :=  Table[(list[[k + 1 ;;]] - list[[;; -(k+1)]]), 
  {k, Length[list] - 1}]
  var[list_] := Variance /@ (Most@(dif[list]));
  variogram[list_] := (#/First@#) &@var[list];

usage

 data = Accumulate[RandomReal[{-1, 1}, 100]];
 GraphicsRow[{ListLinePlot[data], ListLinePlot[variogram[data]]}]

enter image description here

Stationary data:

 data2 = RandomReal[{-1, 1}, 100];
 GraphicsRow[{ListLinePlot[data2], 
 ListLinePlot[Reverse@variogram[data2]]}]

enter image description here

kglr

Posted 2012-02-28T18:02:10.190

Reputation: 302 076

This is not quite right. I don't think Differences works correctly for this application for differences bigger than 1. – Mr Alpha – 2012-02-28T21:18:23.160

@MrAlpha, you are right. I changed the function to calculate the differences. Hope this one works. – kglr – 2012-02-28T23:00:58.120

2

My literal implementation taken directly from the equations is below:

ClearAll[gk];
gk[z_,k_]:=Module[{n,s2k,dk,s21,d1},
  n = Length@z;
  dk[t_] := z[[t+k]]-z[[t]];
  s2k = Sum[(dk[t]-Sum[dk[t],{t,n-k}]/(n-k))^2,{t,n-k}]/(n-k-1);
  d1[t_] := z[[t+1]]-z[[t]];
  s21 = Sum[(d1[t]-Sum[d1[t],{t,n-1}]/(n-1))^2,{t,n-1}]/(n-2);
  s2k/s21
]

and this will generate an example plot

With[{z = RandomReal[{0, 1}, 100]},
  ListLinePlot[{gk[z, #] & /@ Range[98], z}]
]

enter image description here

Note that in the third equation, I am assuming the sum is from 1 to n - k.

Daniel Flatin

Posted 2012-02-28T18:02:10.190

Reputation: 451

1

I don't know if it is OK to answer your own question, but I thought I would give an answer as to what I ultimately ended up with. This is essentially based on Verbeia's answer.

I've added a option for how many lags to calculate since calculating the variance for lags up to $n-1$ doesn't make much sense, since the number of data points you have decrease with the size of the lag. This leads to the estimate of the variance being uncertain.

lagdif[l_List, k_Integer?Positive] /; k < Length[l] := 
 Drop[l, k] - Drop[l, -k]
variogram[list_, 
   k_] := (#/First@#) &@(Variance /@ Table[lagdif[list, i], {i, k}]);
plotvariogram[list_, k_] := 
 GraphicsRow[{ListPlot[list, Joined -> True, 
    AxesLabel -> {"t", "\!\(\*SubscriptBox[\(X\), \(t\)]\)"}], 
   ListPlot[variogram[list, k], AxesOrigin -> {0, 0}, 
    AxesLabel -> {"k", 
      "\!\(\*SubscriptBox[OverscriptBox[\(G\), \(^\)], \(k\)]\)"}, 
    Joined -> True]}]

Here are some example outputs:

With NIID data Mathematica graphics

With AR(2) data Mathematica graphics

With a random walk data Mathematica graphics

Mr Alpha

Posted 2012-02-28T18:02:10.190

Reputation: 3 046