## How to formally tell if one time series affects another?

10

12

In time series analysis, a correlogram, also known as an autocorrelation plot, is a plot of the sample autocorrelations. I'm rolling my own now and I'm not a statistician, but I think this sort of thing should probably be built in, perhaps I just haven't found it?

Problem: I have two time series and am trying to see if one has any affect on the other. My current solution is to calculate the time indices of outlier events (peaks and valleys) in both series and then plot a histogram of all the differences between those times, and then manually looking for spikes at any specific time lags.

Question: How to best do this with Mathematica v10's new TimeSeries[] and Statistics functionalities?

How about CorrelationFunction? Put the same sequence as both arguments and it is autocorrelation. – bill s – 2015-02-02T16:07:00.160

15

Alex Isakov has a Granger Causality Test in his Economica Time Series package here:-

Mathematica Package Repository - Economica

I'm not very familiar with the details, but I ran some tests using data from here:-

Dave Giles' Blog - Testing for Granger Causality

I downloaded the example data from the Data page. Here it is stored as QR codes.

qrimage = BarcodeImage[Compress[datalist]];


Dates

Arabica data

Robusta data

{dates, arabica, robusta} = Map[
Uncompress@BarcodeRecognize@Import[#] &,
{"http://i.stack.imgur.com/KdrUd.png",
"http://i.stack.imgur.com/hCbZV.png",
"http://i.stack.imgur.com/J9ESY.png"}];

arabica = Last /@
Select[Transpose[{DateList /@ dates, arabica}], #[[1, 1]] > 1975 &];
{dates, robusta} = Transpose[
Select[Transpose[{DateList /@ dates, robusta}], #[[1, 1]] > 1975 &]];

DateListPlot[{Transpose[{dates, arabica}],
Transpose[{dates, robusta}]}, Joined -> True]


Alex Isakov's Granger Causality function

getLags[data_, i_] := Drop[data, -i]

GrangerCausalityTest[dat_, lag_: 3] := Module[
{xx, x, y, laggeddata, le, res1, res2, grangerstat},
le = Select[dat, (And @@ (NumericQ /@ #)) &];
xx = Flatten[Most /@ le]; y = Last /@ le;
laggeddata = Transpose[Map[Take[#, -Length[y] + lag] &,
Join[Table[getLags[y, i], {i, lag}],
Table[getLags[xx, i], {i, lag}], {y}]]];
res1 = Total[LinearModelFit[laggeddata,
Table[Subscript[x, i],
{i, Length[First[laggeddata]] - 1}],
Table[Subscript[x, i],
{i, Length[First[laggeddata]] - 1}]]["FitResiduals"]^2];
res2 = Total[LinearModelFit[laggeddata,
Table[Subscript[x, i],
{i, (Length[First[laggeddata]] - 1)/2}],
Table[Subscript[x, i],
{i, Length[First[laggeddata]] - 1}]]["FitResiduals"]^2];
grangerstat = ((res2 - res1)/lag)/(res1/(Length[laggeddata] - 2*lag - 1));
1 - CDF[FRatioDistribution[lag, Length[laggeddata] - 2*lag - 1],
grangerstat]];


Checking a range of lag intervals.

a = Array[GrangerCausalityTest[Transpose[{robusta, arabica}], #] &, 10];
b = Array[GrangerCausalityTest[Transpose[{arabica, robusta}], #] &, 10];
ListLinePlot[{a, b}]


GrangerCausalityTest@Transpose[{arabica, robusta}]


0.127273

indicating low significance of null hypothesis.

GrangerCausalityTest@Transpose[{robusta, arabica}]


0.8631

Quoting from Dave Giles' page:

"In summary, we have reasonable evidence of Granger causality from the price of Arabica coffee to the price of Robusta coffee, but not vice versa."

Check

Using Free Statistics Software with the coffee data, plotting the results for the first ten lags.

ListLinePlot[{
{0.4541, 0.8248, 0.0696, 0.0454, 0.061,
0.0445, 0.0201, 0.0052, 0.0039, 0.0307},
{0.2605, 0.3755, 0.7118, 0.4239, 0.3455,
0.2779, 0.1233, 0.0361, 0.0055, 0.0172},
ConstantArray[0.05, 10]}, AxesOrigin -> {0, 0},
Epilog -> Inset["P-value 0.05", {1.5, 0.08}]]


The results indicate a rejection of the hypothesis that the price of Arabica coffee does not affect the price of Robusta coffee.

The shape of the graph is similar to that produced by Alex's function, but clearly there are differences.

The results from Alex's function match exactly the results from here:

MathGroup archive: implementation of Granger causality tests in Mathematica

dated Jan 2006, by Darren Glosemeyer, Wolfram Research.

(Runs in version 7 but not 10.4, requiring legacy Statistics package.)

I've installed the package, and the help for GrangerCausalityTest gives '?GrangerCausalityTest'

'GrangerCausalityTest[dat, lag] returns a p-value for the null hypothesis that the first element in dat does not cause the second element in dat.' Thus I think that the above interpretation of the result is not correct – sam84 – 2016-05-23T16:11:36.357

@sam84 It appears that I should have been looking at the p-value in the 3rd lag (the default) to match Professor Giles' conclusion. The significance is not as emphatic as in Giles' method, but the calculations appear to be rather different even though using the same data. – Chris Degnen – 2016-05-25T15:59:05.397

4

I've installed the package, ?GrangerCausalityTest' gives

GrangerCausalityTest[dat, lag] returns a p-value for the null hypothesis that the first element in dat does not cause the second element in dat.

We remind the definition of the p-value (from Wikipedia):

the p-value is widely used in statistical hypothesis testing, specifically in null hypothesis significance testing. In this method, as part of experimental design, before performing the experiment, one first chooses a model (the null hypothesis) and a threshold value for p, called the significance level of the test, traditionally 5% or 1% and denoted as α. If the p-value is less than or equal to the chosen significance level (α), the test suggests that the observed data is inconsistent with the null hypothesis, so the null hypothesis must be rejected. However, that does not prove that the tested hypothesis is true. Nonetheless, it helps to clarify that p-values should not be confused with probability on hypothesis (as is done in Bayesian Hypothesis Testing) such as Pr(H|X), the probability of the hypothesis given the data, or Pr(H), the probability of the hypothesis being true, or Pr(X), the probability of observing the given data.

I tested with synthetic data the above function. Let's set α=5%. Therefore, if the output of GrangerCausalityTest[dat, lag] is <0.05, then we reject the hypothesis that the first element in dat does not cause the second element in dat. On the other hand if the output is > 0.05, then the we cannot reject the hypothesis that the first element in dat does not cause the second element in dat.

First I've generated two time-series (ts1,ts2), with unilateral causation ts1 -> ts2, i.e. the first time series causes the second one, but not viceversa.

ts1=Range[1,100];
ts2=Table[x + RandomReal[{-x/3, x/3}], {x, ts1}];


Applying GrangerCausality Test on these data, I found:

In[1]:= GrangerCausalityTest[Transpose[{ts2, ts1}], 1]

Out[1]= 1

In[2]:= GrangerCausalityTest[Transpose[{ts1, ts2}], 1]

Out[2]= 1.11022*10^-15

Thus, as the help function explains, the results should be interpreted as

The null hypothesis that ts2 does not cause ts1 is not rejected. Indeed, this is consistent with our generated data, as we have impose an unilateral causation ts1 -> ts2

The null hypothesis that ts1 does not cause ts2 is rejected. Indeed we know that ts1->ts2.

Therefore, with respect the coffee example above, the correct interpretation of the result is

The Granger causality test gives no evidence of causality from the price of Arabica coffee to the price of Robusta coffee, and vice versa.

I have updated my post in view of your observation. Thanks. I have also added the data if you would like to try it out. I think Professor Giles' conclusion, that "we have reasonable evidence of Granger causality from the price of Arabica coffee to the price of Robusta coffee", must be based on the 3rd lag. I'd be most interested in your comments. – Chris Degnen – 2016-05-25T16:03:14.417

@ChrisDegnen I agree. Looking at the 3rd lag the test is in agreement with prof. Giles' conclusion. The "reasonable evidence" should be more appropriately "evidence for significance level α=0.13 ". – sam84 – 2016-05-26T21:02:23.420