## Time Series Analysis - How does it Difference?

In languages other than Mathematica (especially R), if you want to do any time series analysis, you generally have to difference the series yourself, according to some feature of the series, in order to satisfy at least weak stationarity. This is especially relevant when you need to remove some kind of trend.

Here's some code that generates a time series, applies an arbitrary trend, and does a quick model fit. It seems to work, and if you check out the plot, it seems to be applying the trend correctly.

monthlyObservations = TimeSeries[
WeatherData["KORD",
"Temperature", {{2008, 1, 1}, {2014, 12, 31}, "Month"}]];
TimeSeries[
Range[10, 20 - 10/84, 10/84]}]
];


It does appear to difference the data to remove the trend, fit a model, and then forecast correctly. My question is: how can I see what type of differencing it is using? Is there any way to control the type/size of the differencing (i.e., a 1-day difference versus more days)?

I've scoured the documentation, and perhaps I've missed it, but I haven't yet found a spot where Wolfram explains exactly what kind of magic differencing technique they are using here. There are a few other StackExchange threads where this is mentioned, but nothing about how to see what it's doing. Is there some parameter I can look at to see the technique? Thanks!

You posted what appears to be incomplete code, but if I'm interpreting it correctly, you fit a model with TimeSeriesModelFit and it returned a model, which you then used to create a forecast, like this:

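monthlyObservations =
TimeSeries[
WeatherData["KORD",
"Temperature", {{2008, 1, 1}, {2014, 12, 31}, "Month"}]];
(* Assumed reconstruction: the trend-adding and fitting lines are garbled in the post. *)
(* Add a linear ramp of 84 values (one per month) to the observed values. *)
trendAdded = TimeSeries[
QuantityMagnitude[monthlyObservations["Values"]] +
Range[10, 20 - 10/84, 10/84]];
tsm = TimeSeriesModelFit[trendAdded];

Plot[tsm[t], {t, 0, 100}]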


I should briefly mention why the plot looks strange. When you plot a time series model, it first follows the original series, which is linearly interpolated by default (you can change this), and then switches to the forecast, which is zero-order interpolated by default (the correct behavior for processes based on difference equations, such as SARIMA).

That said, you can discover the differencing being used by looking at the underlying model.

tsm["BestFit"]

(* SARIMAProcess[0.174742, {-0.322022}, 1, {}, {12,{-0.0118158}, 1, {-0.860908}},7.07733]*)


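This has both nonseasonal and seasonal difference operators of order 1: the third argument (1) is the nonseasonal differencing order, and inside the seasonal specification {12, ..., 1, ...} the 1 is the seasonal differencing order at period 12. You can control the differencing by fixing those orders when fitting the model. For example: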

TimeSeriesModelFit[trendAdded, {"SARIMA",{12,1,1}}]
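
To make "order 1" concrete, here is what one nonseasonal and one seasonal difference (period 12) amount to when written out. The backshift operator $B$, with $B x_t = x_{t-1}$, and the names $x_t$ (the series) and $y_t$ (the differenced series) are notation introduced here for illustration; they do not appear in the fitted model output.

$$
y_t = (1 - B)(1 - B^{12})\,x_t = x_t - x_{t-1} - x_{t-12} + x_{t-13}
$$

So each differenced value combines the current observation with the ones 1, 12, and 13 steps back, which is why the first 13 points of the series are lost after differencing.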


Under the hood, this is accomplished with Differences, which has both a two-argument form, for nonseasonal differencing, and a three-argument form, which allows for seasonal differencing.

diffed = Differences[Differences[trendAdded, 1], 1, 12]
ListLinePlot[diffed]
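
As a rough sketch of what the two forms of Differences do, the following uses a short made-up list rather than the weather series. The names x, d1, d12, both, and expanded are illustrative and not part of the original post.

```wolfram
(* Illustrative data only: 16 made-up values, not the weather series. *)
x = Range[16]^2;

(* Two-argument form: first differences at lag 1, x[[t]] - x[[t - 1]]. *)
d1 = Differences[x, 1];

(* Three-argument form: first differences with step 12, x[[t]] - x[[t - 12]]. *)
d12 = Differences[x, 1, 12];

(* The composition used above: lag-1 differencing followed by lag-12 differencing. *)
both = Differences[Differences[x, 1], 1, 12];

(* The same quantity written out term by term. *)
expanded = Table[
   x[[t]] - x[[t - 1]] - x[[t - 12]] + x[[t - 13]],
   {t, 14, Length[x]}];

(* Both lists are {24, 24, 24}, so this comparison gives True. *)
both === expanded
```

Only 3 of the 16 values survive, because the combined differencing consumes the first 13 points.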


I see, so it differences based on the orders of the model you give it? I.e., {"SARIMA",{12,1,1}} will do differencing at both lag 1 and lag 12? Thanks for your answer! Mathematica's time series functionality is mind-blowingly great compared to R and other tools that require you to do this manually. – Tom Hayden – 2014-12-21T00:18:54.237

Yes, it uses the orders of the model to do the differencing. I will say that R still has its place: what it lacks in automation and sometimes ease of use, it makes up for in completeness. Mathematica's time series functionality is great but still immature, so there will be things you can't easily do that may cause you to turn to R. If you go there, I recommend starting with auto.arima, which is the closest I've seen to TimeSeriesModelFit. Incidentally, you can get the best of both worlds by getting familiar with RLink. – Andy Ross – 2014-12-21T00:49:55.593