TimeSeriesModelFit incorrect?


I am trying to determine the optimal model for time series data (1592 observations). When I run TimeSeriesModelFit, Mathematica selects an $AR(1)$ model, the estimated $a_1$ coefficient is -0.067, and the Akaike Information Criterion (AIC) is reported as 8380 (I obtained this result using the "CandidateSelectionTable" property). However, when I estimated the $AR(1)$ model in Stata (MLE), I got an AIC of 12 898.08, although the estimates of the $a_1$ coefficient and of the intercept were almost exactly the same. At this point, I downloaded Gretl, ran the model again (MLE), and got an AIC of 12 898.08, i.e. the same value Stata had computed. Does Mathematica compute the AIC incorrectly, or am I making a mistake somewhere? The Bayesian Information Criterion (BIC) also differs significantly between Mathematica and Stata for the respective models. The data that I use is data=Differences[FinancialData["^GSPC","Close",{{2009,1,1},{2015,5,1}}][[All, 2]]], i.e. daily close price differences of the S&P 500 index.

EDIT: I suspect there is something wrong in the computation of the log-likelihood by TimeSeriesModelFit, which is essential in determining the AIC. One should calculate the AIC as $2k-2\ln\left(L\right)$, where $\ln\left(L\right)$ is the log-likelihood and $k$ is the number of estimated parameters. If I separately run LogLikelihood[ARProcess[0.788, {-0.067}, 192.514], data] (the numbers inside the ARProcess are those obtained from TimeSeriesModelFit and are almost the same as what Stata estimates), I get -6446.04, which is the same as in Stata. This is most likely the correct result, but TimeSeriesModelFit probably computes a different log-likelihood and hence a wrong AIC.
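
As a check on the arithmetic, assume $k = 3$ estimated parameters (intercept, $a_1$, and the innovation variance; this count is an assumption, it is not stated explicitly above). With the log-likelihood of -6446.04 the formula gives

$$
\mathrm{AIC} = 2k - 2\ln\left(L\right) = 2\cdot 3 - 2\left(-6446.04\right) = 6 + 12892.08 = 12898.08,
$$

which agrees with the value reported by Stata and Gretl, so under that assumption their AIC is consistent with the log-likelihood that LogLikelihood returns.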


Posted 2015-05-07T08:18:38.427




The calculation of AIC for a particular model varies among software programs, and yet none of them is necessarily wrong. Programs that compute it correctly can still report different values because some leave off constants that don't vary with the data.

What counts is that the difference in AIC values between two different models within one software package matches the same difference in other packages. (Even different procedures in SAS sometimes consistently obtain different AIC values for the same model, but within a procedure the differences in AIC values between models match up just fine.)

So the real test is to run a different model (with the same data) and compare the difference in AIC values between the two models among the different software packages.
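
A short sketch of why a dropped constant does not affect the comparison. The symbol $c$ is introduced here purely for illustration: it stands for whatever data-independent constant a package omits from the log-likelihood, and it is assumed to be the same for both models being compared. If the package reports

$$
\mathrm{AIC}^{*} = 2k - 2\left(\ln\left(L\right) - c\right) = \mathrm{AIC} + 2c,
$$

then for two models fitted to the same data

$$
\mathrm{AIC}^{*}_{1} - \mathrm{AIC}^{*}_{2} = \left(\mathrm{AIC}_{1} + 2c\right) - \left(\mathrm{AIC}_{2} + 2c\right) = \mathrm{AIC}_{1} - \mathrm{AIC}_{2}.
$$

The reported levels differ between packages, but the differences should agree, provided the same $c$ is dropped for every model.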

(I hope this is considered an "Answer" rather than a "Comment", but I understand if that's not the case. I don't have enough credit to comment.)


Posted 2015-05-07T08:18:38.427


+1 Wholeheartedly agree. I was about to post an answer along these same lines. In Mathematica's implementation constants are dropped, therefore the AIC differs from $2\left(k - \log\mathcal{L}\right)$. – Sasha – 2015-05-07T18:02:21.490


Well, the thing is the documentation says that the AIC IS computed as $2k - 2\ln\left(L\right)$. Moreover, if I run TimeSeriesModelFit[data,"GARCH"], the optimal model from the GARCH family is GARCH(1,1) with an AIC of about 37000, i.e. the AIC is much higher than that of the AR(1) model. That is not surprising, since Mathematica had selected the AR(1) over GARCH(1,1). But when I run GARCH(1,1) in Stata, I get a lower AIC than that of AR(1) from Stata, i.e. according to Stata, I should prefer GARCH(1,1) to AR(1)...

– Skumin – 2015-05-07T18:21:09.087

A further (and possibly slightly more relevant) comment: the difference in AIC between ARMA(1,1) and ARMA(2,1) is 0.33 in Mathematica, while the same difference is 1.81 in Stata and Gretl. – Skumin – 2015-05-07T19:36:07.763

@Skumin. I concede you have a point. SAS and R both give 1.81 for that difference. And for AR(1) vs. AR(2), both SAS and R give 0.025 as the difference in AIC values while Mathematica gives 0.041. It also appears that Mathematica leaves off a different constant for the AR models than for the GARCH models (which SAS and R do not do), and that makes it impossible to compare the fit of those two models using AIC. So maybe a bug report is justified. – JimB – 2015-05-08T14:26:05.643

@Skumin. The ar function in R does give a difference of 0.041 when using the Yule-Walker method and 0.025 when using maximum likelihood. Might Mathematica's TimeSeriesModelFit be picking and choosing from different methods (although I don't see the options to pick Yule-Walker vs. maximum likelihood or the Burg method)? – JimB – 2015-05-08T16:48:30.180

@JimBaldwin I don't really know whether Mathematica uses different methods... But it's probably the only possible explanation for the "differences in differences" of AICs. – Skumin – 2015-05-10T11:50:32.083