
I have been trying to obtain VAR model parameters using EstimatedProcess and FindProcessParameters. It becomes very slow as soon as I have more than 6 or 7 time series and a lag length greater than 2.

The data I am using is the z-scored daily log return of financial time series, with around 1000 data points.

It works fine for up to 8 series with lag 2, and only up to 6 series with lag 3 (the latter taking 15 minutes). Anything above that is either extremely slow or doesn't seem to finish at all: EstimatedProcess has now been running for 3 hours on a p = 3 VAR process with 7 time series.

I have also attempted this on data I simulated using ARProcess, and the problem is the same: it takes a long time, and above a certain number of series and lag length it doesn't return results (unless I didn't wait long enough). Here is the unprocessed data (data71517): https://www.wolframcloud.com/obj/078b7f99-bd9c-4b3e-b0ca-07a8cf65693c and the code I am using:

```
stocks7 = {"GOOG", "FB", "AMZN", "NFLX", "AAPL", "MSFT", "TSLA"};

(* force a value for every calendar day so that all series have equal length *)
data71517 = Table[
   QuantityMagnitude[FinancialData[stock, "Close", {date}]],
   {stock, stocks7},
   {date, DateRange[{2015, 1, 1}, {2017, 12, 31}][[All, 1 ;; 3]]}];

n = 7;
datath = Table[Thread[{Range[1096], data71517[[i]]}], {i, n}];

(* positions of "filler" days whose value merely repeats the previous day *)
datap = Table[
   Partition[
    If[datath[[i]][[#, 2]] == datath[[i]][[# + 1, 2]], # + 1, Nothing] & /@
     Range[1095], 1], {i, n}];
datacl = Table[Delete[datath[[i]], datap[[i]]], {i, n}];

(* interpolate the cleaned series back onto all 1096 days, then take logs *)
datalo = Table[Log[Interpolation[datacl[[i]]] /@ Range[1096]], {i, n}];
datadif = Table[Differences[datalo[[i]]], {i, n}];

(* z-score the daily log returns *)
dataz = Table[
   (datadif[[i]] - Mean[datadif[[i]]])/StandardDeviation[datadif[[i]]], {i, n}];

(* elm[a, b, c]: a b x c matrix of symbols, e.g. elm[a, 2, 2] -> {{a11, a12}, {a21, a22}} *)
elm = Function[{a, b, c},
   Table[ToExpression[ToString[a] <> ToString[i] <> ToString[j]], {i, b}, {j, c}]];

es = EstimatedProcess[Transpose[dataz],
  ARProcess[{elm[aa, n, n], elm[ab, n, n], elm[ac, n, n]}, elm[e, n, n]]]
```
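For reference, the same slowdown can be reproduced without any data download at all; here is a minimal sketch on simulated data (the coefficient values are arbitrary illustrative choices, and `Array` is used here to build the symbolic coefficient matrices — timings grow rapidly as `n` and the number of lag matrices increase):

```
n = 3; (* deliberately small so that the estimation finishes quickly *)
proc = ARProcess[{0.3 IdentityMatrix[n], 0.1 IdentityMatrix[n]},
   0.05 IdentityMatrix[n]];
sim = RandomFunction[proc, {0, 999}]["Values"];
EstimatedProcess[sim,
 ARProcess[{Array[aa, {n, n}], Array[ab, {n, n}]}, Array[e, {n, n}]]]
```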

Any help would be greatly appreciated.

This cannot be answered without any code and without the underlying dataset. – Henrik Schumacher – 2020-02-10T06:00:25.263

Hm. I cannot really tell what you are doing, but already the way you load the data is super inefficient. Better load it with one query like `a = FinancialData[stocks7, "Close", {{2015, 1, 1}, {2017, 12, 31}}];`. The `datacl` step looks as if you are trying to remove those days on which there is actually no data. You only have to do that because you forced it to return data for the missing days as well. If I am not mistaken, you can just do `datacl = QuantityMagnitude /@ Through[a["Values"]];`. But then again you interpolate over the list of all days with `datalo`, so I do not really get it. – Henrik Schumacher – 2020-02-10T07:46:07.930

What looks really, really evil is `elm`. What is it supposed to do? – Henrik Schumacher – 2020-02-10T08:01:09.607

Yes, I know it is quite inefficient and there is probably a better way of doing this. What I am doing is forcing it to give me data for all days so that the data sets for all the stocks have equal length; I then replace the "filler" data by interpolation. If I were to do `FinancialData[stocks7, "Close", {{2015, 1, 1}, {2017, 12, 31}}]` I would get data sets of unequal length (or sometimes of equal length but with different days missing) and wouldn't know where to interpolate. – Charles Dörre – 2020-02-10T08:02:13.127

`elm` just gives me a matrix of coefficients; e.g. `elm[a, 2, 2]` gives `{{a11, a12}, {a21, a22}}`. It saves me from having to write the coefficient matrices out by hand, as I would eventually like to do this for a larger number of series. – Charles Dörre – 2020-02-10T08:04:52.573

Oh, that can be done more easily with `Array[aa, {n, n}]`, `Array[ab, {n, n}]`, `Array[ac, {n, n}]`, etc. – Henrik Schumacher – 2020-02-10T08:08:28.923

The `TimeSeries` objects in the list `a` already do an "interpolation": for a nonexistent date, the value of the last preceding date is used. That is much more meaningful than interpolation, isn't it? – Henrik Schumacher – 2020-02-10T08:20:20.993

I am not sure I follow. When I do `FinancialData[stocks7, "Close", {{2015, 1, 1}, {2017, 12, 31}}]` I get 7 time series of lengths 746, 749, 749, 755, 755, 755, 743. If there is a way of getting the last value carried forward, that would be great, as obtaining the data would be much faster. I would still replace the carried-forward values by an interpolation, though, as I am also trying to study the correlation structure and partial directed coherence between the series, and those measures would be spiked by the sets having stagnant data in the same positions. – Charles Dörre – 2020-02-10T08:31:14.430
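Putting the comments together, a streamlined version of the pipeline might look like the sketch below. The resampling step — evaluating each `TimeSeries` on a common daily date grid, relying on its last-value-carried-forward behavior for missing dates — is my assumption of how to obtain equal-length series from the single query:

```
stocks7 = {"GOOG", "FB", "AMZN", "NFLX", "AAPL", "MSFT", "TSLA"};
a = FinancialData[stocks7, "Close", {{2015, 1, 1}, {2017, 12, 31}}];

(* evaluate every series on the same daily grid; dates without trading
   data receive the last available value *)
grid = DateRange[{2015, 1, 1}, {2017, 12, 31}];
vals = Table[QuantityMagnitude[ts[d]], {ts, a}, {d, grid}];

(* z-scored daily log returns *)
dataz = Standardize /@ Differences /@ Log[vals];

(* symbolic coefficient matrices via Array instead of elm *)
n = Length[stocks7];
es = EstimatedProcess[Transpose[dataz],
  ARProcess[{Array[aa, {n, n}], Array[ab, {n, n}], Array[ac, {n, n}]},
   Array[e, {n, n}]]]
```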

Oh, I see. I think I figured out how to get the last value carried forward using the time series generated by your code. Thanks! That will save me a lot of time when downloading the data. – Charles Dörre – 2020-02-10T09:11:51.543

You're welcome. – Henrik Schumacher – 2020-02-10T09:15:52.670