Why does TimeSeries forecast future values so poorly?

7

4

A bank account has many recurring income and expense entries.

I've managed to import some sample data and to plot it.

I can see a fairly clear recurring pattern: one wage arrives on the 22nd of every month, another on the 27th, and many payments happen in the first days of the month.

What could help me create a forecast function such as FutureMoney[today, futureDay, actualAmount]?

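ClearAll[f, file1];
(* Reinterpret each "day/month/year" string as an unambiguous date string *)
f = DateString[{#, {"Day", "/", "Month", "/", "Year"}}] &;
file1 = Import["C:\\Users\\Utente\\Documents\\1.csv"];
(* Skip the first row (presumably a header) and keep columns 1 and 2 *)
data1 = file1[[2 ;;, {1, 2}]];
data = MapAt[f, data1, {All, 1}];
(* Fit an automatically selected model, then forecast 30 steps ahead *)
tsm = TimeSeriesModelFit[data];
DateListPlot[{tsm["TemporalData"], TimeSeriesForecast[tsm, {30}]}]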


Any help?

Hi Revious, have you asked this before? Or do I have some deja vu? – Carl Lange – 2019-10-30T15:47:24.813

The title has nothing to do with the question, and no failure is reported. – Gosia – 2019-11-05T19:52:53.930

@CarlLange: yes, but no one answered, so I deleted it. – Revious – 2019-11-07T23:26:51.787

Naively using some kind of time forecasting method might not be the best approach. What about creating a histogram of the money withdrawn from or deposited into the account on each day of the month? That could be a good start. If the 22nd or the 27th falls on a weekend, that can change the pattern, so in those cases move the deposit to the next weekday instead. It's always good to start by implementing a simple baseline so that you know whether the fancy methods you might try later add any real benefit. Otherwise you'll end up using neural networks for problems that can be solved with linear interpolation. – C. E. – 2019-11-08T08:40:20.627
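The baseline described in the comment above might be sketched roughly as follows. The `transactions` list is hypothetical sample data (not the asker's file), and `byDay` and `nextWeekday` are names invented for this illustration.

```wolfram
(* Hypothetical sample transactions {date, amount}; not the asker's data *)
transactions = {
   {{2019, 8, 2}, -800.}, {{2019, 8, 22}, 1500.}, {{2019, 8, 27}, 1200.},
   {{2019, 9, 3}, -750.}, {{2019, 9, 22}, 1500.}, {{2019, 9, 27}, 1200.},
   {{2019, 10, 2}, -820.}, {{2019, 10, 22}, 1500.}, {{2019, 10, 27}, 1200.}};

(* Average amount per day of month: the "histogram" baseline *)
byDay = GroupBy[transactions, (#[[1, 3]] &) -> Last, Mean];

(* Move a date that falls on a weekend forward to the next weekday *)
nextWeekday[date_] := NestWhile[DatePlus[#, 1] &, date,
   MemberQ[{Saturday, Sunday}, DayName[#]] &];
```

Here `byDay[22]` gives the average movement seen on the 22nd, and `nextWeekday[{2019, 10, 27}]` shifts a Sunday to the following Monday. The sample data itself ignores weekend shifts; a real baseline would apply `nextWeekday` when projecting future deposits.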

You need to devise a way to test your predictions as well. Typically, you would train on the first 3/4 of the data, say, and then predict the last 1/4. Then you would compare your prediction with the last 1/4 of the real data (the so-called ground truth). You would use some kind of distance measure, such as the point-wise Euclidean distance, and track that measure over all of your different attempts. – C. E. – 2019-11-08T08:42:39.333

@C.E. but how could such a huge task be achieved in Mathematica? It seems close to a neural network. – Revious – 2019-11-08T13:37:04.090
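A minimal sketch of that train/test procedure, assuming a synthetic daily series in place of the real account data. The names `values`, `train`, `test` and `naive` are chosen for this illustration only.

```wolfram
SeedRandom[1];
(* Synthetic stand-in for a daily balance series; an assumption, not real data *)
values = Table[
   1000. + 300. Sin[2 Pi t/30] + 2. t + RandomReal[{-20, 20}], {t, 1, 120}];

(* Train on the first 3/4 and hold out the last 1/4 as ground truth *)
nTrain = Floor[3 Length[values]/4];
{train, test} = TakeDrop[values, nTrain];

(* Fit on the training part only and forecast the held-out horizon *)
model = TimeSeriesModelFit[train];
forecast = TimeSeriesForecast[model, {Length[test]}];

(* Euclidean distance between the forecast and the ground truth *)
error = EuclideanDistance[forecast["Values"], test];

(* Naive baseline for comparison: repeat the last training value *)
naive = ConstantArray[Last[train], Length[test]];
naiveError = EuclideanDistance[naive, test];
```

Comparing `error` with `naiveError` shows whether the fitted model actually beats the trivial baseline on data it has not seen.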

@Revious I'm afraid I don't understand your question. – C. E. – 2019-11-08T13:44:51.343

@C.E. I'm sorry, I meant that I'm not very experienced with Mathematica. I can't imagine how to build a predictor or how to look for a pattern. Are there functions for that, or is it more about programming? – Revious – 2019-11-08T13:48:09.913

@Revious I'm afraid that programming is a prerequisite for that type of analysis, but that's true for almost any somewhat complicated data analysis these days. – C. E. – 2019-11-08T20:48:15.853

8

To get output that looks a bit more reasonable, resample the data onto a regular time grid and pass an explicit model specification as the second argument to TimeSeriesModelFit.

ds = SemanticImport["https://textbin.net/raw/ikfqPWq6Iy"]

ts = TimeSeriesResample@TimeSeries[Values@Normal@ds]


Now, we can get output that has the repetition you were expecting with, for instance, an "ARMA" (auto-regressive moving-average) model:

tsm = TimeSeriesModelFit[ts, {"ARMA", {100, 5}}];
DateListPlot[{tsm["TemporalData"], TimeSeriesForecast[tsm, {160}]}]


Modifying the parameters of the model (above, I picked 100 for the autoregressive order and 5 for the moving-average order) will give different output. For instance, here's {20,30}:

There's a large list of models and their parameters in the documentation for TimeSeriesModelFit.
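One way to explore those model families is to fit a few by name and compare an information criterion. This is only a sketch on a synthetic series (an assumption standing in for the account data), and `series`, `families`, `fits` and `aic` are names made up for the example.

```wolfram
SeedRandom[1];
(* Synthetic series with a 30-step cycle plus noise; a stand-in, not real data *)
series = Table[
   1000. + 300. Sin[2 Pi t/30] + RandomReal[{-50, 50}], {t, 1, 150}];

(* Let TimeSeriesModelFit choose the orders within each named family *)
families = {"AR", "MA", "ARMA"};
fits = AssociationMap[TimeSeriesModelFit[series, #] &, families];

(* Akaike information criterion per family; lower suggests a better trade-off *)
aic = #["AIC"] & /@ fits
```

The AIC only compares fits on the training data, so it complements rather than replaces checking forecasts against held-out values.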

I'm not qualified to give you a good explanation of these different models or their parameters, but I hope this is a good start for you!

Thank you very much. May I ask you what the "@" syntax means? I'm trying to understand this part: TimeSeriesResample@TimeSeries[Values@Normal@ds] – Revious – 2019-11-08T13:39:45.237

Ah, it's the prefix operator, which is equivalent to: TimeSeriesResample[TimeSeries[Values[Normal[ds]]]] - so f@x === f[x]. – Carl Lange – 2019-11-08T14:27:00.523
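A few small checks illustrate how the prefix form behaves. They use only built-in functions and are meant as an illustration rather than a complete account of operator precedence.

```wolfram
(* Prefix form: f@x is f[x]; chained prefixes nest from right to left *)
Total@Range@4 === Total[Range[4]]         (* True *)

(* @ binds tighter than arithmetic, so this is Sqrt[16] + 9, not Sqrt[25] *)
Sqrt@16 + 9 === Sqrt[16] + 9              (* True *)

(* Postfix form: x // f is also f[x] *)
(Range[4] // Total) === Total[Range[4]]   (* True *)
```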

-2

forecast = TimeSeriesForecast[tsm, {30}] returns a TimeSeries object, so you can look up the value on a given date with forecast[date], where date is a DateObject, or list all the forecast values with forecast["Values"].
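As a rough illustration of those lookups, the sketch below builds a synthetic date-stamped daily series (an assumption; the real data comes from the asker's CSV) and queries the forecast. The variable names are chosen for the example.

```wolfram
SeedRandom[1];
(* Synthetic daily series from 1 Sep 2019 to 30 Oct 2019; not real data *)
start = DateObject[{2019, 9, 1}];
ts = TimeSeries[
   Table[{DatePlus[start, t],
     100. + 10. Sin[2 Pi t/7] + RandomReal[{-1, 1}]}, {t, 0, 59}]];

tsm = TimeSeriesModelFit[ts];
forecast = TimeSeriesForecast[tsm, {30}];

(* All forecast values, and the dates they refer to *)
forecastValues = forecast["Values"];
forecastDates = forecast["Dates"];

(* Value on one particular date inside the 30-day forecast window *)
valueOnDate = forecast[DateObject[{2019, 11, 5}]];
```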

1

It seems a very weird forecast: the time series is cyclic with a 30-day period, and this is the forecast: https://i.imgur.com/MMVsqPl.png

– Revious – 2019-11-07T23:31:54.217