## Sales forecast using SARIMAProcess and time-series data


I would like to ask for help with using the new Mathematica 9 time series functions to make a sales forecast.

For example, for one of our stores, I have this data set with 35 points, from January 2010 to November 2012 with sales in

salesData = {5.14, 5.32, 6.04, 5.84, 6.09, 6.03, 5.79, 6.26, 5.91, 6.44, 6.54, 7.76, 6.24, 6.19, 6.37, 6.72, 6.72, 6.52, 6.64, 6.96, 6.51, 7.03, 6.79, 8.11, 6.82, 6.96, 7.85, 7.68, 7.80, 7.80, 7.80, 8.22, 8.19, 8.67, 8.29}


If I plot it with DateListPlot as below:

DateListPlot[salesData
,{2010,1}
,Joined-> True
,AspectRatio->0.2
,DateTicksFormat->{"MonthShort","/","YearShort"}
,PlotLabel->Style["Sales Chart",18,Bold,Blue]
,ImageSize->800
]


I get:

My question is:

How do I use SARIMAProcess, TemporalData and TimeSeriesForecast to get the forecast and the prediction band with some confidence interval as in this picture?

In this case, the series shows yearly seasonality, which is why I believe the S in (S)ARIMA is necessary.

I'm new to time series, so if possible I would like a didactic answer. I am vague on the meaning of the SARIMA coefficients and how to determine them.

Are you missing a comma between the -3 and -4 elements in salesData? – Vitaliy Kaurov – 2012-12-16T03:05:19.470

This is not my area of expertise but I do have some interest in this for something I plan to work on next year. I don't have 9 installed though. Have you tried mimicking the Lake Mead example in the docs? – Mike Honeychurch – 2012-12-16T03:12:03.627

BTW does anyone know how the seasonal function(s) in Mma compare to the US Census Bureau X-12-ARIMA seasonal adjustment software, which seems to be very commonly used, if not the standard, for seasonal adjustment? – Mike Honeychurch – 2012-12-16T03:13:57.213

@VitaliyKaurov tks, list corrected. – Murta – 2012-12-16T03:28:04.660

@MikeHoneychurch Yes! But I don't know where the model part SARIMAProcess[{.8}, 0, {-.4}, {12, {.2}, 1, {.3}}, 4.12] comes from. – Murta – 2012-12-16T03:29:39.483

Now with the corrected list you still have a very high spike of 19 at the -3 element. Are you sure it is not a fluke or an error? I cannot reproduce your graph with that 19. Should it be 9? – Vitaliy Kaurov – 2012-12-16T03:30:59.217

@VitaliyKaurov tks again. I posted the data again. Now it's ok. – Murta – 2012-12-16T03:35:14.400

@Murta yes I see now. The docs example just pulls that out of nowhere. – Mike Honeychurch – 2012-12-16T03:53:50.847

It looks quite poorly documented actually. The seasonal order in the example is 12 so I guess that is an integer number of points across the seasonal cycle but I can't find where this is explicitly stated. – Mike Honeychurch – 2012-12-16T04:00:09.430

In the docs for SARMAProcess there are some examples of estimating the parameters. Maybe that is what they did in this example but left out that step as an oversight that wasn't picked up in the QA. http://reference.wolfram.com/mathematica/ref/SARMAProcess.html – Mike Honeychurch – 2012-12-16T04:05:49.447

If anyone has interesting links explaining SARIMA they are welcome too. – Murta – 2012-12-16T11:10:45.653

7

After some study, I think that I found out how to answer it using:

data = TemporalData[salesData,{{2010,1},{2012,11},"Month"}];
proc=EstimatedProcess[salesData,SARIMAProcess[{},1,{},{12,{a},1,{b}},v]];
forecast=TimeSeriesForecast[proc, data,{14}];

DateListPlot[N@{data["Path"],forecast["Path"]}
,AspectRatio->0.2
,Joined-> True
,PlotStyle -> Thick
]


For the error band I used:

errors=forecast["MeanSquaredErrors"];
bound=Sqrt[Last[proc]] Sqrt[errors["PathStates",1]] Quantile[NormalDistribution[],1-1/2 (1-.95)];
bounds=TemporalData[{forecast["PathStates",1]-bound,forecast["PathStates",1]+bound},{{forecast["Times"]},{forecast["Times"]}}];

DateListPlot[N@{data["Path"],forecast["Path"],Sequence@@bounds["Paths"]}
,Joined-> True
,AspectRatio->0.2
,Filling->{3->{{4},LightRed}}
,PlotStyle->{Thick,Blue,Sequence@@ConstantArray[Darker[Red],3]}
]
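
Written out, the band computed above appears to be the usual normal-theory interval. This is only a sketch of what the code does, not something taken from the documentation. The notation is my own: $\hat{y}_{T+h}$ is the forecast $h$ steps ahead, $m_h$ is the matching entry of `forecast["MeanSquaredErrors"]`, and $\hat{v}$ is the estimated noise variance `Last[proc]`. It assumes the forecast errors are approximately Gaussian and that $m_h$ is reported per unit of noise variance, which is why the code multiplies by `Sqrt[Last[proc]]`.

$$
\hat{y}_{T+h} \;\pm\; z_{0.975}\,\sqrt{\hat{v}\,m_h},
\qquad
z_{0.975} = \Phi^{-1}\!\left(1 - \tfrac{1 - 0.95}{2}\right) \approx 1.96
$$

Here $\Phi^{-1}$ is the standard normal quantile function, i.e. `Quantile[NormalDistribution[], ...]` in the code. Changing `.95` to another confidence level only changes the quantile $z$.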


Now I just have to understand the meaning of the SARIMAProcess terms better. Why I should use SARIMAProcess[{},1,{},{12,{a},1,{b}},v] instead of SARIMAProcess[{p},1,{q},{12,{a},1,{b}},v] or something else, I still don't know. But that's a math problem, not a Mathematica one.
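
Whichever orders are chosen, the meaning of the terms can at least be written down. As far as I can tell from the documentation, SARIMAProcess[{a1,…,ap}, d, {b1,…,bq}, {s, {α1,…,αP}, δ, {β1,…,βQ}}, v] corresponds to the equation below. The backshift operator $B$ (with $B\,y_t = y_{t-1}$) and the upper limits $p, q, P, Q$ are my own notation, so treat this as a reading aid rather than an authoritative definition.

$$
\Bigl(1 - \sum_{i=1}^{p} a_i B^{i}\Bigr)
\Bigl(1 - \sum_{j=1}^{P} \alpha_j B^{s j}\Bigr)
(1 - B)^{d}\,(1 - B^{s})^{\delta}\; y_t
=
\Bigl(1 + \sum_{i=1}^{q} b_i B^{i}\Bigr)
\Bigl(1 + \sum_{j=1}^{Q} \beta_j B^{s j}\Bigr)\, e_t,
\qquad e_t \sim \mathcal{N}(0, v)
$$

For the model used here, SARIMAProcess[{},1,{},{12,{a},1,{b}},v], there are no non-seasonal AR or MA terms, so it reduces to

$$
(1 - a B^{12})\,(1 - B)\,(1 - B^{12})\; y_t = (1 + b B^{12})\, e_t .
$$

Read this way, $(1 - B)$ differences away the trend, $(1 - B^{12})$ differences away the yearly pattern, $a$ ties the differenced value to the same month one year earlier, and $b$ ties it to the random shock from 12 months earlier. Adding {p} or {q} would add dependence on the previous month's value or shock.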

@rselva Here are some relations between Holt and ARIMA – Murta – 2014-07-05T01:51:57.660

Does anyone know how to easily put date ticks with month and year, as in the original plot? I'm having some difficulty. – Murta – 2012-12-18T02:02:46.207

It looks like your temporal data has no times! Consider news = {DateList[{2010, #, 0, 0, 0, 0}], salesData[[#]]} & /@ Range[Length@salesData]; newt = TemporalData[news]; DateListPlot[newt["Paths"]] – dwa – 2012-12-18T02:25:54.937

@dwa Tks! Corrected. – Murta – 2012-12-18T14:52:42.273

SPSS 'Expert Modeler' gives the Holt model as the best model for your data. The stationary R^2 value was 0.837. However, I have no idea how to use the Holt model in Mathematica – rselva – 2012-12-21T08:26:31.540

@rselva tks. I'm still studying how to do it in Mathematica. :) – Murta – 2012-12-21T08:32:25.430