How to find a model for a short discrete time-series?


I'm examining the activity of customers who have about one event per year, over several years. This results in many short time series, for which I found the following distribution (hit/miss over 4 years, sorted by probability in the data):

0000 : 0.31834
0001 : 0.17582
0010 : 0.13605
0100 : 0.13554
1000 : 0.12886
0011 : 0.01717
1100 : 0.01650
0110 : 0.01578
0101 : 0.01220
1010 : 0.01117
1001 : 0.00883
0111 : 0.00571
1110 : 0.00565
1111 : 0.00496
1101 : 0.00384
1011 : 0.00351

Apparently a purely uncorrelated binomial model won't do, but one can observe that whenever two patterns share both the number of 1's and the number of 11's, their probabilities are approximately equal (apart from a small recency effect for 0001).
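To make that observation easy to check, here is a minimal Python sketch that groups the 16 patterns by the pair (number of 1's, number of adjacent 11's) and prints the probabilities within each group. The names `observed`, `stats` and `groups` are only illustrative, and counting 11's as overlapping pairs (so 0111 has two) is an assumption about what is meant.

```python
from collections import defaultdict

# Observed pattern frequencies, copied from the table above.
observed = {
    "0000": 0.31834, "0001": 0.17582, "0010": 0.13605, "0100": 0.13554,
    "1000": 0.12886, "0011": 0.01717, "1100": 0.01650, "0110": 0.01578,
    "0101": 0.01220, "1010": 0.01117, "1001": 0.00883, "0111": 0.00571,
    "1110": 0.00565, "1111": 0.00496, "1101": 0.00384, "1011": 0.00351,
}

def stats(pattern):
    """Return (number of 1's, number of adjacent 11 pairs, overlaps counted)."""
    ones = pattern.count("1")
    pairs = sum(1 for x, y in zip(pattern, pattern[1:]) if x == y == "1")
    return ones, pairs

# Collect patterns that share the same two summary statistics.
groups = defaultdict(list)
for pattern, p in observed.items():
    groups[stats(pattern)].append((pattern, p))

for key in sorted(groups):
    members = groups[key]
    mean_p = sum(p for _, p in members) / len(members)
    listing = ", ".join(f"{s}={p:.5f}" for s, p in members)
    print(f"ones={key[0]} pairs={key[1]} mean={mean_p:.5f} | {listing}")
```

Each printed line is one group, so the spread of probabilities inside a group shows how well the two statistics alone describe the data.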

Can you see a way to approach such data to deduce a probabilistic model? Ideally one with only a few probability parameters that roughly explain this distribution.

Gerenuk

Posted 2014-10-31T08:50:23.160

Reputation: 345

I'm having trouble understanding your data set. What do the 1's and 0's correspond to? For example, what does 1011 mean in plain English? – Ben Haley – 2014-10-31T16:52:03.127

1011 means: Customer booked 2010, 2011 and 2013, but not 2012. It's basically a short time-series and each digit indicates if the customer booked that year (2010,2011,2012,2013). – Gerenuk – 2014-10-31T17:51:43.593

Oh, of course I meant 1011 = booked 2010, 2012, 2013 and not 2011. – Gerenuk – 2014-10-31T19:25:41.157

Haven't you answered your own question? What you provide in the question is a probability distribution. Another approach: model the number of hits by averaging/weighting the individual values (e.g., the chance of 2 hits is approximately .015 or so). – None – 2014-11-03T17:59:57.047

I want to find a mechanism for this distribution, and thereby reduce all these parameters to only a few. For example, just saying $P(1)=0.25$ wouldn't work, since a binomial distribution doesn't match. However, it almost does if you somehow include another parameter for consecutive hits. – Gerenuk – 2014-11-04T05:54:35.810
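One way to write down the idea in that last comment is a two-parameter log-linear model, P(pattern) proportional to exp(a * ones + b * pairs), where `ones` is the number of 1's and `pairs` the number of adjacent 11's. The sketch below is not from the thread: the model form, the names `a`, `b`, `model` and `fit`, and the plain gradient-ascent fit are all assumptions made for illustration. With b = 0 it reduces to the independent (binomial) model. No claim is made that it fits well; the printed comparison is there to judge that.

```python
import math

# Observed pattern frequencies, copied from the question.
observed = {
    "0000": 0.31834, "0001": 0.17582, "0010": 0.13605, "0100": 0.13554,
    "1000": 0.12886, "0011": 0.01717, "1100": 0.01650, "0110": 0.01578,
    "0101": 0.01220, "1010": 0.01117, "1001": 0.00883, "0111": 0.00571,
    "1110": 0.00565, "1111": 0.00496, "1101": 0.00384, "1011": 0.00351,
}

def stats(pattern):
    """Return (number of 1's, number of adjacent 11 pairs, overlaps counted)."""
    ones = pattern.count("1")
    pairs = sum(1 for x, y in zip(pattern, pattern[1:]) if x == y == "1")
    return ones, pairs

STATS = {s: stats(s) for s in observed}

def model(a, b):
    """Hypothetical model: P(pattern) proportional to exp(a*ones + b*pairs)."""
    weights = {s: math.exp(a * n1 + b * n11) for s, (n1, n11) in STATS.items()}
    z = sum(weights.values())
    return {s: w / z for s, w in weights.items()}

def mean_stats(dist):
    """Mean of (ones, pairs) under dist, renormalised in case it does not sum to 1."""
    total = sum(dist.values())
    m1 = sum(p * STATS[s][0] for s, p in dist.items()) / total
    m2 = sum(p * STATS[s][1] for s, p in dist.items()) / total
    return m1, m2

def fit(step=0.2, iterations=10000):
    """Maximum likelihood by gradient ascent.

    For this exponential-family form the gradient of the log-likelihood is
    (observed mean statistics) minus (model mean statistics).
    """
    target = mean_stats(observed)
    a = b = 0.0
    for _ in range(iterations):
        m1, m2 = mean_stats(model(a, b))
        a += step * (target[0] - m1)
        b += step * (target[1] - m2)
    return a, b

a, b = fit()
fitted = model(a, b)
print(f"a = {a:.4f}, b = {b:.4f}")
for s in observed:
    print(s, f"observed={observed[s]:.5f}", f"model={fitted[s]:.5f}")
```

At the fitted values the model reproduces the observed mean number of 1's and 11's by construction; any remaining gap between the two columns is what those two parameters cannot explain.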

No answers