## Performance on Multinomial Deviate is slow

3

Most of the time in my program is spent on this single line of code:

out = RandomVariate[MultinomialDistribution[nt, tmp]];


tmp is a vector of np+1 probabilities which sum to 1. nt is an integer. out is an np-component vector of integers that sums to nt.

pseudo-code:

Loop:

Get new np and nt values
set up tmp.
out = RandomVariate[MultinomialDistribution[nt, tmp]];

Do stuff with out
End Loop


Here is an example with timing results. In the real problem nt and the length of tmp are random but this example is enough to illustrate the difficulty. Is there an obvious way to improve the performance?

tmp = {.1, .1, .8}
nt = 2;
timer = TimeUsed[];
For[i = 1, i < 10000, i++,
(* tmp may be changed here *)
out = RandomVariate[MultinomialDistribution[nt, tmp]];
];
Print["time in loop=", TimeUsed[] - timer];


Sorry if this has been discussed before.

Please take a moment to format the question for readability and put the code in code blocks. Just click the edit link below the question and the formatting help will appear on the right. (What you need to do is select the code and press the code button above the edit box). – Szabolcs – 2014-03-17T18:57:59.727

Calling RandomVariate for results one at a time is a bit brain-dead. If you're going to need a set of variates, generate them at one go and use that result. E.g., for your example of 10K variates, there's a ~500 to 1 speed difference. – ciao – 2014-03-17T21:38:33.603

@jep - sorry, btw, did not mean you are brain-dead, meant brain-dead code :-) – ciao – 2014-03-17T21:46:02.400

1no worries rasher. I'm in a big loop (~10000) and each time through I need one multinomial variate. Each time through the loop nt takes on different values and tmp is a different length. There might be a few hundred distinct nt,tmp pairs. I can count the number of variates I need for each distinct nt,tmp pair and store it. It would be worth the effort I'm sure. So far everything I've thought of is pretty messy. – JEP – 2014-03-17T22:09:28.840

1I edited your post to clarify a bit. Unfortunately I don't think there's much you can do to speed this up if the parameters of the distribution are changing between each invocation. – Szabolcs – 2014-03-17T22:24:20.123

1Some unrelated comments: as a beginner, just don't use For. It's generally better to use functional constructs like Table. When you can't, use Do. It's more readable than For, less error prone (iterator is localized), and marginally faster. Instead of TimeUsed, use the function Timing or AbsoluteTiming. – Szabolcs – 2014-03-17T22:26:11.737

It's a pity you can't use RandomVariate to generate thousands of values. On my PC, time needed to generate those random values varies as 1 + 0.002*n (ms) with n the number of values to be generated. As you can see, setting up the random generator for the first number takes relatively long (though only a measly ms). – Sjoerd C. de Vries – 2014-03-17T22:39:53.347

@JEP: How wide is the range of 'nt', how many distinct values does it take on typically, and is the probability vector more varied than it? If it's not too crazy, you can quadruple the speed of generation by precomputing some RandomChoice functions. – ciao – 2014-03-18T08:33:50.420

The probability vector, tmp, can be considered to be a function of np and is easy to compute. np and nt both seem to run from 1 to 10-12. I've created an array that contains the number of times that I need each different np,nt pair. This will reduce the number of times I need to call the random number generator. I don't actually need the output vector. I just need the number of zero elements in each vector. – JEP – 2014-03-18T18:19:44.370

2

It took two days and required me to think which is, of course, disgusting, but I found a solution. The original pseudo-code loop (see above) was a loop over things that I think of as "boxes" (not mentioned in the original pseudo-code). The number of boxes has typically been 8,000 but I can envision problems for which it might be some millions. What I needed was essentially the number of zeros in each returned multinomial variate. For example if the variate was (1,2,0,0,5) the result to be stored would be "2" because there are two zeros in that variate. (THis is slightly neutered for simplicity.)

Now instead of running a single loop over all boxes I run a double loop over all (np, nt) pairs. The variable "boxes" contains a list of the indices for the current (np,nt) pair. The variates (called "out" above) are stored in er, "variates" and the random number generator returns Length[boxes] results instead of just 1 as in the original version. The results are stored in results[[boxes]] . I haven't done any timing yet but the original loop (over boxes) took about 5-6 seconds to run and the double loop version runs in a small fraction of a second.

Create lists of all np and nt values and store in "nps" and "nts".
Create function vec[np] which returns the probability vector (called tmp in the original).
Initialize result=ConstantArray[0, Nboxes]

Loop over all (np,nt) pairs (up to Max[nps] and Max[nts]):

boxes = Flatten[Position[Transpose[{nps, nts}], {np, nt}]];
variates =
RandomVariate[MultinomialDistribution[nt, vec[np]],
Length[boxes]];
result[[boxes]] = Map[Count[#, 0] &, variates];


Arg. I don't know if these things get bumped when they're edited. In any case I've realized that I made an error in my previous solution. I am more or less back where I started. I will rewrite a clearer version of the original pseudo-code below.

Loop over boxes from 1 to Nboxes;

Get new np and nt values
vec=vec[np,boxes];
out = RandomVariate[MultinomialDistribution[nt, vec[np,boxes]]];

Do stuff with out
End Loop


The new wrinkle here is that vec depends on the box index. The solution above assumed that vec=vec[np] but not vec=vec[np,box]. The box-dependence of vec kills that solution. Is there a way to compile the random variate code that will improve performance? Most of my tests are on cases where Nboxes=8000 and it takes 5-6 seconds to run through the loop. I'd like to be able to increase Nboxes to millions. Anyway, if this code were in C there wouldn't be a penalty for the repeated calls to the RandomVariate thing. I don't know how to get this to even compile currently.

0

I am confused at what is the aim but I post in the aim of facilitation.

If the aim is sampling from a fixed multinomial distribution with fixed probability vector and counting the number of 'empty boxes' in sample:

Count[#, 0] & /@
RandomVariate[MultinomialDistribution[10, {1/2, 1/4, 1/4}], 100]


is example 10 boxes, with probabilities {1/2, 1/4, 1/4} and as rasher observes you can choose whatever sample size you want:100, ...

If, however, as seems to be alluded to number of boxes is a random variable and there is a well defined function mapping number of boxes to probability vector I present a toy model for facilitation: let there be between 5 and 10 boxes (uniformly distributed) and the probabilities remain uniform, i.e. 8 boxes $\implies p_j =1/8, j\in\{1,2,...,8\}$. Counting the number of empty boxes with this setup:

rnum = RandomInteger[{5, 10}, 100];
rprob = ConstantArray[1/#, {#}] & /@ rnum;
Count[#, 0] & /@
RandomVariate[MultinomialDistribution[#1, #2]] &, {rnum, rprob}]


The number of boxes is constant and equals Nboxes. Each box has np particles and nt traps. While np and nt are known for each box they vary randomly between boxes. I am not sampling from a fixed multinomial distribution. For all boxes, np and nt are stored in nps and nts which are both lists of integers wit Nboxes entries. The number of variates I need for a particular (np, nt) pair is the number of boxes that contain that pair value. e.g. if there are 57 boxes with (np,nt)=(3,5) I need to draw 57 multinomial variates and then count & store the number of zeros in each of those 57 boxes. – JEP – 2014-03-21T17:24:10.490

I need the actual variate out because I need more than just the number of zeros. The actual code that is working and represents a major performance improvement is arg (unreadable code deleted). I guess I can't indent code in a comment? – JEP – 2014-03-21T17:31:26.557

I guess my previous comment didn't make it clear that I need to do draw multinomial variates for all (np, nt) values. So if there were 57 boxes that had (np,nt) = (3,5) there might have been 117 boxes that had (np,nt)=(4,7) 253 boxes with (np,nt)=(2,3) etc. – JEP – 2014-03-21T17:56:07.083

@JEP apologies for not understanding...time poor at present...thank you for clarification – ubpdqn – 2014-03-22T00:36:29.563

Actually ubpdqn, your answer does have the correct complexity. It fails by virtue of it's lousy performance. On 8,000 boxes your solution (which I now see is equivalent to my second answer (below) takes about 10 seconds on my laptop. I can generate the same number of deviates in C in 0.01 seconds. – JEP – 2014-03-27T15:43:43.260

0

Here is another answer which fails because it is too slow. The generation of the deviates takes about 10 seconds on my laptop. In C I can generate the same number of multinomial deviates in about 0.013 seconds. The code below does have the correct complexity which is why I am posting it. I am posting this here so that I can link to it from another question which I will ask. Namely how can I call my C multinomial deviate generator from Mathematica?

Nboxes = 8000;
p = RandomReal[{0, .01}, Nboxes];
vecn[n_] := Module[{u},
u = ConstantArray[p[[i]], n + 1];
u[[n + 1]] = 1 - n p[[i]];
Return[u];
];

rt = RandomInteger[{0, 10}, Nboxes];
rp = RandomInteger[{0, 10}, Nboxes];
vvs = Table[vecn[rp[[i]]], {i, Nboxes}];


Up to here everything is set up. The above is a toy version of my real problem. The distributions used above are all uniform. The ones in my actual problem come from simulation. In terms of complexity though the above is equivalent to my real problem.

F[x_, y_] := RandomVariate[MultinomialDistribution[x, y]]
out = MapThread[F, {rt, vvs}]; // Timing


I hadn't understood ubpdqn's post above which is cleaner than mine. It doesn't work because it is too slow. I could have simply done:

out = MapThread[
RandomVariate[MultinomialDistribution[#1, #2]] &,            {rt, vvs}]; // Timing
 – JEP  – 2014-03-27T15:53:12.703