## Speeding up random walk for many particles

16

6

I am trying to speed up this code for many particles to take a random walk. I'm not sure why it is so slow for such a simple task.

I got a few hints from colleagues to reduce the precision of the calculations, compile some of the functions, parallelize the code, or implement C code. I tried to implement each of these, but I most likely don't know what I'm doing because I couldn't get it to work. The particles can and should be simulated in parallel to take advantage of the two cores.

I have a bunch (10,000) particles that each take 1000 fixed steps in random directions. I have a bunch of lists with the length of number of particles, and I operate on those lists. This is faster than doing a nested loop, where I first loop over every particle and then loop over each time step. Two random angles are generated for each particle at each time step. Each particle's position is then incremented. I also want to know how many times each particle has crossed a spherical shell. The history of each particle is important: eventually, I will distinguish particles by the number of times they cross the shell so that they take bigger steps the more they have crossed.

I tried timing each step, and it seems like the trig functions slow things down. Is there any way to speed this up? Is there a better way to write this code to make it much faster? Eventually, I would like to add some more complications to this code and run it for many more time steps and particles. So, any general speed tips would help.

For comparison, the absolute timing of the Do loop is 3 seconds on my laptop with Core i5 M 540 @ 2.53 GHz, with 4 GB RAM.

numparticles = 10^4;
numsteps = 10^3;
particlesx = Table[1.001, {numparticles}];(*x-coordinate for each particle*)
particlesy = Table[0., {numparticles}];(*y-coordinate for each particle*)
particlesz = Table[0., {numparticles}];
numcrossings = Table[0., {numparticles}];(*counts the number of crossings for each particle*)

Do[
\[Theta]rand = RandomReal[{0., Pi}, numparticles];(*random polar angle for each particle*)
\[Phi]rand = RandomReal[{0, 2.*Pi}, numparticles];(*random azimuthal angle for each particle*)
rold = particlesx^2 + particlesy^2 + particlesz^2;(*original distance from origin for each particle*)
particlesx = particlesx + 0.01*Sin[\[Theta]rand]*Cos[\[Phi]rand];(*update each particle's position*)
particlesy = particlesy + 0.01*Sin[\[Theta]rand]*Sin[\[Phi]rand];
particlesz = particlesz + 0.01*Cos[\[Theta]rand];
rnew = particlesx^2 + particlesy^2 + particlesz^2;(*new distance from origin*)
isitoutside = (rold - radius)*(rnew - radius);(*yields negative number if the particle crossed the sphere, else positive*)
switchsides = Map[If[# < 0, 1, 0] &, isitoutside];(*if the particle crossed, it increments numcrossings*)
numcrossings = numcrossings + switchsides;
, {step, 1, numsteps}]; // AbsoluteTiming


Not an answer to the question, but perhaps it's good to mention that a simple and fast way to generate an n-dimensional (or if you wish: n-particle) random walk with m steps is Accumulate@RandomReal[{-1,1}, {m, n}]. – Szabolcs – 2012-05-12T10:48:59.403

13

This will give a modest improvement. I'm probably missing a few more though.

One other remark: this way of choosing a "random" direction is far from uniform.

numparticles = 10^4;
numsteps = 10^3;
particles = ConstantArray[{1.001, 0., 0.}, numparticles];
rnew = Map[#.# &, particles];
numcrossings = ConstantArray[0., numparticles];

In:=
runSim = Compile[{{numsteps, _Integer}, {radius, _Real}, \
{origparticles, _Real, 2}},
Module[
{numparticles, thetarand, phirand, rold, rnew, next,
particles = origparticles, isitoutside, switchsides,
numcrossings},
numparticles = Length[particles];
numcrossings = ConstantArray[0., numparticles];
rnew = Map[#.# &, particles];
Do[thetarand = RandomReal[{0., Pi}, numparticles];
phirand = RandomReal[{0, 2.*Pi}, numparticles];
rold = rnew;
next =
Transpose[
0.01*{Sin[thetarand]*Cos[phirand], Sin[thetarand]*Sin[phirand],
Cos[thetarand]}];
particles = particles + next;
rnew = Map[#.# &, particles];
switchsides = Clip[isitoutside, {0, 0}, {1, 0}];
numcrossings = numcrossings + switchsides;
, {numsteps}];
Round[numcrossings]
], CompilationTarget -> "C"];

In:= Timing[nc = runSim[10^3, radius, particles];]

Out= {1.53, Null}


Thanks. Apparently, I have to setup a C compiler, so I'll be looking into that. Could you point me toward a better way to pick a more uniform distribution of random angles? – Paul – 2012-05-11T23:24:46.077

1

@Paul perhaps this helps

– Rojo – 2012-05-11T23:30:56.140

1

@Paul, also, check this answer and the function randomVectorOnUnitSphere

– Rojo – 2012-05-11T23:35:32.483

15

As Daniel points out, you will always get a good speed-up by compiling. I want to take a different approach and consider how you can get a faster execution by writing your code in a more "functional" style.

My machine is slower than yours and ran your code in 17.2 seconds, according to AbsoluteTiming.

The first thing I noticed is that you are creating and overwriting the same variable definitions a lot in the body of your loop. This often slows things down.

The first thing to consider is creating all the random numbers in a single tensor instead of as two vectors that are re-defined each time.

starter =
Transpose[{RandomReal[{0., Pi}, {numsteps, numparticles}],
RandomReal[{0, 2.*Pi}, {numsteps, numparticles}]}, {2, 1,
3}]; // Timing

{0.885516, Null}


This creates a numsteps by 2 by numparticles tensor.

I'm going to set your loop up as a Fold that starts with the starting values particlesx etc and "folds" in the random numbers in each row of starter to update the values determined in the previous step.

The other thing I've done is change the function to determine if a crossing has taken place to use Sign rather than If in a Map. There is in any case a function Boole which does the If[something, 1, 0] construct, and it's Listable. Unfortunately Greater and Less aren't Listable so that's not an option.

Fold[With[{newx = #1[] + 0.01*Sin[#2[]]*Cos[#2[]],
newy = #1[] + 0.01*Sin[#2[]]*Sin[#2[]],
newz = #1[] + 0.01*Cos[#2[]],
rold = #1[]^2 + #1[]^2 + #1[]^2},
{newx, newy, newz,
0.5 - 0.5 Sign[(rold - radius)*(newx^2 + newy^2 + newz^2 -
{particlesx, particlesy, particlesz, numcrossings}, starter]; // AbsoluteTiming

{11.741202, Null}


Just this change in programming style gives a 30% speed-up, without compiling.

An alternative would be to use UnitStep in place of Sign, like this, but on my machine this gives only a small speedup:

Fold[With[{newx = #1[] + 0.01*Sin[#2[]]*Cos[#2[]],
newy = #1[] + 0.01*Sin[#2[]]*Sin[#2[]],
newz = #1[] + 0.01*Cos[#2[]],
rold = #1[]^2 + #1[]^2 + #1[]^2}, {newx, newy, newz,
UnitStep[-(rold - radius)*(newx^2 + newy^2 + newz^2 -
radius)] + #1[]}] &, {particlesx, particlesy,
particlesz, numcrossings}, starter]; // AbsoluteTiming

{15.007808, Null}


You might also be able to speed up the loop by constructing starter as the pre-calculated trig values

trigstarter =
0.01*Transpose[
With[{theta = RandomReal[{0., Pi}, {numsteps, numparticles}],
phi = RandomReal[{0, 2.*Pi}, {numsteps, numparticles}]},
{Sin[theta]*Cos[phi], Sin[theta]*Sin[phi], Cos[theta]}], {2, 1, 3}]; // AbsoluteTiming

{3.010354, Null}


And then the Fold is even faster:

result=Fold[With[{newx = #1[] + #2[], newy = #1[] + #2[],
newz = #1[] + #2[],
rold = #1[]^2 + #1[]^2 + #1[]^2}, {newx, newy, newz,
0.5 - 0.5 Sign[(rold - radius)*(newx^2 + newy^2 + newz^2 -
{particlesx, particlesy, particlesz, numcrossings}, trigstarter]; // AbsoluteTiming

{2.495747, Null}


This is a major speedup. By taking the trigonometric functions out of the loop, you are only pre-compiling them internally once.

You can then do things like:

Histogram[Last@result] If you are concerned with the amount of time taken to create the starter or trigstarter tensors, it's worth noting that you can break the number of steps up, run a Fold, save the result, and then use that result as the start of a new Fold. You could even run this as a Nest enclosing the Fold.

Just a note: the code creating "starter" and "trigstarter" seem to have wildly varying Timings and AbsoluteTimings for reasons I don't understand. This could be because I have a huge amount of other stuff open right now. – Verbeia – 2012-05-12T00:28:22.480

perhaps that's related to this?

– Rojo – 2012-05-12T01:20:25.537

@Rojo how come you remember my questions better than I do? :-D – Verbeia – 2012-05-12T01:42:30.690

Haha, totally inadvertent – Rojo – 2012-05-12T05:37:04.867

I also notice a nice speed boost, but if I increase either numsteps or numparticles by a factor of 10, it seems to eat up all my memory (I have 4 GB, with at most 3 free) and hang indefinitely, requiring shutting down Mathematica. Any ways around this? I would like the flexibility to be able to go up to 100k steps, perhaps more, without reducing the number of particles. I should also note that I do not need high precision on the particle positions. Does reducing the precision speed it up much? – Paul – 2012-05-23T22:36:53.580