Working with large data, pass-by-reference, compiled functions



I think I know the answer to this, but after two days of Googling, this is my last effort before going down an unwanted path...

I have a function that takes lots of large data as arguments as well as assigns new values to one of the arguments in place. Here is what it looks like:

    LinearIsotropicElastic[youngMod_,poisson_,node_,dilitation_,devElongState_,families_,volumes_,refMagState_,volState_,internalForce_] :=
Module[{ti,td,forceState,bulkMod=youngMod/3.0/(1.0 - 2.0*poisson),shearMod=youngMod/2.0/(1.0 + poisson)},

    ti = 3.0*bulkMod*dilitation[[node]]/volState[[node]]*refMagState[[node]];

    td = 15.0*shearMod/volState[[node]]*devElongState[[node]];

    forceState = ti + td;

    internalForce[[node]] += Total[forceState*volumes[[families[[node]]]]];

    internalForce[[families[[node]]]] -= forceState*volumes[[node]];

In pure Mathematica, I used SetAttributes[]=HoldAll and everything works fine. The issue is I need to map this function over many thousand "node" and this takes forever, therefore I'd really like to compile this function. I have not been able to successfully get the tricks that have been offered to compile a function while changing one of its arguments. This is standard pass-by-reference stuff in C of course. Are there any suggestions before I resort to writing this function in C and LibraryLink'ing it in? I really don't want to go down this path, because I actually already have a working similar application in C++ and was building this in Mathematica to be a prototyping code and not deal with all the complexity of C/C++. Any help will be appreciated.


Posted 2012-03-07T02:03:00.010

Reputation: 393

2Why don't you post a minimal self-contained example? This code is taken out of the context, and it is hard to say what can be the best optimization strategy for it. By looking at it, it will only minimally benefit from compilation, because it does not contain loops, and the function is quite light-weight (does only a few arithmetic operations). While direct pass-by-reference is not supported by Compile at present, there are several work-arounds, but without a self-contained example (with large sample data, perhaps randomly generated), it is hard to say what will work best. – Leonid Shifrin – 2012-03-07T06:04:21.093



As Leonid already commented, your code should not be especially slow. However one reason why your code may be slow is that you use Part to extract values, do some calculations and then insert the result. I would try using a wholesale approach and calculate all nodes at the same time by doing, e.g. (may need tweaking since I don't know the structure of your objects),

ti = 3.0*bulkMod*dilitation/volState*refMagState;
td = 15.0*shearMod/volState*devElongState;
forceState = ti + td;

internalForce += Total /@ forceState*volumes[[families]];
internalForce[[families]] -= forceState*volumes;

This way you do everything at once. I can try and help some more if you post examples of the data structures, especially what families and volumes return.


Posted 2012-03-07T02:03:00.010

Reputation: 815

Yes, as Timo suggests a good strategy would be to vectorize your code not to work on one node but all of them. – None – 2012-03-07T08:02:57.660

3Instead if Total /@ you could use Total[blah,{2}] and operate on the proper level. – None – 2012-03-07T08:04:07.033

@ruebenko Interesting - does that actually speed things up, though? – Verbeia – 2012-03-07T22:24:40.800

1@Verbeia, I don't know about speed - but it would not unpack, so it would save memory, I assume. – None – 2012-03-08T06:54:54.803

Timo, good to see you posting again. – Mr.Wizard – 2012-03-09T02:18:37.870

Busy, busy life :-). – Timo – 2012-03-09T05:01:53.397


A well designed CompiledFunction might work well for you. One trick that can help a lot is when you might want to do something like the following:

data = Range[60];  
Do[data[[i]] = 0, {i, 4, 47, 5}];        

If you have to change a lot of values the next approach does the same thing much faster.

data[[{4, 9, 14, 19, 24, 29, 34, 39, 44}]] = 0;  

The later approach makes all the replacements in one step. Rob Knapp of Wolfram Research used this in a CompiledFunction for an efficient implementation of LUDecomposition. See the notebook at

The same notebook includes a number of other tips on writing efficent code. Also, don't forget the extensive tips on Compile at How to compile effectively? on this site and the tutorials listed in the documentation for Compile.

I just noticed Timo gave the tip about making multiple changes with Part all at once. I don't have enough time to be as thorough as I used to. Maybe the references I provided will help.

Ted Ersek

Posted 2012-03-07T02:03:00.010

Reputation: 5 036