Efficiently collecting results inside a compiled function

23

10

When we don't know the number of results that will be generated, the usual way to collect results is Reap/Sow. Another alternative is linked lists. Neither of these are available in compiled functions.

AppendTo does work, but it has an $O(n^2)$ complexity, so it will be unacceptably slow for long result lists.

There was a very clever suggestion to use Internal`Bag in these situations:

Unfortunately a compiled Bag will only hold scalars. Let's try to use vectors:

cf = Compile[{}, 
  Module[{bag = Internal`Bag[{{0, 0, 0}}]}, 
   Do[Internal`StuffBag[bag, {i, i, i}], {i, {1, 2, 3}}]; 
   Internal`BagPart[bag, All]
  ]]

CompilePrint shows that this calls MainEvaluate, so this does not work.

What is the best way to collect a large number of results in a compiled function when the number of results is not known before the computation and the result type is not a (fixed size) vector or matrix?


Benchmarking the answers

  • Andy's answer

    cf = Compile[{len},
           Module[{bag = Internal`Bag[Most[{0}]]}, 
             Do[Internal`StuffBag[bag, {i, i, i}, 1], {i, len}];
             Partition[Internal`BagPart[bag, All], 3]
           ]
         ];
    
    Do[cf[500000], {100}]; // Timing
    
    (* ==> {2.87, Null} *)
    

    I needed to initialize the Bag using bag = Internal`Bag[Most[{0}]] to let Compile know that it is holding integers, not reals (see here).

  • Leonid's answer

    cf2 = 
      Compile[{len},
       Module[{arr, lim, ctr},
        arr = ConstantArray[{0, 0, 0}, 10];
        lim = Length[arr];
        ctr = 1;
        Do[
         If[ctr == lim,
          arr = Join[arr, Table[{0, 0, 0}, {lim}]];
          lim = Length[arr]];
         arr[[ctr++]] = {i, i, i},
    
         {i, len}
        ];
        Take[arr, ctr - 1]
       ]
      ];
    
    Do[cf2[500000], {100}]; // Timing
    
    (* ==> {16.474, Null} *)
    

Comparing the computational complexity of the two solutions by direct measurement:

data = Table[
   {Round[2^k], First@AbsoluteTiming@Do[cf[Round[2^k]], {100}]}, 
   {k, 13, 19, 1/2}];
data2 = Table[
   {Round[2^k], First@AbsoluteTiming@Do[cf2[Round[2^k]], {100}]}, 
   {k, 13, 19, 1/2}];

ListLogLogPlot[{data, data2}]

Mathematica graphics

(They're the same.)

Szabolcs

Posted 2012-02-17T20:07:45.320

Reputation: 213 047

1I think the reason cf2 performs poorly is that it is not completely compiled. There is still a call to MainEvaluate. – RunnyKine – 2015-06-19T23:54:57.187

Do you mean when the number of results is unknown and the type of each result is a fixed dimensional list? Or you're asking for the more general case of different lengths and dimensions of each result to be collected? – Rojo – 2012-02-17T20:17:34.673

@Rojo I am asking for the case when each result is the same type (a fixed dimensional tensor) but the number of results is not known beforehand. – Szabolcs – 2012-02-17T21:02:27.033

Answers

17

Assuming you know the dimensions of the pieces that you want to come out you can always add a second argument to Internal`StuffBag that indicates the rank of the elements going in. The result is still flat so you have to partition after the fact.

cf = Compile[{}, Module[{bag = Internal`Bag[]},
    Do[Internal`StuffBag[bag, {i, i, i}, 1], {i, {0, 1, 2, 3}}];
    Partition[Internal`BagPart[bag, All], 3]]];

Here I'm indicating that vectors will be going in to the bag. I would specify 2 for a matrix. Notice that it no longer calls MainEvaluate.

In[25]:= StringFreeQ[CompilePrint[cf], "MainEvaluate"]

Out[25]= True

In[20]:= cf[]

Out[20]= {{0., 0., 0.}, {1., 1., 1.}, {2., 2., 2.}, {3., 3., 3.}}

Andy Ross

Posted 2012-02-17T20:07:45.320

Reputation: 18 640

Hi, @AndyRoss. I still can not understand "indicates the rank of the elements going in". When outside Compile, we can directly stuff 1d list into a bag, and got a 2d list. For example, myBag = Internal`Bag[];Do[Internal`StuffBag[myBag, {i, i, i}], {i, 3}];Internal`BagPart[myBag, All]. Then why is it not working in Compile? In Compile[{}, Module[{myBag = Internal`Bag[]}, Do[Internal`StuffBag[myBag, {i, i, i}], {i, 3}]; Internal`BagPart[myBag, All]]]`, `Internal`StuffBag[myBag, {i, i, i}] can not be compiled. – matheorem – 2016-09-01T15:47:20.220

@matheorem By "can not be compiled" I assume you mean it makes a call to MainEvaluate because this compiles just fine for me otherwise. By adding the second argument indicating the dimension (I was sloppy with the work "rank") of the input it tells the compiler to expect a vector and does not require calling MainEvaluate – Andy Ross – 2016-09-01T16:19:13.617

@AndyRoss Yeah, I mean MainEvaluate. In http://mathematica.stackexchange.com/a/1028/4742 , Oleksandr R. did explicitly mentioned we can not stuffbag tensor directly into a bag. So bag in Compile is quite different from bag outside Compile. It seems that Bag in Compile must be flat while there is no such constraint outside Compile

– matheorem – 2016-09-02T01:17:44.820

@matheorem that is correct. It is an odd inconsistency – Andy Ross – 2016-09-02T03:18:36.997

1very useful information, +1. Cheating a bit though :) – acl – 2012-02-18T00:07:18.447

3Notice that this returns reals rather than integers. What is the best way to get it to work with integers (and also run faster)? @halirutan's trick of initializing the Bag as bag = Internal`Bag[Most[{0}]] does work, but I find it both ugly and confusing, so I was wondering if there is a better way (maybe through Compile's 3rd argument). – Szabolcs – 2012-02-18T08:10:01.890

1@Szabolcs there may well be another way but I'm unaware of it. I typically initialize the bag as you have if I want to avoid coercion to reals. – Andy Ross – 2012-02-18T17:53:30.453

8

As I mentioned in my recent answer, another alternative is to implement a version of a dynamic array inside Compile, as say arr = Table[{0,0},{10}] (collecting vectors of length 2, in this example) . Set up a variable (say lim) which gives the current size limit, initialize it to the initial size of the allocated array, and another one (say ctr) which counts the current maximal used position. Then, you can do something like

If[ctr==lim,
  arr = Join[arr,Table[{0,0},{lim}]];
  lim = Length[arr]
]

or, instead of doubling, use some other array expansion policy (other exponent, or additive, may depend on the problem). This adds one extra instruction (check) in your inner loop, but potentially saves you memory, and, unlike Internal`Bag, can be returned from Compile. This won't work when your results are lists of different lengths (dimensions), however.

Leonid Shifrin

Posted 2012-02-17T20:07:45.320

Reputation: 108 027

This is what I would do in C too. – Szabolcs – 2012-02-17T21:10:44.663

Can you please take a quick look at my update and verify that I did the benchmarking correctly? (Just to make sure the difference is not because of something I overlooked.) – Szabolcs – 2012-02-18T08:31:40.233

My point being that I don't really see why this would be slower than the Bag. I assume the Bag uses a similar technique internally. – Szabolcs – 2012-02-18T18:13:44.303

@Szabolcs Sorry for not replying earlier, I was away. Apparently, Internal`Bag is faster. The difference is less dramatic if you compile to C (because arrays indexing is penalized less), but it exists and is substantial. It is also very apparent in your test example, because no other computation takes place. In the method I suggested, the main time is spent in testing the overflow condition, but a sizable fraction of time also in array resizing. Given the apparent speed of Internal`Bag, it indeed looks like you will be better off with it, at least for this class of use cases. – Leonid Shifrin – 2012-02-18T21:30:36.433

1Leonid, congratulations on 6K rep. :-) – Mr.Wizard – 2012-02-19T04:45:04.693

@Mr.Wizard Thanks, and the same to you as well :) – Leonid Shifrin – 2012-02-19T13:42:01.710

Thank you. It appears I'm above you on this chart now which anyone can tell doesn't seem right. Maybe I should start looking for a place to put a Bounty. Either that or I have to start writing Leonid-grade answers. (o_o) – Mr.Wizard – 2012-02-19T13:48:47.710

1@Mr.Wizard No, this is correct. The rep measures the level of participation on the site, as well as the quality of the answers. There are people who are many times more knowledgable than me for example, but can'y afford to participate a lot, and have a lower rep. I don't think we should pay much attention to the rep - what matters is that the questions and answers are useful to others. You bring a lot to this site, so your rep is more than well-deserved. – Leonid Shifrin – 2012-02-19T13:59:05.230

1

I really appreciate that. I still would like to meet one of these "many times more knowledgeable" people you speak of. Or perhaps not.

– Mr.Wizard – 2012-02-19T14:09:08.440