## Compilation of Total with compensated summation

11

3

I sometimes obtain an unexpected error when trying to call a compiled version of Total with compensated summation turned on. More specifically I define

f1 = Compile[{{x,_Real,1}},Total[x],CompilationTarget->"C"];
f2 = Compile[{{x,_Real,1}},Total[x,Method->"CompensatedSummation"],CompilationTarget->"C"];


When I call the functions via

x=RandomReal[{0,1},10^5]
f1[x]
f2[x]


I sometimes, but not always, obtain a CompiledFunction::cfte error, informing me that the

"Compiled expression [some large number of order $10^{5}$] should be a rank 1 tensor of machine-size real numbers."

I have studied the documentation of that error but have been unable to isolate the problem any further. The results of f1 and f2 differ by up to $1\times 10^{-9}$ and the difference becomes even larger for larger input vectors, so I'd quite like to be able to use the (presumably more accurate) version with compensated summation turned on.

Question: What causes the CompiledFunction::cfte error and how can I avoid it when compiling Total with compensated summation?

In case it matters I am using version 9.0 for Linux x86 (64-bit).

4Looks like it might be a bug - I note that repeated calls with same argument eventually result in return with no error. Strange. Note also, however, that calling Total with CompensatedSummation results in a jump out to MainEvaluate, so any performance benefits to compiling Total get quashed. – ciao – 2015-04-06T22:35:15.967

1

Compensated summation does indeed make a difference: http://mathematica.stackexchange.com/questions/62917/why-does-taking-advantage-of-listable-change-the-results-of-a-numerical-computat/62929#62929

– dr.blochwave – 2015-04-07T09:16:40.310

16

## Compilation of Total

As pointed out in a comment, it seems that the compensated summation form of Total can't be compiled. You can check this using CompilePrint - note the call to MainEvaluate.

Needs["CompiledFunctionTools"]
CompilePrint@f2  (* from your question *)


## Summation in Mathematica

There seem to be plenty of options for summing a list in Mathematica. For example, you can use Plus, which apparently will always do compensated summation, or Tr[], inspired by this question.

uncompiledTotal[x_] := Total[x];
compiledTotal = Compile[{{x, _Real, 1}}, Total[x], CompilationTarget -> "C"];
compensatedTotal[x_] := Total[x, Method -> "CompensatedSummation"];

plus[x_] := Apply[Plus, x];
compiledPlus = Compile[{{x, _Real, 1}}, Apply[Plus,x], CompilationTarget -> "C"];

uncompiledTr[x_] := Tr[x];


## Compiled Kahan summation

There is another option, of course, which is to make your own machine-precision compiled implementation of Kahan summation.

kahanSummation = Compile[{{x, _Real, 1}},
Module[{s, c, y, t, i},
s = 0.;
c = 0.;
For[i = 1, i <= Length@x, i++,
y = x[[i]] - c;
t = s + y;
c = (t - s) - y;
s = t;];
s],
CompilationTarget -> "C",
RuntimeOptions -> "Speed"]


## Comparison: Accuracy

Now let's test the performance of these options, first in terms of the actual result, compared to the built-in compensated summation method (which I assume to be correct).

x = N@RandomReal[{0, 1}, 10^6];
TableForm[MapThread[{#1, NumberForm[Abs[#2 - compensatedTotal[x]], 16]} &,
{{"Uncompiled total", "Compiled total", "Compensated summation",
"Kahan summation", "Uncompiled plus", "Compiled plus", "Uncompiled Tr"},
{uncompiledTotal[x], compiledTotal[x], compensatedTotal[x],
kahanSummation[x], plus[x], compiledPlus[x], uncompiledTr[x]}}],
TableHeadings -> {None, {"Function", "Error"}}]


Clearly the behaviours of both Total and Plus depend on whether they are compiled or not. The key thing to note here is that the above compiled implementation of kahanSummation[] gives the same result as Mathematica's own compensated summation approach.

Graphically, we can use the code from ybeltukov's answer here to look at the residuals. See if you can spot the difference between my compiled Kahan method and the built-in!

## Comparison: Speed

Finally, it's also worth comparing the timings of each of the methods above, since that's typically a reason to use Compile[] in the first place.

The compiled implementation of kahanSummation[] is much faster than either Plus or Total with compensated summation, and the extra operations it utilises are only a small performance hit compared to the built-in implementation of Total.

3Very interesting. I must admit I'm surprised that Total doesn't use compensated summation by default, given that it is an RTL call from the VM, and that it is so much slower than the straightforward implementation. – Oleksandr R. – 2015-09-03T11:35:15.207

@OleksandrR. indeed, I was surprised it was so much slower (20x?) when compensated summation is turned on. Also curious about the changes when Plus and Total` are compiled. – dr.blochwave – 2015-09-03T11:39:18.760