I am trying to find the fastest way to calculate two values, each of which is a sum of a different expression. I combine both calculations in a single `Sum[]` and compile to "C".
Here are the test functions I have:

```
f = Compile[{{n, _Integer, 0}}, Module[{a},
Sum[Module[{}, a = Sin[-0.001 i^2]; {i*a, a}], {i, 1, n}]],
CompilationTarget -> "C", RuntimeOptions -> "Speed"];
g = Compile[{{n, _Integer, 0}}, Module[{a},
Sum[{i*Sin[-0.001 i^2], Sin[-0.001 i^2]}, {i, 1, n}]],
CompilationTarget -> "C", RuntimeOptions -> "Speed"];
h = Compile[{{n, _Integer, 0}}, Module[{a},
Sum[(a = Sin[-0.001 i^2]; {i*a, a}), {i, 1, n}]],
CompilationTarget -> "C", RuntimeOptions -> "Speed"];
q = Compile[{{n, _Integer, 0}}, Module[{},
{Sum[Sin[-0.001 i^2]*i, {i, 1, n}],
Sum[Sin[-0.001 i^2], {i, 1, n}]}], CompilationTarget -> "C",
RuntimeOptions -> "Speed"];
q2 = Compile[{{n, _Integer, 0}}, Module[{},
{Table[Sin[-0.001 i^2]*i, {i, 1, n}] // Total,
Table[Sin[-0.001 i^2], {i, 1, n}] // Total}],
CompilationTarget -> "C", RuntimeOptions -> "Speed"];
nc[n_] := {Sum[Sin[-0.001 i^2]*i, {i, 1, n}],
Sum[Sin[-0.001 i^2], {i, 1, n}]};
Benchmark[f_, n_] := Timing[f[n]];
TableForm[
Flatten /@ Table[Benchmark[fun, 10000], {fun, {f, g, h, q, q2, nc}}],
TableHeadings -> {{"f", "g", "h", "q", "q2", "nc"}, {"Timing",
"Result"}}]
```

I expected the function `h` to be the fastest because it reuses the expensive `Sin` calculation or, if the compiler is smart enough to implement the reuse itself, approximately the same speed from `f`, `g` and `h`.
Instead, `q` and `q2` are the fastest, and `g` is far faster than `f` and `h`, with the following results:

```
      Timing      Result
f     0.026824    104486.   -34.6114
g     0.000782    104486.   -34.6114
h     0.020543    104486.   -34.6114
q     0.000597    104486.   -34.6114
q2    0.000628    104486.   -34.6114
nc    0.001784    104486.   -34.6114
```

Why is this happening? My guess is that evaluation escapes from the compiled body, but why?
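One way to test that guess is to look for `MainEvaluate` in the instruction listing that `CompilePrint` returns, since that is how a call back to the main evaluator shows up there. The following is only a minimal sketch: `callsMainEvaluateQ` and `sq` are hypothetical names introduced for illustration, not part of the ``CompiledFunctionTools` `` package, and a plain string search is a rough check rather than a full analysis of the compiled code.

```
Needs["CompiledFunctionTools`"]

(* Hypothetical helper (not part of the package): True when the printed
   instruction listing of a compiled function mentions MainEvaluate,
   i.e. a call back to the main evaluator. *)
callsMainEvaluateQ[cf_CompiledFunction] :=
  !StringFreeQ[CompilePrint[cf], "MainEvaluate"];

(* Self-contained check on a trivial function that compiles fully. *)
sq = Compile[{{x, _Real, 0}}, x^2];
callsMainEvaluateQ[sq]

(* With the definitions from the question already evaluated: *)
(* callsMainEvaluateQ /@ {f, g, h, q, q2} *)
```

If the helper flags a function, reading the full `CompilePrint` output for it should show which expression triggers the call.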

**Update**

Big thanks to halirutan for a good answer! For completeness, I added a non-compiled version of his function `fHal`:

```
fHalNoC[n_] :=
With[{r = Range[n]}, Total /@ ({r*#, #} &[Sin[-0.001 r^2]])];
```

Then with a slightly modified benchmark function:

```
testRange = 10^# & @{3, 4, 5, 6};
Benchmark[f_, n_] :=
With[{results = Table[First@AbsoluteTiming[f[n]], {20}]},
Mean[results]];
TableForm[
Table[Benchmark[fun, n]/
n, {fun, {f, g, h, q, q2, nc, fHal, fHalNoC}}, {n, testRange}],
TableHeadings -> {{"f", "g", "h", "q", "q2", "nc", "fHal",
"fHalNoC"}, testRange}]
```

I got the following results (timings are normalized by list length):

I guess my lesson learned is that even a non-compiled version that takes advantage of `Listable` functions is faster than my timid attempts at tuning with compilation.
Full code available here.

Have you considered looking at the compiled code using ``Needs["CompiledFunctionTools`"]; CompilePrint[h]``? If you want to know what's going on after compiling, there is no way around a careful inspection of the generated code. – halirutan – 2013-06-25T00:06:55.507

Will do. Thanks again. – BlacKow – 2013-06-25T00:18:07.870