## Why won't this compiled function be inlined?


I've been trying to use Compile to make some code run faster, because I will need to call it millions of times. I've been using the InlineCompiledFunctions option, but it doesn't seem to work the way I am using it.

First, I compile a simple function

Clear[prod1c, dBeta12];
prod1c = Compile[{{value, _Real}, {count, _Integer}},
Apply[Times, value - Range[count]],
RuntimeAttributes -> {Listable}, Parallelization -> True];


Then, I want to inline that compiled function into the next compiled function

dBeta12 =
Compile[{{k, _Integer}, {x, _Real}, {alpha, _Real}, {beta, _Real}},
Total[
(-1)^#*Binomial[k - 1, #]*prod1c[alpha, k - 1 - #]*
prod1c[beta, #]*x^(alpha - k + #)*(1 - x)^(beta - # - 1)
&[Range[0, k - 1]]
],
{
{Binomial[_, _], _Integer, 1},
{prod1c[_, _], _Integer, 1},
{Range[_, _], _Integer, 1}
},
RuntimeAttributes -> {Listable},
Parallelization -> True,
CompilationOptions -> {"InlineExternalDefinitions" -> True,
"InlineCompiledFunctions" -> True}
];

Needs["CompiledFunctionTools`"]
CompilePrint[dBeta12]


The results contain

15 T(I1)3 = MainEvaluate[ Hold[prod1c][ R1, T(I1)4]]

16 T(I1)4 = MainEvaluate[ Hold[prod1c][ R2, T(I1)0]]

So the first compiled function was not inlined. What am I missing?

8

Replacing the use of a pure function with a more procedural approach seems to solve that problem (here I replaced * with Times for better clarity):

dBeta12 =
Compile[{{k, _Integer}, {x, _Real}, {alpha, _Real}, {beta, _Real}},
Module[{range},
range = Range[0, k - 1];
Total[
Times[
(-1)^range,
Binomial[k - 1, range],
prod1c[alpha, k - 1 - range],
prod1c[beta, range],
x^(alpha - k + range),
(1 - x)^(beta - range - 1)
]
]
],
{
{Binomial[_, _], _Integer, 1},
{prod1c[_, _], _Integer, 1},
{Range[_, _], _Integer, 1}
},
RuntimeAttributes -> {Listable},
Parallelization -> True,
CompilationOptions -> {
"InlineExternalDefinitions" -> True,
"InlineCompiledFunctions" -> True
}
];

CompilePrint[dBeta12]


gives

During evaluation of In[25]:= Compile::cfinll: The CompiledFunction CompiledFunction[{10,11.1,5852},{_Real,_Integer},{{3,0,0},{2,0,0},{3,0,3}},{<<1>>},<<1>>,{{6,0,6},{6,4,1},{35,6,2,1},{6,4,3},{3,2},{36,1,3,2,1},{4,3,6,-1},{40,43,2,1,1,2,1,2},{41,257,3,0,0,2,1,2,3,1,1},<<1>>,{33,1,3},{6,4,5},{3,4},{37,1,5,3,2},{16,3,2,4},{7,4,3},{4,5,3,-3},{1}},Function[{value,count},Times@@(value-Range[count]),Listable],Evaluate] could not be inlined because its use requires threading with the Listable runtime attribute.

During evaluation of In[25]:= Compile::cfinll: The CompiledFunction CompiledFunction[{10,11.1,5852},{_Real,_Integer},{{3,0,0},{2,0,0},{3,0,3}},{<<1>>},<<1>>,{{6,0,6},{6,4,1},{35,6,2,1},{6,4,3},{3,2},{36,1,3,2,1},{4,3,6,-1},{40,43,2,1,1,2,1,2},{41,257,3,0,0,2,1,2,3,1,1},<<1>>,{33,1,3},{6,4,5},{3,4},{37,1,5,3,2},{16,3,2,4},{7,4,3},{4,5,3,-3},{1}},Function[{value,count},Times@@(value-Range[count]),Listable],Evaluate] could not be inlined because its use requires threading with the Listable runtime attribute.

Out[26]= "
4 arguments
11 Integer registers
5 Real registers
9 Tensor registers
Underflow checking off
Overflow checking off
Integer overflow checking on
RuntimeAttributes -> {Listable}

I0 = A1
R0 = A2
R1 = A3
R2 = A4
I1 = 0
I2 = -1
I10 = 12
I5 = 1
I9 = 3
Result = R4

1   I4 = I0 + I2
2   I6 = I1
3   I7 = Subtract[ I4, I2]
4   T(I1)0 = Table[ I7]
5   I8 = I2
6   goto 8
7   Element[ T(I1)0, I6] = I8
8   if[ ++ I8 <= I4] goto 7
9   T(I1)1 = Power[ I2, T(I1)0]
10  I3 = I0 + I2
11  T(I1)2 = MainEvaluate[ Hold[Binomial][ I3, T(I1)0]]
12  T(I1)3 = - T(I1)0
13  I6 = I0 + I2
14  T(I1)4 = I6 + T(I1)3
15  T(R1)3 = CompiledFunctionCall[ Hold[CompiledFunction[{value, \
count}, Times @@ (value - Range[count]), -CompiledCode-]][ R1, T(I1)4]]
16  T(R1)4 = CompiledFunctionCall[ Hold[CompiledFunction[{value, \
count}, Times @@ (value - Range[count]), -CompiledCode-]][ R2, T(I1)0]]
17  I3 = - I0
18  R3 = R1 + I3
19  T(R1)5 = R3 + T(I1)0
20  T(R1)6 = Power[ R0, T(R1)5]
21  R3 = - R0
22  R4 = I5
23  R4 = R4 + R3
24  T(I1)5 = - T(I1)0
25  R3 = R2 + I2
26  T(R1)7 = R3 + T(I1)5
27  T(R1)5 = Power[ R4, T(R1)7]
28  T(R1)7 = CoerceTensor[ I9, T(I1)1]]
29  T(R1)8 = CoerceTensor[ I9, T(I1)2]]
30  T(R1)7 = T(R1)7 * T(R1)8 * T(R1)3 * T(R1)4 * T(R1)6 * T(R1)5
31  R4 = TotalAll[ T(R1)7, I10]]
32  Return
"


So the execution of prod1c no longer requires a MainEvaluate call, but a new warning tells us that the inlining still did not work.

Looking for the cause of this warning, I reduced the code to the following minimal non-working example, which produces the same warning:

fc1 = Compile[{{k, _Integer}},
Times[k],
RuntimeAttributes -> {Listable}
];
fc2 = Compile[{{k, _Integer}},
fc1[Range[k]],
CompilationOptions -> {
"InlineExternalDefinitions" -> True,
"InlineCompiledFunctions" -> True
}
];

CompilePrint@fc2


So it seems that inlining does not like listable compiled functions. In this MNWE the fix is therefore trivial:

fc1 = Compile[{{k, _Integer, 1}},
Times[k]
];
fc2 = Compile[{{k, _Integer}},
fc1[Range[k]],
CompilationOptions -> {
"InlineExternalDefinitions" -> True,
"InlineCompiledFunctions" -> True
}
];

CompilePrint@fc2


which compiles and inlines fine.
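One way to confirm this without reading the whole listing is to search the CompilePrint output, which is a string, for the two call types seen earlier in this thread. This is only a sketch: it repeats the fc1/fc2 definitions from above so that it runs on its own, and the variable name listing is made up for illustration.

```mathematica
Needs["CompiledFunctionTools`"]  (* provides CompilePrint *)

(* Same definitions as in the fixed example above. *)
fc1 = Compile[{{k, _Integer, 1}}, Times[k]];
fc2 = Compile[{{k, _Integer}},
   fc1[Range[k]],
   CompilationOptions -> {
     "InlineExternalDefinitions" -> True,
     "InlineCompiledFunctions" -> True}
   ];

(* CompilePrint returns a string, so it can be searched for the calls
   that showed up when inlining failed. *)
listing = CompilePrint[fc2];
{StringFreeQ[listing, "MainEvaluate"],
 StringFreeQ[listing, "CompiledFunctionCall"]}
```

If fc1 really was inlined, I would expect neither string to appear in the listing.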

Doing the same with your actual function we finally get:

prod1c = Compile[{{value, _Real}, {count, _Integer, 1}},
Apply[Times, value - Range[count]],
Parallelization -> True
];

dBeta12 =
Compile[{{k, _Integer}, {x, _Real}, {alpha, _Real}, {beta, _Real}},
Module[{range},
range = Range[0, k - 1];
Total[
Times[
(-1)^range,
Binomial[k - 1, range],
prod1c[alpha, k - 1 - range],
prod1c[beta, range],
x^(alpha - k + range),
(1 - x)^(beta - range - 1)
]
]
],
{
{Binomial[_, _], _Integer, 1},
{prod1c[_, _], _Integer, 1},
{Range[_, _], _Integer, 1}
},
RuntimeAttributes -> {Listable},
Parallelization -> True,
CompilationOptions -> {
"InlineExternalDefinitions" -> True,
"InlineCompiledFunctions" -> True
}
];

CompilePrint[dBeta12]


This seems to compile and inline without problems. I cannot test whether the result is what you would expect, though. Also, I'm not sure why the pure function poses a problem here.
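A possible way to check the result is to compare it with an uncompiled reference. The sketch below is an assumption built from the definitions in the question: the name dBeta12Ref is hypothetical, and Product stands in for the original scalar prod1c.

```mathematica
(* Hypothetical uncompiled reference (the name dBeta12Ref is not from the
   original posts). Product[value - i, {i, 1, count}] plays the role of
   prod1c[value, count]; an empty product is 1, just like Times @@ {}. *)
dBeta12Ref[k_Integer, x_, alpha_, beta_] :=
  Sum[
   (-1)^j*Binomial[k - 1, j]*
    Product[alpha - i, {i, 1, k - 1 - j}]*
    Product[beta - i, {i, 1, j}]*
    x^(alpha - k + j)*(1 - x)^(beta - j - 1),
   {j, 0, k - 1}
   ];

(* Compare on a sample point; the arguments are arbitrary test values. *)
{dBeta12Ref[3, 0.3, 2.5, 3.5], dBeta12[3, 0.3, 2.5, 3.5]}
```

If the two numbers differ, the vectorized prod1c is probably not computing the same per-element product as the original Listable version.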

Thanks for digging into the problem. Unfortunately, and for reasons I don't understand, when prod1c is changed to accept a vector/list/tensor of rank 1, it does not behave as the original function with RuntimeAttributes->{Listable}, and although I thought the problem could be solved by adding a third argument to the Apply function, that didn't work for the compiled version. – FalafelPita – 2017-07-18T19:22:06.770

For example: fun1 = Compile[{{value, _Real}, {count, _Integer}}, Apply[Times, value - Range[count]], RuntimeAttributes -> {Listable}]; fun2 = Compile[{{value, _Real}, {count, _Integer, 1}}, Apply[Times, value - Range[count]], Parallelization -> True]; fun3 = Apply[Times, #1 - Range[#2], 1] &; fun4 = Compile[{{value, _Real}, {count, _Integer, 1}}, Apply[Times, value - Range[count], 1], Parallelization -> True]; {fun1[5, Range[3]], fun2[5, Range[3]], fun3[5, Range[3]], fun4[5, Range[3]]} // MatrixForm – FalafelPita – 2017-07-18T19:23:07.240

The result of that is {{4.,12.,24.},{24.},{4,12,24},{4.,3.,2.}}. The first (sub-list) is what should happen. The second is what happens when prod1c is modified to accept a vector, the 3rd is a non-compiled fix by giving another argument to Apply, and the last is the result of the compiled fix. – FalafelPita – 2017-07-18T19:25:43.843

Note that CompiledFunctions with RuntimeAttributes -> Listable thread only up to depth one. So they are not really Listable in a strict sense. – Henrik Schumacher – 2017-07-18T19:32:44.900

@HenrikSchumacher well, looking again at the definition of fun there, your fun2 does something clearly different from fun1 etc. Just consider the fact that the Range there is effectively called as Range@Range@3. Does this modification do what you expect? fun5 = Compile[{{value, _Real}, {counts, _Integer, 1}}, Apply[Times, value - Range@# ] & /@ counts ] – glS – 2017-07-19T09:35:04.923

@FalafelPita sorry, wrong hashtag in the previous comment – glS – 2017-07-19T13:38:04.840

@glS Yes, fun5 does what I expect. I don't quite understand why compile won't let me do that without explicit mapping over scalars, but that is different from my original question about inlining, so I'm going to ask it as a separate question. – FalafelPita – 2017-07-20T20:33:16.480

6

The option "InlineCompiledFunctions" is notoriously unreliable for reasons I cannot recall. You can get it working with With, though.

Clear[prod1c, dBeta12];
prod1c = Compile[{{value, _Real}, {count, _Integer}},
Apply[Times, value - Range[count]]
];

dBeta12 = With[{cf = prod1c},
Compile[{{k, _Integer}, {x, _Real}, {alpha, _Real}, {beta, _Real}},
Total[(-1)^#*Binomial[k - 1, #]*cf[alpha, k - 1 - #]*cf[beta, #]*
x^(alpha - k + #)*(1 - x)^(beta - # - 1) &[Range[0, k - 1]]
],
RuntimeAttributes -> {Listable},
Parallelization -> True,
CompilationOptions -> {
"InlineExternalDefinitions" -> True,
"InlineCompiledFunctions" -> True}
]
];


As Binomial is not a CompiledFunction, it cannot be inlined. But you may write your own compiled version...
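A minimal sketch of such a compiled version, assuming integer arguments with 0 <= k <= n (which covers Binomial[k - 1, #] over Range[0, k - 1] above). The name binomialC is hypothetical, and I have not checked whether it inlines cleanly into dBeta12.

```mathematica
(* Hypothetical helper, not part of the original answer: binomial
   coefficient via the multiplicative formula. Assumes 0 <= k <= n;
   outside that range the loop is skipped and the result is wrong. *)
binomialC = Compile[{{n, _Integer}, {k, _Integer}},
   Module[{result = 1, kk = Min[k, n - k]},
    (* After step i, result equals Binomial[n - kk + i, i], so the
       integer division by i is always exact. *)
    Do[result = Quotient[result*(n - kk + i), i], {i, 1, kk}];
    result
    ]
   ];

binomialC[5, 2]  (* 10 *)
```

Because it takes scalar arguments, it would have to be called element by element inside the outer function, and large arguments can overflow machine integers.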

Edit: In the meantime, I found the source of this idea here.

I tried changing my code by just incorporating the With idea you shared, but still encountered an error until I removed the RuntimeAttributes->{Listable} option from the first small function, prod1c. The error stated "could not be inlined because its use requires threading with the Listable runtime attribute". So it seems that inlining and runtime listability can be incompatible. – FalafelPita – 2017-07-18T19:00:31.010

Yes, you are right. I observed the same and thus deleted the RuntimeAttributes option. – Henrik Schumacher – 2017-07-18T19:30:58.273