How to define a complicated function inside the body of Compile?

10

1

I want to compile a function in a way that keeps its memory footprint down. In the example below, I am trying to compile a function f that makes three calls to bigNastyFunction. I do not want to define bigNastyFunction outside and use the option "InlineExternalDefinitions" -> True, because then three copies of that function will be inserted into the body of the compiled function verbatim, leading to excessive memory usage. Instead, my strategy is to define bigNastyFunction inside a Module that sits inside Compile. My hope is that bigNastyFunction will be stored as a subroutine and called by the body as needed.

SetSystemOptions["CompileOptions" -> "CompileReportExternal" -> True];

f = Compile[{{x, _Complex}},
  Module[{bigNastyFunction = Function[{y}, Sin[y]]},

   bigNastyFunction[x] + bigNastyFunction[x^2]^2 + 3 bigNastyFunction[x^3]

 ]
]

The example above doesn't work because Function is not one of the functions that can be compiled by Compile. What workaround is there to define a large (complicated) function inside the body of a Compile'd function?

QuantumDot

Posted 2016-02-06T21:26:26.643

Reputation: 18 597

Answers

5

I think you need "InlineCompiledFunctions" -> False:

f = With[{bigNastyFunction = 
    Compile[{{y, _Complex}}, Sin[y](*,CompilationTarget\[Rule]C*)]}, 
  Compile[{{x, _Complex}}, 
   bigNastyFunction[x] + bigNastyFunction[x^2]^2 + 3 bigNastyFunction[x^3], 
   CompilationOptions -> {"InlineCompiledFunctions" -> False}]]
……
1 C1 = CompiledFunctionCall[ Hold[CompiledFunction[{y}, Sin[y], -CompiledCode-]][ C0]]
2 C2 = Square[ C0]
……
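
The excerpt above appears to be CompilePrint output. A minimal sketch of how one might inspect the result, assuming f has been defined as in the code above (the test argument 0.3 + 0.2 I is an arbitrary value chosen here for illustration, not taken from the answer):

    Needs["CompiledFunctionTools`"];

    (* Print the compiled instructions; look for CompiledFunctionCall lines *)
    CompilePrint[f]

    (* Rough measure of the memory taken by the compiled function *)
    ByteCount[f]

    (* Count CompiledFunction expressions nested anywhere inside f *)
    Count[f, _CompiledFunction, Infinity]

    (* Average evaluation time at one arbitrary complex argument *)
    First@RepeatedTiming[f[0.3 + 0.2 I]]

Comparing these numbers against a variant compiled without the "InlineCompiledFunctions" option would show whether the setting actually reduces the size of f for a given bigNastyFunction.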

Related posts:

Is the CompiledFunctionCall WVM opcode efficient?

CompiledFunctionCall vs. LibraryFunction

xzczd

Posted 2016-02-06T21:26:26.643

Reputation: 44 878

I think this takes more memory and time than just injecting the code (according to ByteCount and RepeatedTiming). – Michael E2 – 2016-02-07T03:45:56.593

@MichaelE2 I guess it… depends on how the big nasty function is defined :D – xzczd – 2016-02-07T03:51:29.540

+1 to your answer and your links, I learned some things today. – faysou – 2016-02-07T09:30:31.743

If you define bigNastyFunction = Function[{y}, Sin[y]] instead of using Compile, then ByteCount[f] goes down from 11704 to 4560, the same as @MichaelE2's first answer. – QuantumDot – 2016-02-07T11:51:51.190

Your method always produces functions with about twice as big a ByteCount as mine (more than that when bigNastyFunction is small) because three copies of the compiled function are stored in f. For expressions with a LeafCount up to about 500, your method is up to 20% slower than mine; from 500 or so up to 100,000, they're about the same speed; and at a LeafCount of 180,000, yours was about 10% faster. (The expressions were random plus/times combinations of elementary functions.) – Michael E2 – 2016-02-07T13:39:25.903

@xzczd Were you referring to Joel Klein? It didn't make sense to me until I read his answer just now. --- BTW, if both are compiled to C, yours is always a little faster. I'm not sure why: Calling subroutines vs. inlined code should introduce extra, if negligible, overhead; perhaps the optimization is different. (Yours is still bigger. I don't think the size is that important, but you seemed to bring it up initially.) – Michael E2 – 2016-02-07T14:43:01.643

@MichaelE2 Yeah, actually the memory cost is the only thing on my mind, because I think that's what the OP is concerned about. However, after checking FullForm@f, I noticed my answer is incorrect: it also inserts 3 copies of the (compiled) function into the body! So far I can't think of a way to define a subroutine as the OP desires… – xzczd – 2016-02-08T14:15:44.693

xzczd and @MichaelE2 thanks for trying. I'm beginning to think that this is an unfortunate limitation of Compile. I'll contact Wolfram Research about it. – QuantumDot – 2016-02-14T11:12:05.587

4

Here is one way:

With[{opts = SystemOptions[]},
 With[{bigNastyFunction = Function[{y}, Sin[y]]}, 
  Internal`WithLocalSettings[
   SetSystemOptions["CompileOptions" -> "CompileReportExternal" -> True],
   f = Compile[{{x, _Complex}}, 
     bigNastyFunction[x] + bigNastyFunction[x^2]^2 + 3 bigNastyFunction[x^3]],
   SetSystemOptions[opts]
   ]]]

Here's another way:

Module[{bigNastyFunction = Function[{y}, Sin[y]]},
 Block[{x},
  f = Compile @@ {{{x, _Complex}}, 
     bigNastyFunction[x] + bigNastyFunction[x^2]^2 + 3 bigNastyFunction[x^3]}
  ]]

To check:

Needs["CompiledFunctionTools`"]; 
CompilePrint@f
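
The check can also be done programmatically. A minimal sketch, assuming f has been defined by one of the snippets above: CompilePrint returns a string, so it can be searched for MainEvaluate, the instruction that marks a call back to the main evaluator.

    Needs["CompiledFunctionTools`"];

    (* True if the compiled code contains no callbacks to the main evaluator *)
    StringFreeQ[CompilePrint[f], "MainEvaluate"]

A result of True suggests that bigNastyFunction was fully compiled into f rather than left as an external call.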

Michael E2

Posted 2016-02-06T21:26:26.643

Reputation: 190 928

I think this differs little from defining bigNastyFunction outside and using the option "InlineExternalDefinitions" -> True, "because then three copies of that function will be inserted into the body of the compiled function verbatim". – xzczd – 2016-02-07T03:04:41.630

@xzczd Depends on how the big nasty function is defined.... – Michael E2 – 2016-02-07T03:08:52.730

1

It seems there is no solution like the one available in C. A workaround is to use a list, so that the compiled function is called from a single place in the code:

    bigNastyFunction = Compile[{{y, _Complex}}, Sin[y](*,CompilationTarget\[Rule]C*)]
    f = Compile[{{x, _Complex}}, 
       Block[{xl = {x, x^2, x^3}, coefList = {1, 1, 3}, fx}, 
        fx = Table[bigNastyFunction[ii], {ii, xl}];
        fx[[2]] = fx[[2]]^2;
        fx.coefList], 
       CompilationOptions -> {"InlineExternalDefinitions" -> True, 
         "InlineCompiledFunctions" -> False}];

The output of CompilePrint[f] is

    1 argument
    8 Integer registers
    5 Complex registers
    4 Tensor registers
    Underflow checking off
    Overflow checking off
    Integer overflow checking on
    RuntimeAttributes -> {}

    C0 = A1
    I3 = 0
    I7 = 4
    I6 = 2
    T(I1)1 = {1, 1, 3}
    I2 = 1
    I0 = 3
    Result = C1

    1   C3 = Square[ C0]
    2   C1 = Power[ C0, I0]
    3   T(C1)2 = {C0, C3, C1}
    4   I5 = Length[ T(C1)2]
    5   I4 = I3
    6   T(C1)0 = Table[ I5]
    7   I1 = I3
    8   goto 12
    9   C3 = GetElement[ T(C1)2, I1]
    10  C4 = CompiledFunctionCall[ Hold[CompiledFunction[{y}, Sin[y], -CompiledCode-]][ C3]]
    11  Element[ T(C1)0, I4] = C4
    12  if[ ++ I1 <= I5] goto 9
    13  C1 = Part[ T(C1)0, I6]
    14  C4 = Square[ C1]
    15  Part[ T(C1)0, I6] = C4
    16  T(C1)3 = CoerceTensor[ I7, T(I1)1]]
    17  C1 = DotVV[ T(C1)0, T(C1)3, I7]]
    18  Return
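
A quick sanity check, sketched under the assumption that f and bigNastyFunction are defined as above (z is a hypothetical test value introduced here), is to compare the compiled result with the expression evaluated directly:

    (* Arbitrary complex test point, chosen only for illustration *)
    z = 0.3 + 0.2 I;

    (* Difference between the compiled result and the direct formula;
       Chop replaces tiny numerical residue by exact 0 *)
    Chop[f[z] - (Sin[z] + Sin[z^2]^2 + 3 Sin[z^3])]

If the list-based rewrite is equivalent to the original sum, the difference should vanish up to rounding error.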

hlren

Posted 2016-02-06T21:26:26.643

Reputation: 77