Nested Table + GetElement in C-compiled function crashes kernel

14

2

Bug introduced in 8.0. [CASE:4075198]


Using a nested Table and then Compile`GetElement crashes my kernel when I run a function compiled to C. Here's an MWE:

wowC =
 Compile[
  {},
  Module[
   {mat},
   mat =
    Table[
     Table[1, {j, 5}],
     {i, 5}
     ];
   Compile`GetElement[mat, 1, 1]
   ],
  CompilationTarget -> "C"
  ]
wowC[]

Calling that will crash your kernel so do it in one you don't care about.

Note that the crash goes away if any one of the three ingredients is removed: CompilationTarget -> "C", the nested Table, or Compile`GetElement.

Anyone know why this is?

b3m2a1

Posted 2018-06-04T05:54:03.990

Reputation: 42 610

Answers

9

This really gives me headaches. I am not completely sure, but I suspect the cause is a badly initialized pointer. If so, this would be a bug (imho).

So, this is my current explanation for the behavior:

Here is the initialization code for our library function behind wowC. The important part is I0_8 = (mint) -1;

DLLEXPORT int Initialize_m00000849811(WolframLibraryData libData)
{
    if( initialize)
    {
        funStructCompile = libData->compileLibraryFunctions;
        I0_6 = (mint) 0;
        I0_0 = (mint) 5;
        I0_8 = (mint) -1;
        I0_2 = (mint) 1;
        initialize = 0;
    }
    return 0;
}

In the beginning of the main function, we find

I0_1 = I0_0;
I0_5 = I0_8;
dims[0] = I0_1;
dims[1] = I0_5;
err = funStructCompile->MTensor_allocate(T0_1, 2, 2, dims);
if( err)
{
    goto error_label;
}
P0 = MTensor_getIntegerDataMacro(*T0_1);
D0 = MTensor_getDimensionsMacro(*T0_1);

So T0_1 is initialized as a tensor with dimensions $5 \times (-1)$! For a usual MTensor, P0 = MTensor_getIntegerDataMacro(*T0_1); sets the pointer P0 to the first position of the data field of T0_1. (MTensors are actually flat, 1-dimensional lists of values together with information about the dimensions of the tensor, and maybe some pointers, e.g. to the first element of each row or higher-dimensional slice.) Since the data field is empty, P0 will quite likely be NULL or point to some other memory position that we don't know. Afterwards, T0_1 is extended by more and more rows through several calls of the form

err = funStructCompile->MTensor_insertMTensor(*T0_1, *T0_2, &I0_5);

But the pointer P0 is never updated. So in the end, when the result is about to be retrieved with

{
    mint S0 = I0_2 - 1;
    S0 = S0 * D0[1] + (I0_2 - 1);
    I0_5 = P0[S0];
}
*Res = I0_5;

this has to cause issues, either due to illegal memory access or due to dereferencing a NULL pointer.
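To make the suspected failure mode concrete, here is a minimal sketch in plain C. It does not use the Wolfram LibraryLink API: Grid, grid_append_row, grid_get and build_and_read are hypothetical names invented for illustration, and the assumption is only that the tensor is flat row-major storage that may move when a row is inserted. The sketch shows the safe pattern, where the data pointer and the dimensions are read after the last insertion rather than cached beforehand.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an MTensor: flat row-major data plus dimensions.
   This is NOT the LibraryLink type, only an illustration. */
typedef struct {
    long *data;   /* flat storage holding rows * cols entries */
    long rows;
    long cols;
} Grid;

/* Append one row of `cols` values. realloc may move the storage,
   so any data pointer cached before this call can become stale. */
static int grid_append_row(Grid *g, const long *row, long cols)
{
    long *grown;
    if (g->rows > 0 && cols != g->cols) return 1;
    grown = realloc(g->data,
                    (size_t)(g->rows + 1) * (size_t)cols * sizeof *grown);
    if (grown == NULL) return 1;
    memcpy(grown + g->rows * cols, row, (size_t)cols * sizeof *row);
    g->data = grown;
    g->cols = cols;
    g->rows += 1;
    return 0;
}

/* 1-based access with the same arithmetic as S0 = (i-1)*D0[1] + (j-1). */
static long grid_get(const Grid *g, long i, long j)
{
    assert(i >= 1 && i <= g->rows && j >= 1 && j <= g->cols);
    return g->data[(i - 1) * g->cols + (j - 1)];
}

/* Build a 5x5 grid of ones row by row, then read element (1,1). */
static long build_and_read(void)
{
    Grid g = { NULL, 0, 0 };
    long row[5] = { 1, 1, 1, 1, 1 };
    long result;
    int k;
    /* Caching `long *p = g.data;` at this point and reading p[0] after the
       loop would mirror the suspected bug: p would still be NULL here. */
    for (k = 0; k < 5; k++) {
        if (grid_append_row(&g, row, 5) != 0) {
            free(g.data);
            return -1;
        }
    }
    result = grid_get(&g, 1, 1);  /* pointer and dims fetched after growth */
    free(g.data);
    return result;
}
```

Calling build_and_read() returns 1. In the generated code above, the analogous fix would be to fetch P0 and D0 again after the last MTensor_insertMTensor call, but whether that is what the compiler is supposed to emit is a guess on my part.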

Man, I am happy that I don't have to work with C/C++ on an every day basis!

Henrik Schumacher

Posted 2018-06-04T05:54:03.990

Reputation: 85 430

I'm impressed. I was looking at the C code but I didn't have the time to properly deconstruct it. Any suggestions on how to work around this in general? (Other than by the obvious using [[ . ]] with these nested Table expressions) – b3m2a1 – 2018-06-04T08:02:53.503

I would use mat = Table[1, {i, 5}, {j, 5}]; for initialization. This has the advantage that it also avoids these calls to funStructCompile->MTensor_insertMTensor, which should involve a copy operation. Using Compile`GetElement[Compile`GetElement[mat, 1], 1] also works, though. – Henrik Schumacher – 2018-06-04T08:09:05.537

Unfortunately I'm using the nested Table in a much more complicated case where I can't simply do that. How will the second perform relative to [[ . ]]? – b3m2a1 – 2018-06-04T08:11:07.307

That's a good question! I am afraid that Compile`GetElement[mat, 1] will copy the whole first row so that it might actually be slower than using Part. But I'm not sure. I guess you have to give both a try. I am also curious how it performs but I have no time to test it at the moment. – Henrik Schumacher – 2018-06-04T08:15:03.753

"Man, I am happy that I don't have to work with C/C++ on an every day basis!" I hope you do realise that C code like this is not at all representative of C code that people actually maintain and like to work with. Working with C/C++ on a daily basis can be a lot of fun, but working with automatically generated code on a daily basis (in any language that is not assembly language from a good optimising compiler) is hell. – tomsmeding – 2018-06-04T10:37:15.180

@tomsmeding You have a point. Or let's say, a half one. ;) – Henrik Schumacher – 2018-06-04T10:48:17.307