What technical obstacles prevent all Mathematica code from compiling to C?



The following represents an attempt at a very simple view of the levels in Mathematica code:

[Figure: a very simple view of the levels in Mathematica code]

We have had lots of questions about deployment and compiling Mathematica code to C. Some of these concern speed, others deployment. My interest lies in deployment.

While some have mentioned in various comments that compiling "symbolic code" presents a "hard" problem, or that Mathematica has lots of complicated things that make it difficult to compile, I don't have a clear grasp of what really stands in the way of Wolfram doing this.

At some point all code that runs on a computer gets expressed at a machine code level. The C code to which some Mathematica code can compile clearly sits above this.

The answers to "How to specify Mathematica as a programming language?" provide some context for thinking about all of this.

I appreciate that Wolfram might have other business priorities. Those don't concern me in this question.

To my mind the design of a programming language/tool as compiled or interpreted seems more like a business/marketing decision than a hard choice that completely locks a language into one or the other. Why should such a design choice limit a universal machine?

I have wondered if everyone would regard such a question as this as out of scope here, but I do think someone could provide a specific answer and an answer would go a long way to understanding what we can do, what we could do, and perhaps what Mathematica will never enable us to do.

So, simply and directly, I'd like to know what technical obstacles prevent all Mathematica code from compiling to C (byte-code would do too ;-)?



Posted 2013-04-19T15:16:27.660

Reputation: 12 373

@Szabolcs so Compile[..,CompilationTarget->"C"] is also just bytecode? Much slower than real C? – matheorem – 2016-01-01T03:25:46.230

I think this is the same question as why we write in C and not in Assembly or, better yet, in machine code. Can it be done? Yes, you answered that already, but is it worth it? For numerics perhaps it is: fast prototyping in Mathematica and then implementation in C. But what about purely symbolic tasks? DSolve, for example, if I recall correctly, is about 10000 pages of mostly Mathematica code. Can you imagine how many people and hours must be invested in order to make it in C? And what will be the gain of this endeavor? – Spawn1701D – 2013-04-19T15:32:47.983

I think the real question is: why can't Mathematica run faster? Byte code (what Compile produces by default) is really just a low level interpreted language. Mathematica is a high level interpreted language. Mathematica is typically "slow". Compile's byte code is typically "fast". Your question is: why can't Mathematica run faster? Or why can't Mathematica be translated into something that will run faster? Am I correct? Or is the question only about deployment (not speed)? – Szabolcs – 2013-04-19T16:47:44.300

Another way to think of this is, why can't more of the stuff be compiled so that Mathematica is a great alternative for purely numerical stuff. For instance, all the statistical functionality could be compilable, with proper numerical checks to make it more useful. The intersection of Mathematica and a numeric library such as GSL (in terms of functionality) could be compilable, to make Mathematica a viable alternative for purely numerical tasks. – asim – 2013-04-19T17:26:22.757

@Szabolcs I think that the idea behind this question is more to create algorithms in a high level language such as Mathematica ( wolfram language at some time ) and produce performant machine code. Which is an approach to a good and fruitful discussion. – Stefan – 2013-04-19T17:26:54.793

@Szabolcs -- Despite a keen interest in things like parallel processing, over the past few years, my work has moved more to finding analytic solutions (geometric, matrix, decompositions). One can't do this with everything (or at least I haven't figured out how to do so), but in their specific applications these approaches run very fast so I have less of a need for brute speed than in the past. My main interest now revolves around deployment (we've had some of these conversations before). Still, wouldn't compiled code get you both? – Jagra – 2013-04-19T17:31:14.350

For instance my rampage for a non-2's complement BitNot. For low level programming this is really strange. High level shouldn't be an obstacle to the real world but an opportunity to be more expressive. When we implemented perl's new regex engine back in the day, no one said that the addition of that higher level resulted in slow code. A good example that high level doesn't always equal slower performance. – Stefan – 2013-04-19T17:35:42.930

@Stefan -- I found the most interesting thing about .NET when it appeared was that it allowed a programmer to use almost any language, but it essentially all compiled down to the same code. Much more complicated with Mathematica and all its extensions, but gee, why not? – Jagra – 2013-04-19T17:45:44.380

@Jagra The main problem with deployment is not compiling the problem. It's bundling all the needed functionality. With Mathematica, this practically means that you need to bundle the complete system (this is what the CDF player does), or at least the complete kernel because it doesn't allow separating and pulling out one piece of functionality and bundling only that. If you'd like to deploy something that uses Integrate, it would need to bundle the implementation of Integrate and everything it depends on. I think this is something that's not solved by WRI (at least yet). – Szabolcs – 2013-04-19T19:12:56.483

@Jagra They seem to be pushing the approach where you bundle the complete kernel instead and don't try to separate some pieces (i.e. just use the CDF Player or the pro version of that---I don't have the opportunity to try the pro/enterprise version) – Szabolcs – 2013-04-19T19:13:47.783

@Szabolcs -- Agreed, but pushing the various players seems more a "business" strategy rather than not doing something because of the difficulty. They seem to fear that it will hurt their franchise if applications built in Mathematica don't have to run on some WRI platform. – Jagra – 2013-04-19T19:47:40.503

I think bundling the complete kernel is pretty much unavoidable if the whole language is to be supported. How would a compiler determine what functionality to include if the code was something like ToExpression@FromCharacterCode@{50, 43, 50}? Or even something as simple as <<file. I think with a language like Mathematica that makes no real distinction between code and data, the difference between compiled and interpreted is more than just a marketing decision. – Simon Woods – 2013-04-19T21:41:38.463
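To make the point in the comment above concrete, here is a rough Python analogy (not Wolfram Language) of ToExpression@FromCharacterCode@{50, 43, 50}. The variable names are made up for illustration. The program text only exists once the character codes are decoded at run time, so a static compiler looking at the source cannot tell which operations the final evaluation will need.

```python
# Python analogy of ToExpression@FromCharacterCode@{50, 43, 50}.
# Names here are illustrative only; this is a sketch, not Mathematica semantics.

codes = [50, 43, 50]                    # character codes, plain data

# Analogue of FromCharacterCode: the data turns into program text at run time.
source = bytes(codes).decode("ascii")

# Analogue of ToExpression: a full evaluator must be present to run the text.
result = eval(source)

print(source, "->", result)
```

The same codes could just as easily have come from a file or from user input, which is why the evaluator (and everything it might call) has to ship with the program.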

@Stefan Re BitNot design, it would be an understatement to say I lost that battle. I got clobbered. Looking through my mail on the topic at the time, lo these 15 years later, is still distressing. – Daniel Lichtblau – 2013-04-26T14:46:09.760



The key to this is what exactly you mean by "compile to C code". If the question is whether it is possible to generate a collection of code conforming to the C standard that, when compiled and run, produces the same result as some general Mathematica function, then the answer is that there is no technical limitation. However, this C code will be very large and will contain within itself a copy of the Mathematica kernel.

Obviously such a compilation would not be too useful:

  • It would not be any more efficient than running a stock Mathematica kernel.
  • It would not present normal C-type interfaces.

A fully general Mathematica function cannot be translated to a normal "C"-API standalone function simply because the meaning of the function is not fully determined until it is actually executed. In other words, the mapping from the function definition to the processor instructions to be executed depends on the Mathematica environment and on the precise parameters passed to the function, which precludes translation to a normal C-API function.
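As a rough illustration of "not fully determined until executed", here is a small Python sketch. It is an analogy under stated assumptions, not how the Mathematica kernel works: the `definitions` table, `call`, and `f` are hypothetical names standing in for the kernel's mutable set of definitions and a user function that refers to another symbol `g`.

```python
# A Python analogy (hypothetical names, not Wolfram Language semantics).
# `definitions` stands in for the environment of definitions held by the kernel.
definitions = {}


def call(name, *args):
    """Look up `name` when the call happens; stay symbolic if it is undefined."""
    rule = definitions.get(name)
    if rule is None:
        return (name, *args)          # unevaluated, like g[2] remaining g[2]
    return rule(*args)


def f(x):
    """Roughly f[x_] := g[x] + 1, where the meaning of g is resolved at run time."""
    inner = call("g", x)
    if isinstance(inner, tuple):      # g had no definition: keep a symbolic sum
        return ("Plus", 1, inner)
    return inner + 1


before = f(2)                         # g undefined: result stays symbolic
definitions["g"] = lambda a: a * a    # the environment changes
after = f(2)                          # same call, now a number
definitions["g"] = lambda a: a * 10   # and changes again
later = f(2)

print(before, after, later)
```

The same source text for `f` yields a symbolic expression, then 5, then 21, depending only on the state of the environment when it runs. A fixed C function with a fixed signature and return type cannot express that without carrying the lookup table and evaluator along with it.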

Bojan Nikolic

Posted 2013-04-19T15:16:27.660

Reputation: 166

Such a compilation would be very useful, even with the entire kernel "attached" to it. It would allow a different use of Mathematica than the current ones. AFAIK, there is no runtime available as there is for MATLAB. If you want to call some user defined functions from a different environment, either you need the full Mathematica version installed, or you need to negotiate your special case with WR. I wish that our programs/functions could be more integrated into other environments. – P. Fonseca – 2013-04-26T11:53:58.287

I don't know why a specific compilation needs to compile the entire kernel. By all accounts, WRI has highly modularized the kernel. It would seem the "kernel" includes a wide range of components that C or other languages view as libraries. What I want wouldn't need to compile all of these, any more than one needs to distribute every existing C library with every application. I don't suggest WRI or anyone could do this easily or that it wouldn't require restructuring the kernel's code. Also, for me, I have more concern with distribution of stand-alone applications & components than efficiency. – Jagra – 2013-04-26T13:00:13.010