Why are numeric division and subtraction not handled better in Mathematica?

116

48

There is something that has been troubling me for a while. At least through version 10.0 the performance of a / b and a - b is not equivalent to, and significantly inferior to, Divide[a, b] and Subtract[a, b], despite the fact that these are treated as equivalent in the documentation. As a terse code fanatic it chafes me to have to write out the longer forms, but that is only the beginning of my concern, because the latter, faster forms are transparently converted into the former, slower forms.

Examples from the documentation for Divide:

x/y or Divide[x,y] is equivalent to x y^-1.

Divide[x,y] can be entered in StandardForm and InputForm as x ÷ y, x EscdivEsc y or x \[Divide] y.

Proof that these statements are false:

{a, b} = List @@ RandomReal[{-50, 50}, {2, 1*^7}];

Do[a/b, {50}]          // Timing // First
Do[a b^-1, {50}]       // Timing // First
Do[a\[Divide]b, {50}]  // Timing // First
Do[Divide[a, b], {50}] // Timing // First
2.262

2.231

2.184

0.437

Divide[a, b] is not evaluated in the same manner as the rest. Similarly for subtraction:

Do[a - b, {50}]            // Timing // First
Do[a + (-1 b), {50}]       // Timing // First
Do[Subtract[a, b], {50}]   // Timing // First
3.651

3.65

1.404

Although the examples above use packed arrays (see Why does list assignment with a packed array result in unpacked values? for an explanation of List @@) the same performance differential exists with unpacked lists.

One can see with Trace that the short forms induce additional operations:

a = Range[1, 2];
b = Range[3, 4];

a/b          // Trace
Divide[a, b] // Trace
{{a,{1,2}}, {{b,{3,4}}, 1/{3,4}, {1/3,1/4}}, {1,2} {1/3,1/4}, {1/3,1/4} {1,2}, 1/3,1/2}}

{{a,{1,2}}, {b,{3,4}}, {1,2}/{3,4}, {1/3,1/2}}
a - b          // Trace
Subtract[a, b] // Trace
{{a,{1,2}}, {{b,{3,4}}, -{3,4}, {-3,-4}}, {1,2} + {-3,-4}, {-3,-4} + {1,2}, {-2,-2}}

{{a,{1,2}}, {b,{3,4}}, {1,2} - {3,4}, {-2,-2}}

Note the false equivalence in this output; the fast Divide and Subtract operations are formatted as {1,2}/{3,4} and {1,2} - {3,4}, yet these slow forms are not part of their evaluation process.

More problematic is when this false equivalence changes evaluation. For example, if you try to work symbolically with Subtract and Divide in an evaluated expression and use the result these faster forms are replaced with the slower ones:

expr = Divide[x, y] + Subtract[x, y];
fn = Function[{x, y}, Evaluate @ expr]
fn // FullForm
Function[{x, y}, x + x/y - y]

Function[List[x,y], Plus[x, Times[x, Power[y,-1]], Times[-1,y]]]

Note also that Divide and Subtract are stripped when converting forms in the Front End (menu Cell > Convert To) so even without evaluation these operators may be lost.

Therefore I have these questions:

  • Why are these operations universally treated as equivalent when they are clearly not programmatically equivalent?

  • Why does Mathematica not include optimization for cases of numeric division and subtraction to eliminate the additional evaluation steps?

  • Is there a practical way to add global optimizations to automatically convert these operations to the fast forms before numeric evaluation?

  • Failing the above, what is the best way to work symbolically with actual division and subtraction operators for the sake of performance?

  • Is internally representing division and subtraction as multiplication and addition really the only mathematically valid design option? Couldn't these instead be first-class operators that are recognized as equivalent to Times and Plus for the purpose of pattern matching but not converted into Times and Plus (which introduces additional Times and Power operations)?

Mr.Wizard

Posted 2014-01-22T17:34:59.797

Reputation: 259 163

finally you're asking a question i was wondering constantly. maybe someone from WRI can shed some light on that dark matter, but i doubt it, as it is often the case if it is not covered by their marketing allowance... +1 – Stefan – 2014-01-22T17:41:58.263

@Stefan I've gotten that comment a number of times; I believe there are two factors: version 7 is faster in a number of simple cases, and Mathematica performance seems to correlate quite strongly to clock speed. I use an Intel i5-2500K at 4.6 GHz. It is not the newest chip (released January 2011) but I doubt the recent Haswell chips (e.g. i5-4670K) are significantly better for Mathematica as they don't run faster. – Mr.Wizard – 2014-01-22T17:59:35.807

1Something about the timing is interesting. On my PC which runs at 3.8 GHz (base is 3.2 GHz, Intel i7-3930K, MMA V. 9.0.1), The short form timing for subtraction is 4.59375 seconds whereas the long form is an astonishing 0.125 seconds (an order of magnitude faster than yours). – RunnyKine – 2014-01-22T18:07:04.610

@RunnyKine That's even more aggravating. Imagine all the numeric performance you may be losing due to the lack of proper optimizations. :-/ – Mr.Wizard – 2014-01-22T18:08:05.437

I agree. Thanks for this question, didn't realize such a huge difference existed between these forms – RunnyKine – 2014-01-22T18:09:22.183

Could it be you are timing parsing times? – Peltio – 2014-01-22T18:39:04.917

2@Peltio To whom are you speaking? I am fairly certain this is not a superficial difference as I attempted to show with Trace. – Mr.Wizard – 2014-01-22T18:41:11.887

I was speaking in general. Your trace seems to show that a-b is treated as a + (-b), hence the additional operations (despite the expressions be mathematically equivalent). But by parsing times I was thinking at the time needed to convert an infix operator into the call to the corresponding compiled function (which should be automatic in case you call it directly in the code). Mine was just a guess. – Peltio – 2014-01-22T18:56:29.053

To clarify: a/b require parsing time to be converted to Time[a,b^-1] (or Times[a,1/b], I don't remember) and this form requires more computations then Divide[a,b]. But the infix form of this latter procedure also require parsing time to be transformed into a call to Divide. When the operation is fast, parsing time could make the difference. Just guessing, eh... – Peltio – 2014-01-22T19:13:55.970

Nice question! This is true even for compiled expressions. – ybeltukov – 2014-01-22T19:24:10.097

1Something funny is going on. The results for power are all slower than the Divide. Do[a*b, {50}] // Timing // First Do[Times[a, b], {50}] // Timing // First Do[a b, {50}] // Timing // First. Also, while running this, more than one core is used. – Ajasja – 2014-01-22T20:18:26.300

@Ajasja Interesting! Each multiplication method appears to use a single core, while Divide[a, b] uses multiple cores. Is that the behavior you are seeing as well? – Mr.Wizard – 2014-01-22T20:33:10.383

No, that's not what I'm seeing. I can see activity on six cores for each of the variants, though my license allows only 4. Maybe hyperthreading has to do with that. Or windows is splitting up each of the kernels over more than one core. – Sjoerd C. de Vries – 2014-01-22T21:07:17.963

@Sjoerd I am running the code in a single Kernel. The parallelism is taking place in the Intel MKL I believe. How do your timing ratios compare to the ones in my question? – Mr.Wizard – 2014-01-22T21:13:37.457

@Mr.Wizard I see activity on ~3 of the four cores (the cpu usage is not 100%). My timings have similar ratios as your. Are you sure you're running this on only one core. (Oh I use MMA 9.01 and win7x64) – Ajasja – 2014-01-22T21:28:37.747

1Running on a somewhat elderly 8 core intel xeon machine (win7 / M9.0.1). I see two oddities. The a/b variant uses ~10Mb more memory at peak than the Divide[a,b] case. The a/b variant also engages only ~40% of the machine's aggregate compute capacity (in task manager). The Divide[a, b] peaks at 66%. This suggests something down in the library calls made. Would be interesting to see the same under linux. – Ymareth – 2014-01-22T21:28:42.227

I see similar results to Ymareth on an i5 processor (win7, MMA 9.0.1) - all methods use all 4 cores roughly equally but the multiplication versions run at about 40% CPU with Divide at about 70% – Simon Woods – 2014-01-22T21:37:15.763

I get 11.544074, 11.715675, 11.419273, 1.840812. Obviously, your computer is much faster than mine, but the timing ratios are similar. I can confirm Ymareth's and Simon's results about peak loads. – Sjoerd C. de Vries – 2014-01-22T21:48:03.090

And I'm running a single kernel too. Confused threads and kernels. – Sjoerd C. de Vries – 2014-01-22T21:55:47.003

I think it would help you a lot to fix the parsing, so the short forms are converted to Subtract and Divide. I actually do that (for Divide only and for a different purpuse) – Rojo – 2014-01-23T13:29:22.910

If there are symbols around, it will still end up unevaluated as slow form, but we don't have a SimplifyForPerformance function anyway (we should). I also would change the makeboxing of Power[_, _?Negative] so that it is visually evident when the slow forms are used – Rojo – 2014-01-23T13:31:44.567

1The Attributes of Power, Times, Plus, are more detailed than the Attributes of Subtract and Divide, which likely means that they carry more overhead in terms of checking. Using //FullForm to check how things are entered in the interpreter, Divide and Subtract seem to be used only if they are entered explicitly. When entered in the short forms the longer constructions seem to be taken by the interpreter. Clearly this could be improved. This difference can be seen with Divide[#1,#2]&//FullForm and #1/#2&//FullForm. – Bob R – 2014-02-06T02:35:40.243

@Mr.Wizard Is there a remedy for this efficiency problems? Perhaps a set of substituting rules that apply to FullForms? Could I start a bounty for a complete answer on your question ? - I think this should be fully answered. – tchronis – 2014-02-21T11:50:13.587

@tchronis I have no general solution. I asked this question hoping that someone else already had implemented one (or to learn that it was fixed in a later version), and to spread awareness of the issue. I have a few ideas that if successful would improve this for a number of cases, but many would be left out, and I think an incomplete fix is worse than none is some ways. Don't spend your reputation on a bounty; if this is still unanswered in a few weeks I'll either post an (incomplete) answer myself and/or start a bounty. – Mr.Wizard – 2014-02-21T22:33:45.537

Part of me is relieved it's slower to use non FullForm language. I'm new, but isn't FullForm what's sent to the calculator (the Kernal)? Yet there certainly seems to be a bug in the front end: It does seem crazy to translate x/y into Times[x, Power[y,-1] instead of into Divide[x,y]. By the way, when I wrap Divide[x,y] in Simplify it gives back x/y, which I also find goofy. It should give back Divide[x,y]. Maybe the front end should have an option for the FullForm command which if invoked would substitute faster FullForms like Divide[x,y] for equivalent slower ones like Times[x, Power[y,-1]! – GeofAndron – 2014-03-14T00:22:02.663

One solution might be to build into Mathca an (ever growing) list of functions which are alternatives and faster compared to other functions. Of course calling such an option might totally change the FullForm of our code. For example, I may not be aware that treating my data as matrices might vastly accelerate computation, but Mathca might "know" that... – GeofAndron – 2014-03-14T00:30:44.860

I don't see what the timings have to do with the original question. Surely when the documentation says the various forms of division are the same, it means that the output is the same. I would expect variation in what happens to convert each form into the form that actually gets executed to perform the division. – murray – 2014-03-21T19:30:52.970

@Mr.Wizard could the differences in implementation be for symbolic reasons? i.e., a - b becoming a + (- b) is the way it would be handled by Mathematicians symbolically / algebraically (and decreases the number of cases WRI needs to code in for a bunch of things). And so WRI performs that conversion so they can simplify how they handle things like Solve? – b3m2a1 – 2016-12-30T21:05:41.710

Answers

14

An extended comment. I'm not sure if this has been realized, please correct me if it has. The result of the Divide[a,b] operation is not the same as the first 3 which are identical.

{a, b} = List @@ RandomReal[{-50, 50}, {2, 1*^7}];

x1 = a/b;
x2 = a b^-1;
x3 = a/b;
x4 = Divide[a, b];

Now...

Tally[x1 - x2]
Tally[x2 - x3]

Both give 10^7 zeros.

Tally[x3 - x4]

Gives ~ 7,500,000 zeros and 2,500,000 slight differences.

Ymareth

Posted 2014-01-22T17:34:59.797

Reputation: 4 581

2x4 is also what I get from using the apache commons math library (3.2) ebeDivide method on RealVectors made from a and b using JLink and loading the results back to Mathematica. So Divide[a,b] in Mathematica == a / b in Java. – Ymareth – 2014-04-04T12:10:17.643

2

This has been observed, and it was discussed in (39200) and on MathGroup as linked by Daniel in the first comment below that question. Nevertheless it's an observation I failed to include in my question.

– Mr.Wizard – 2014-04-04T19:07:13.020