Is Mathematica's lexical scoping broken?

14

5

Why do these two lines output different things (b and 0, respectively)? Is it a bug?

ReleaseHold@With[{a=0},FlattenAt[Map[List,FlattenAt[Hold@With[{b=0,b=b},b+0a],{1,1}],{2}],{1,-1}]]
ReleaseHold@With[{a=0},FlattenAt[Map[List,FlattenAt[Hold@With[{b=0,b=b},b+0b],{1,1}],{2}],{1,-1}]]

(Sorry, I couldn't find a shorter example that didn't give an error.)

user541686

Posted 2016-03-08T06:52:27.113


5

This is not a bug, see for instance tutorial/LocalConstants (in particular the last line).

– None – 2016-03-08T13:20:25.433

1

The fact that renaming can occur is documented, e.g. in the Details section of the With documentation: "With constructs can be nested in any way, with inner variables being renamed if necessary." There are examples of it in the Properties & Relations and Possible Issues sections, but it's not clearly stated when renaming is "necessary".

– jkuczm – 2016-03-08T13:53:41.757

1

Related: Enforcing correct variable bindings and avoiding renamings for conflicting variables in nested scoping constructs. @Mehrdad If you think that simulating scoping by means of variable renaming is a debatable design decision, then you're not alone.

– jkuczm – 2016-03-08T13:54:48.770

1

Please don't continue discussions here; there are already two chats for this topic: the jkuczm-and-mehrdad chat and the newest one, on-answer-by-daniel-lichtblau.

– Kuba – 2019-08-21T06:22:23.230

Answers

9

Just wanted to post back 3 years later to dissect what's happening.

tl;dr: It's a broken (sorta-lexical, sorta-dynamic, sorta-syntactic) substitution. If you really want lexical scoping, Module may be closer to what you want, although it's not clear to me that even Module (or anything else) could possibly do it perfectly.

To see what's happening, here's the same code, echoing every step:

(Edit: Renamed the second b to c to avoid a duplicate variable as suggested in the comments.)

In[1]:= Module[{x}, 
 With[{a = 0},
  x = Echo[Hold[With[{b = 0, c = b}, c + 0 a]]]; 
  x = Echo[FlattenAt[x, {1, 1}]];
  x = Echo[Map[List, x, {2}]]; 
  x = Echo[FlattenAt[x, {1, -1}]];
  ReleaseHold[x]]]

>> Hold[With[{b$=0,c$=b},c$+0 0]]

>> Hold[With[b$=0,c$=b,c$+0 0]]

>> Hold[With[{b$=0},{c$=b},{c$+0 0}]]

>> Hold[With[{b$=0},{c$=b},c$+0 0]]

Out[1]= b

Here's what happens if we change 0 a to 0 b (or in fact, 0 followed by any other variable):

In[2]:= Module[{x}, 
 With[{a = 0},
  x = Echo[Hold[With[{b = 0, c = b}, c + 0 b]]]; 
  x = Echo[FlattenAt[x, {1, 1}]];
  x = Echo[Map[List, x, {2}]]; 
  x = Echo[FlattenAt[x, {1, -1}]];
  ReleaseHold[x]]]

>> Hold[With[{b=0,c=b},0 b+c]]

>> Hold[With[b=0,c=b,0 b+c]]

>> Hold[With[{b=0},{c=b},{0 b+c}]]

>> Hold[With[{b=0},{c=b},0 b+c]]

Out[2]= 0

What seems to be happening here is that Mathematica only attempts to figure out the scope of a variable once it is forced to actually manipulate the expression containing it semantically. In this case, that means the semantic substitution of a=0 in the inner expression is triggering variable renaming, which is how Mathematica resolves naming conflicts. If that doesn't ever happen, then the engine doesn't bother to resolve name conflicts at all.
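A smaller pair of inputs appears to show the same trigger. This is only a sketch: z is a hypothetical free symbol assumed to have no value, and the behavior described in the comments is inferred from the echoed output above rather than from any documented rule.

```mathematica
(* a occurs in the body of the inner With, so substituting a = 0
   rewrites that body, and the inner variable b gets renamed (to b$). *)
With[{a = 0}, Hold[With[{b = 1}, b + a]]]

(* a does not occur in the body (z is a hypothetical free symbol),
   so nothing is substituted and b keeps its name. *)
With[{a = 0}, Hold[With[{b = 1}, b + z]]]
```

The only difference between the two inputs is whether the outer With has anything to substitute, which is what seems to decide whether renaming takes place.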

So after the ReleaseHold, what we're left with are these two expressions to evaluate:

With[{b$ = 0}, {c$ = b}, c$]
With[{b  = 0}, {c  = b}, c ]

This is actually a "sequential With" that is highlighted poorly, but equivalent to the following:

With[{b$ = 0}, With[{c$ = b}, c$]]
With[{b  = 0}, With[{c  = b}, c ]]

In the first case, the outer b$ = 0 has no effect on the inner expression: the renaming turned the outer variable into b$, but the b on the right-hand side of c$ = b was left untouched, so it no longer refers to that variable. In the second case no renaming happened, so the value 0 of b is substituted into c = b. Thus in the first case we're left with the unbound symbol b, whereas in the second case it simplifies down to 0.

The thing is, this is wildly broken behavior, because lexical scope is a static property, whereas Mathematica triggers it dynamically. Note that whether Mathematica decides to resolve naming conflicts via renaming or via tracking them internally (like most other languages) isn't relevant to this—they're just implementation details. The important thing is that expressions maintain their semantics, and that requires that a variable like c refers to the same thing regardless of whether it is later added to 0 a or 0 b.

Then why does Mathematica do this?

Because the language specification is self-contradictory. Specifically (not sure if pun intended):

  • Specification #1:

    With is a scoping construct that implements read-only lexical variables.
    With replaces symbols in expr only when they do not occur as local variables inside scoping constructs.

  • Specification #2:

    With[{x=x0,y=y0,…},expr] specifies that all occurrences of the symbols x, y, … in expr should be replaced by x0, y0, ….

At first glance these seem fine, but in fact these two requirements point to a deep inconsistency in how Mathematica treats expressions, stemming from a lack of a clear separation between static and dynamic properties of code.

Namely, the first specification is a semantic requirement, and thus requires static semantic analysis of expr. For Mathematica to implement "lexical scope" and understand that it should not replace a nested local variable, it needs to be able to identify each With, Module, etc. construct in the first place. However, this is impossible inside constructs like Hold, because Mathematica is so dynamic that it allows you to construct expressions at runtime. For example, compare the following two statements:

    ReleaseHold[With[{x = 0}, Hold[With [{x = 1}, x]]]] (* OK?!?! *) 
    With2 = With;
    ReleaseHold[With[{x = 0}, Hold[With2[{x = 1}, x]]]] (* error! *)  

You'd expect them to be semantically equivalent, except that it is impossible for Mathematica to know whether a given expression is a With construct without actually evaluating it. (A With might be something else initially, like Symbol["With"], or, conversely, the user might have wanted to later replace a With with something else.)

The second specification acknowledges as much. It admits that in fact the language is performing a symbolic (syntactic) substitution, not a semantic one. And that implies, by its definition, that With does not implement lexical scoping.

Instead, what With really does is that it fakes a semantic pre-analysis: it pretends that the language isn't dynamic at all, assumes that any With expression will look like With[...] (rather than, say, With2[...] above), and then performs the substitution hoping that you don't notice any discrepancy.
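The difference between the two cases can be inspected without triggering an error by keeping everything held. In this sketch, f is a hypothetical undefined symbol standing in for With2; it is not part of the original example.

```mathematica
(* The head is literally With: the inner x is recognized as a local
   variable and is not replaced by 0 (it may be renamed, though). *)
With[{x = 0}, Hold[With[{x = 1}, x]]]

(* The head is the hypothetical symbol f: no scoping construct is
   recognized, so every x is replaced, giving Hold[f[{0 = 1}, 0]]. *)
With[{x = 0}, Hold[f[{x = 1}, x]]]
```

If f later turns out to mean With, as With2 does above, the substituted form {0 = 1} is no longer a valid variable list, which is consistent with the error in the With2 case.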

Admittedly, in practice, it works fine, because people rarely write expressions whose meanings are extremely dynamic, and thus preserve the static meanings of symbols—otherwise, it becomes harder for people to read such code, too. But it's a fundamentally conflicting language requirement to attempt to statically substitute inside what can be a completely dynamic expression, and there isn't really a true fix for it. The best they can do, therefore, is to fix the documentation and explain precisely what really happens (a half-baked mishmash of symbolic/static/dynamic substitution, with a trigger of variable renaming that can suddenly change the semantics of the inner expression unexpectedly!), as opposed to what they're trying to fake (lexical scoping).

user541686


2

I fail to see how this is anything other than undefined behavior. If there is a bug, it is that With did not give an error when a variable appeared twice in the lhs of the assignment list. – Daniel Lichtblau – 2019-08-18T13:38:12.223

1

This variant might be a better example: in the inner With, change the second variable. With[{b = 0, c = b}, c + 0 b] vs. With[{b = 0, c = b}, c + 0 a] – Daniel Lichtblau – 2019-08-18T13:44:29.727

@DanielLichtblau: Thanks, I changed it to clarify. But I disagree that it's undefined behavior. There's a Hold outside which prevents its evaluation, and undefined behavior only occurs when a behavior is demanded of the expression -- that is, during evaluation. That makes it equivalent to Hold[With[{a=1,a=2},a]], which doesn't (and shouldn't) give an error. – user541686 – 2019-08-18T16:06:28.153

@Daniel: actually.... this is giving me second thoughts. Maybe that should also be an error? It's weird because Mathematica is both trying to analyze the expression and also avoid doing so. It might be outright impossible to give it reasonable semantics at all -- I'll write more when I have a chance (maybe in another 3 years? haha). – user541686 – 2019-08-18T16:32:28.400

With the change to use c instead of b it could well be a scoping bug, I'm not sure. Possibly the lexical scoping emulation should always rewrite. The business of using b both as lhs and rhs of With assignments makes it a bit hard for me to determine what to expect for when rewriting will or will not happen. – Daniel Lichtblau – 2019-08-19T02:47:00.400

@DanielLichtblau: I think I fleshed it out -- see my update. It seems to me that it's not really so much a bug as it is an attempt to satisfy an impossible requirement. – user541686 – 2019-08-19T02:54:37.333

I would replace the last three transformation steps with x = Echo[x /. HoldPattern@With[{first_, second_}, expr_] :> With[{first}, With[{second}, expr]]];. It's a lot clearer (at least in my opinion), shorter, does not rely on undocumented behavior, and still shows the same behavior – Lukas Lang – 2019-08-19T08:02:51.390

Also, I don't really see an issue with what's happening: Mathematica knows only one type of variable, global ones. All the different scoping constructs do is perform static analysis on their bodies, renaming and assigning variables as needed. This means that nested scoping constructs are also only recognized statically. If you now deliberately change the semantics of nested scoping constructs after the outer one is evaluated, it's clear that you're breaking things. Admittedly, it is a bit strange that renaming is triggered in only one of the two cases, but for well-behaved cases you don't care. – Lukas Lang – 2019-08-19T08:11:29.873

@LukasLang: You genuinely "don't really see an issue" with the binding of c suddenly changing merely because 0a was added to it in lieu of 0b? Props to you I guess, because to me (and, I dare say, to most people) it looks like completely nonsensical behavior. – user541686 – 2019-08-19T08:19:25.340

@Mehrdad As I said, the trigger for renaming seems a bit unpredictable. What I meant is that you're only seeing these "nonsensical" effects because you're deliberately tricking the outer With by changing the semantics of its body after the fact. If you accept that Mathematica is an expression rewriting language with only global symbols, then I think the current implementation is the best thing you can do to implement scoping. The only thing you could do is make the weirdness more predictable by renaming only as needed, but that wouldn't remove the true issue with your examples, I think. – Lukas Lang – 2019-08-19T08:26:04.573

@LukasLang: The problem with "accepting that Mathematica only uses global symbols" is that it's both counterintuitive AND in direct contradiction with their own documentation. And it has absolutely nothing to do with me "tricking the outer With" or anything like that; that's utter nonsense. You don't even need With at all to see this problem. It can manifest itself a million other ways; all I did was just show you 1 example. Here's another example. The following code is nondeterministic (!): f = Function[{}, {x$1, x$2, ..., x$2000}]; Module[{x=1}, ContainsAny[f[] - x, {0}]] – user541686 – 2019-08-19T09:39:48.073

@LukasLang: (cont'd...) This is despite the documentation claiming that Module creates a local variable. Clearly it doesn't, but in what world should a user already know the documentation is lying??? "A bit unpredictable" is quite a disingenuous way to portray what's really active lying and complete nonsense in the language... – user541686 – 2019-08-19T09:48:17.533

Let us continue this discussion in chat.

– Lukas Lang – 2019-08-19T10:25:51.030

7

[Too long for a comment but not a full response.]

I am starting to see this differently. Sequential With assignment lists (which I realize are not yet documented) can behave differently from the flat counterpart, when a symbol appears both as an lhs and rhs. In particular these will behave differently.

With[{b = 0}, {c = b}, c + 0 b] // Trace // InputForm

(* Out[2713]//InputForm=
{HoldForm[With[{b = 0}, {c = b}, c + 0*b]], HoldForm[With[{c$ = 0}, c$ + 0*0]], 
 HoldForm[0 + 0*0], {HoldForm[0*0], HoldForm[0]}, HoldForm[0 + 0], HoldForm[0]} *)

I think there is no controversy here.

With[{b = 0, c = b}, c + 0 b] // Trace // InputForm

(* Out[2714]//InputForm=
{HoldForm[With[{b = 0, c = b}, c + 0*b]], HoldForm[b + 0*0], 
 {HoldForm[0*0], HoldForm[0]}, HoldForm[b + 0], HoldForm[b]} *)

This much is as designed: With assignments only apply to later arguments so the b in that c=b assignment does not get lexically replaced.
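The contrast between the flat list and a nested (sequential) form can be sketched with a body that simply returns both variables. This assumes b has no global value; the list {b, c} is only an illustrative body, not taken from the examples above.

```mathematica
(* Flat list: the b on the right-hand side of c = b is not affected
   by b = 0, so this gives {0, b}. *)
With[{b = 0, c = b}, {b, c}]

(* Nested form: the outer b = 0 is substituted into the inner c = b
   before the inner With is evaluated, so this gives {0, 0}. *)
With[{b = 0}, With[{c = b}, {b, c}]]
```

Splitting one assignment list into two therefore changes the result whenever a symbol appears both as an lhs and as an rhs.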

Using Hold as in the original examples delays some of the evaluation, in a way that I think is also not controversial. Since the original With argument list has been split at that point, I am inclined to see the eventual behavior as also being according to design.

Now for the renaming, I am again not sure anything is amiss. Here are two examples to show what I have in mind. In the first we see the inner With variables get renamed.

Trace[With[{a = 0}, x = With[{b = 0, c = b}, c + 0 a]]] // InputForm

(* Out[2723]//InputForm=
{HoldForm[With[{a = 0}, x = With[{b = 0, c = b}, c + 0*a]]], 
 HoldForm[x = With[{b$ = 0, c$ = b}, c$ + 0*0]], 
 {HoldForm[With[{b$ = 0, c$ = b}, c$ + 0*0]], HoldForm[b + 0*0], 
  {HoldForm[0*0], HoldForm[0]}, HoldForm[b + 0], HoldForm[b]}, HoldForm[x = b], 
 HoldForm[b]} *)

In this next case we do not rename With variables.

Trace[With[{a = 0}, x = With[{b = 0, c = b}, c + 0 b]]] // InputForm

(* Out[2724]//InputForm=
{HoldForm[With[{a = 0}, x = With[{b = 0, c = b}, c + 0*b]]], 
 HoldForm[x = With[{b = 0, c = b}, c + 0*b]], 
 {HoldForm[With[{b = 0, c = b}, c + 0*b]], HoldForm[b + 0*0], 
  {HoldForm[0*0], HoldForm[0]}, HoldForm[b + 0], HoldForm[b]}, HoldForm[x = b], 
 HoldForm[b]} *)

The salient difference between these and the held examples is that the held ones gave rise to a split in the With assignment lists, and that does allow for different semantics than the unsplit case. It happened in such a way that it was affected by whether or not renaming took place.

As I mentioned in a comment, I suppose one way to address this would be to force variable renaming to always happen. Not sure how popular such a change would be though.

Daniel Lichtblau


Comments are not for extended discussion; this conversation has been moved to chat.

– Kuba – 2019-08-21T06:20:33.303