Why doesn't x = x cause an infinite loop, but f[x_] := f[x] does?

24

9

If I execute:

In[1] := x = x
Out[1] = x

and then I evaluate the symbol x:

In[2] := x
Out[2] = x

it simply returns x itself. I don't understand why this doesn't result in an infinite loop. Given that x references itself after the assignment x = x, I think that evaluating x should result in an infinite loop (x is replaced by x, which is replaced by x, and so on). What am I missing?

Contrast this with what happens with the assignment:

f[x_] := f[x]

Evaluating f[x] after this assignment results in an infinite loop:

In[5]:= f[x]

During evaluation of In[5]:= $IterationLimit::itlim: Iteration limit of 4096 exceeded.

Out[5]= Hold[f[x]]

Edit: Using x := x instead of x = x does not cause an infinite loop. Using x = Identity[x] does not cause an infinite loop either. But using x := Identity[x] as suggested by Jacob Akkerboom in the comments results in an infinite loop. Why?

becko

Posted 2014-01-07T13:38:23.027

Reputation: 10 287

1

related to this answer: http://mathematica.stackexchange.com/a/39936/534

– becko – 2014-01-07T13:39:39.343

I guess Mathematica tries to be smart in some definitions with no patterns on the rhs and sometimes skips them after a couple of iterations in which the rhs didn't change upon evaluation. Let's see if someone digs in and shares a more exact mechanism as to when this happens and when it doesn't – Rojo – 2014-01-07T14:23:31.117

The general idea of my guess already would explain your examples. Identity[x] loops because it evaluates to something. f[x_]:=f[x] loops because its rhs has to be built every time, it has a pattern – Rojo – 2014-01-07T14:29:34.380

@Rojo More precisely, there's a loop (in these examples) whenever ValueQ returns True... I think... – becko – 2014-01-07T14:34:27.620

@Rojo, interestingly p:f[x_] := p does not loop – Simon Woods – 2014-01-07T14:50:05.070

@SimonWoods and now I add an exception for patterns that name the lhs as a whole, just to hide from the fact that my guess is clearly not right – Rojo – 2014-01-07T14:52:49.097

@SimonWoods In your example, ValueQ[f[x]] returns False, so my guess (based on @Rojo's guess) that there's a loop whenever ValueQ returns True still applies. – becko – 2014-01-07T15:03:17.227

The Standard Evaluation Procedure "continue[s] reevaluating results until it gets an expression which remains unchanged through the evaluation procedure." So x = x is applied only once (no change); x := Identity[x] leads to infinite recursion: Identity[Identity[…]] because the argument x is evaluated before Identity[x] and becomes Identity[x], in which x is evaluated before Identity[x], ad infinitum. (In other words, Jacob is basically right.) – Michael E2 – 2014-01-07T15:15:27.807

x := Identity[Unevaluated[x]]; x reaches iteration limit rather than recursion limit. That was really my point, thanks for reminding me :). – Jacob Akkerboom – 2014-01-07T15:28:52.637

@MichaelE2 how would that explain f[x_]:=f[x] giving an infinite iteration? – Rojo – 2014-01-07T15:34:01.533

I would say that the difference is because the semantics of function calls supports recursion, but the semantics of atom evaluation does not. But does this explain the issue or just kick the issue down the road? Depends on how deep an explanation is being asked for, doesn't it? – m_goldberg – 2014-01-07T15:55:23.000

@m_goldberg I would like an explanation that starts from the evaluation mechanism of Mathematica. There should be a general principle (I hope) that establishes when there is infinite recursion and when there isn't. – becko – 2014-01-07T16:12:39.857

@Rojo As you know f[x_] := f[x] has nothing to do with Global`x, so let's use y in f[y]. I think the difference is in how pattern replacements are handled. When f[y] is evaluated, M evaluates the head f first, finds the downvalue, applies it. Now, to apply it, M needs to evaluate the rhs, f of the pattern x. After it does the substitution, the result has not been evaluated yet. So even though the result is again f[y], f[y] has to be evaluated once more. Clearly you want this to happen for normal functions. Here you get infinite recursion. – Michael E2 – 2014-01-07T16:33:52.947

@Michael E2, I came to the same conclusion a while ago, but I couldn't explain that ff[y] //. ff[x_] :> ff[x] does not give an infinite recursion. How would you explain that? If you really think that calling a function is just replacement by Own/Down/Up/Sub-values, this is challenging. – Jacob Akkerboom – 2014-01-07T17:24:55.087

@JacobAkkerboom I think ReplaceRepeated examines the result of a replacement after evaluation is complete and continues until a fixed point is reached. After the first replacement, the expression is the same as the starting expression, so it stops. – Michael E2 – 2014-01-07T18:13:04.377

@Rojo I think that the answer of Michael covers this case as well (f[x_]:=f[x]), if we recall that such a definition is tail-recursive, as I was describing here (tail recursion has already been mentioned in this discussion by Jacob). In that discussion, I mentioned that tail-recursive functions in Mathematica are affected by $IterationLimit rather than $RecursionLimit, because they rewrite complete expressions and don't grow the expression stack.

– Leonid Shifrin – 2014-01-07T23:31:42.967

@LeonidShifrin. There's a simpler side to this question, the Identity example. But I understand the "hard" part of it: in which cases do you get an infinite loop and in which don't you, when the expression being replaced isn't changed at all? After replacement it is always reevaluated, and the evaluation sequence starts over. But it doesn't really start over, because, sometimes, after the second evaluation of the same unchanged thing, it knows not to try again, and sometimes it doesn't and iterates infinitely – Rojo – 2014-01-08T00:15:49.483

@Rojo Re: general part - yes, I agree. Re: ClearAll[f];f[x_]:=f[2];f[x_]:=f[3]; - actually, this one doesn't look mysterious to me, the second definition simply replaces the first, as it should. – Leonid Shifrin – 2014-01-08T09:55:49.030

@LeonidShifrin Terrible example, sorry. I'll be deleting that comment. – Rojo – 2014-01-08T13:15:32.300

@LeonidShifrin Say f[x_] := f[2]; _f := 10. The second definition ends up matching, and from the docs I would expect it should always either match the first one or stop iterating because the result isn't changing. As happens with //. – Rojo – 2014-01-08T13:18:52.567

@Rojo _f:=10 is so ugly :P (temporary message) – Jacob Akkerboom – 2014-01-09T10:27:59.820

@Rojo the first one does match. It produces f[2] and on the second (maybe third depending on how you count) iteration the second one matches and yields 10. Evaluator won't apply the same rule on consecutive iterations if it keeps on yielding the same result, otherwise it won't be able to terminate. So for f[1] we have evaluation sequence f[1] -> f[2] -> f[2] (but result is discarded and rule is removed from consideration as same rule produced same result on consecutive iterations) -> 10. – Oleksandr R. – 2014-01-10T03:30:42.403

@OleksandrR. I agree. That's why I think that means reality is not quite as explicitly documented. It does not continue evaluating until the result doesn't change. It sometimes changes how the evaluation is done so that the result is changed when it otherwise wouldn't. And it's not about iterating versus not iterating, but about deactivating one definition at a time – Rojo – 2014-01-10T04:55:02.783
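The two-rule example from the comments above can be tried directly. This is only a sketch of that experiment; the Trace call is an addition for inspection, and the remark about the result restates what the commenters report rather than documented behavior.

```mathematica
ClearAll[f];
f[x_] := f[2];   (* first rule: rewrites any one-argument call to f[2] *)
_f := 10;        (* second rule, written exactly as in the comments *)

f[1]             (* the comments report that this ends at 10 instead of looping *)
Trace[f[1]]      (* lists the evaluation steps the kernel reports *)
```

Comparing the trace with the sequence f[1] -> f[2] -> 10 described above is one way to see whether the first rule is dropped once it stops changing the expression.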

Answers

8

I am not sure there is a more factual answer to this kind of question than saying "it is what it is." Nevertheless it is more satisfying to have hypotheses for such things.

Daniel Lichtblau says in answer to my question: Mathematica execution-time bug: symbol names:

I can explain the optimization involved here in slightly more detail. First recall that Mathematica emulates "infinite evaluation", that is, expressions keep evaluating until they no longer change. This can be costly and hence requires careful short circuit optimizations to forestall it when possible.

A mechanism we use is a variant of hashing, that serves to indicate that symbols on which an expression might depend are unchanged and hence that expression is unchanged. It is here that collisions might occur, thus necessitating more work.

In a bad case, the Mathematica kernel might need to walk the entire expression in order to determine that it is unchanged. This walk can be as costly as reevaluation. An optimization, new to version 7 (noted above), is to record explicitly, for some types of expression, those symbols upon which it depends. Then the reevaluation check can be shortened by simply checking that none of these symbols has been changed since the last time the expression was evaluated.

The implementation details are a bit involved (and also a bit proprietary, though perhaps not so hard to guess). But that, in brief, is what is going on under the hood. Earlier versions sometimes did significant expression traversal just to discover that the expression needed no reevaluation. This can still happen, but it is a much more rare event now.

My hypothesis: one of these "short circuit optimizations" is to recognize certain definitions without side effects ... and halt the infinite evaluation. The specific optimizations are not spelled out, and in fact are "also a bit proprietary." We are therefore left to observe behavior once again.

We can see that it is not a matter of the difference between Own Values and Down Values, and further that the mere existence of a pattern on the LHS does not prevent the halting:

f[___] := f[]
f[]                 (* no infinite recursion *)

I am still exploring this behavior in an attempt to form a more complete theory.
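Since the behavior can only be observed, one hedged way to do so is to look at the stored rules and at the trace of an evaluation. In the sketch below the symbol g and the lowered limit are illustrative assumptions, not part of the experiments above; 20 is used because it is the smallest value $IterationLimit accepts.

```mathematica
ClearAll[x, g];

x = x;            (* Set evaluates the rhs first; x has no value yet, so the rhs stays x *)
OwnValues[x]      (* lists whatever rule Set stored for x *)
Trace[x]          (* lists the evaluation steps the kernel reports for x *)

g[y_] := g[y];    (* g is a hypothetical name; same shape as f[x_] := f[x] in the question *)
DownValues[g]     (* lists the stored rule for g *)

(* Lower the limit locally so the loop is cut off after a few steps *)
Block[{$IterationLimit = 20}, g[1]]
```

Block restores the default limit afterwards, so the experiment does not affect the rest of the session.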

Mr.Wizard

Posted 2014-01-07T13:38:23.027

Reputation: 259 163

How about this as a short circuit: Internal`ComparePatterns[Hold[f[]], Hold[f[___]]]. If the result is Identity, Equivalence or Specificity, and no other definition exists, assume that the rhs does not need further evaluation. Internal`ComparePatterns: http://mathematica.stackexchange.com/a/7904/89

– István Zachar – 2014-01-08T09:45:41.410

@István Interesting hypothesis. Do you intend to test it? – Mr.Wizard – 2014-01-08T09:47:03.960

Yes, I have some promising test results, but also have some issues with named patterns. For example, pattern-variable-replacement (local to the definition, as in Jacob's p:f1[x_] := p;) should happen before testing, otherwise result is Incomparable. – István Zachar – 2014-01-08T10:21:53.277

I'm not sure if I can really agree that these effects are just optimizations; rather it seems to be fundamental behavior of the evaluation process. I couldn't observe any difference between version 5.2 and version 9 for any of @JacobAkkerboom's long list of examples, for instance. – Oleksandr R. – 2014-01-09T20:56:36.977

@Oleksandr That's a good point. As you can see I have yet to update my answer because I don't have a better explanation. Nevertheless I presume the mechanisms are related in that infinite recursion must be prevented in either case. What is your hypothesis? – Mr.Wizard – 2014-01-10T01:47:11.137

I would expect that, as Mma walks the expression tree, it marks off which branches have been evaluated and avoids reevaluating those if visited again. Otherwise, it continues applying rules at each branch it visits until it notices that the branch is not changing; then it considers it evaluated. You can see some side effects of this with your example. Define several rules of the form p : f[___] /; (Print[1]; True) := p. f[] now gives printouts from each definition. But, if you define more than one rule like f[___] /; (Print[1]; True) := f[], a cycle is formed and printouts go on forever. – Oleksandr R. – 2014-01-10T02:52:21.943

Note x = y; y := x; x also iterates endlessly. Being more precise I should say that really it isn't so much that the expression doesn't change that causes it to be considered evaluated (as obviously injecting side-effects using Condition doesn't actually change the result, so it should stop sooner in this case), but that no more rules exist that could affect that branch of the tree. When a rule matches, the substitution is made and the evaluation cycle begins again, but with the rule matched in the previous iteration removed from consideration. Or something quite similar to that, anyway. – Oleksandr R. – 2014-01-10T03:09:07.077

Okay, so actually the rule is only removed from consideration if it was applied in the previous iteration and produced the same result on the present iteration as well. The evaluator in this case goes through the motions right up until it finds the result is identical to the one from the last iteration, then discards it and proceeds without the responsible rule to give another one a chance to match. – Oleksandr R. – 2014-01-10T03:37:05.990
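The Print experiment described in these comments can be written out as follows. This is a sketch under stated assumptions: the symbols h and k are hypothetical stand-ins for f, the second Print label is changed to 2 so the two rules can be told apart, and the lowered $IterationLimit is added only to cut the cycle short.

```mathematica
ClearAll[h, k];

(* Several rules that return the matched expression unchanged *)
p : h[___] /; (Print[1]; True) := p
p : h[___] /; (Print[2]; True) := p
h[]    (* per the comment, each definition prints and evaluation then stops *)

(* More than one rule whose rhs rebuilds the same expression *)
k[___] /; (Print[1]; True) := k[]
k[___] /; (Print[2]; True) := k[]
Block[{$IterationLimit = 20}, k[]]   (* per the comment, the printouts would otherwise go on forever *)
```

The order of the printed labels shows which rule is tried at each step, which is the observation the "removed from consideration" hypothesis rests on.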

5

Extended comment

First of all, I think there is no easy answer to this question.

Let me collect my examples in an answer, in order to provide some structure in them as well as not to flood the comments. Throughout the answer, the lines of text describing the code refer to the code below it.

You will find that the examples which reach $IterationLimit are much harder to understand. If there is nothing mysterious about x:={x} creating infinitely much work for the kernel, then there is also nothing mysterious about x:=Identity[x] causing infinitely much work.

Clear[f1, f2, f3, f4, f5, f6, f7, f8, f9, f10, f11, f12, x, x2, x3, x4, x5]

The following reaches $RecursionLimit

x2:=Identity@x2;x2

No infiniteness

x=x;x
y //. y :> y

Infinite iteration

x3:=Identity[Unevaluated[x3]]; x3

No infiniteness

y//.HoldPattern[y]:>Identity[Unevaluated[y]]
a[y] //. HoldPattern[y] :> Identity[Unevaluated[y]]

Infinite iterations

Hold[y] //. HoldPattern[y] :> Identity[Unevaluated[y]]
SetAttributes[b, HoldFirst]; SetAttributes[c,HoldRest]
b[y] //. HoldPattern[y] :> Identity[Unevaluated[y]]
b[y, 1] //. HoldPattern[y] :> Identity[Unevaluated[y]]
c[1, y] //. HoldPattern[y] :> Identity[Unevaluated[y]]

No infiniteness

b[1, y] //. HoldPattern[y] :> Identity[Unevaluated[y]]
c[y, 1] //. HoldPattern[y] :> Identity[Unevaluated[y]]

No infiniteness

p : f1[x_] := p; f1[1]
p : f2[x_] /; True := p; f2[1]
p : f3[x_] := p /; True; f3[1]

Infinite iterations

p : f4[x_] := Identity@Unevaluated@p; f4[1]

f5[g_] := g[g]; f5[f5]
f6[x_] := f6[x]; f6[1]
(p : f7)[x_] := p[x]; f7[1]

No infiniteness

f8[_] := f8[1]; f8[1]
f9[] := f9[]; f9[]
f10[___] := f10[]; f10[]

Infinite iteration

Combining the previous definitions, we do get an infinite iteration.

f11[] := f11[];
f11[___] := f11[];
f11[]

Set vs SetDelayed

II

f[x_] = f[x]

Shortcut in action

We can do

p : f12[_] /; (x5 = True) := p
f12[x5]

which outputs

f12[x5]

and does set x5. So we see that after the replacement the expression is not properly evaluated; a full re-evaluation would give f12[True].

Related, but not understandable: Unexpected behaviour of Unevaluated

Jacob Akkerboom

Posted 2014-01-07T13:38:23.027

Reputation: 11 718

@IstvánZachar well, not precisely that, but maybe something similar? I will try to give an example – Jacob Akkerboom – 2014-01-07T16:36:19.933

@MichaelE2 this is not entirely true, I also thought this for a long time, but rm-rf pointed out a difference. { x1 := {1}; x1[[1]] = 2; x1, x2 = {1}; x2[[1]] = 2; x2} – Jacob Akkerboom – 2014-01-07T18:22:58.957

@MichaelE2 I guess I meant x1 := Evaluate[{1}] – Jacob Akkerboom – 2014-01-07T18:45:53.913

@IstvánZachar Jacob popped my balloon a little, but (1;y) is CompoundExpression[1, y]. It leads to y being evaluated again just like y:=f[y], because the new expression is not the same as y. But for x:=x, evaluating x results in x, which is the same as x and doesn't lead to further evaluation. – Michael E2 – 2014-01-07T18:58:06.413

@MichaelE2 indeed. Perfectionism is a dangerous enemy :P – Jacob Akkerboom – 2014-01-07T19:00:32.830

0

When you define a function you are defining a rule. This is covered in Ch. 4.1 of Paul Wellin's book Programming with Mathematica. For example, when you define

f[x_] := x^3 

what this says is whenever f is given an argument, it should be replaced with that argument cubed. Here f[x_] is a pattern that basically matches anything. Whatever goes in gets cubed.

So when f[x_] := f[x] is defined, a rule is created that says: take anything and call f[anything] which says take anything and call f[anything] which says take anything and call f[anything] ... and so on. Maybe "call" is more appropriately "replace with" but the point should be clear.

On the other hand, x = x is just an assignment. This is covered in Ch. 2.2

brown.2179

Posted 2014-01-07T13:38:23.027

Reputation: 109

This does not explain why x:=x;x does not cause infinite recursion while y:=(1;y);y does. – István Zachar – 2014-01-07T16:04:39.000

@IstvánZachar the definition you give for y is not a tail recursion. I made the same mistake with x:=Identity@x. An iterating analogue of your code would be y := Function[Null, #2, HoldAll][1, y] – Jacob Akkerboom – 2014-01-07T16:07:20.377

@Jacob But is it important that the definition is tail-recursive or center-embedded? I mean: it must be the same lhs=?=rhs test used for both cases by Mathematica that allows for the simple non-recursive/non-iterative outcome. Nevertheless, this post does not answer the original question. – István Zachar – 2014-01-07T16:12:42.350

@JacobAkkerboom It makes no difference to me that the infinite loop is due to IterationLimit or to RecursionLimit. – becko – 2014-01-07T16:15:26.557

@becko ah well you will find that the examples which reach $IterationLimit are much harder to understand. If there is nothing mysterious about x:={x} creating infinitely much work for the kernel, then there is also nothing mysterious about x:=Identity[x] causing infinitely much work. – Jacob Akkerboom – 2014-01-07T16:27:41.643

1

@JacobAkkerboom I agree. I think that x:={x} and x:=Identity[x] are explained by @MichaelE2's comment above (referencing http://reference.wolfram.com/mathematica/tutorial/TheStandardEvaluationProcedure.html). Your point is that @MichaelE2's explanation works for all the examples that reach $RecursionLimit? If yes, you should add that to your answer (Extended comment)... It could be helpful to reach a complete answer.

– becko – 2014-01-07T16:32:45.770

I find this explanation misleading. Both Set and SetDelayed create rules, so "just an assignment" is not really explaining anything, and moreover, is misleading, since it hints at a difference between Set and SetDelayed which is not really there. There are indeed differences in how Set and SetDelayed work, ranging from well-known ones (evaluation) to more subtle ones (e.g. availability of Part assignment, or memory used by the created rule, etc), but they are not what you claim they are, since rules are created in both cases. – Leonid Shifrin – 2014-01-07T18:48:24.370

I agree @LeonidShifrin and IstvanZachar. This doesn't answer my question. – becko – 2014-01-07T20:04:26.940

From the docs, tutorial/ImmediateAndDelayedDefinitions, it states: "lhs=rhs" rhs is intended to be the "final value" of lhs (e.g. f[x_]=1-x^2) while "lhs:=rhs" rhs gives a "command" or "program" to be executed whenever you ask for the value of lhs (e.g. f[x_]:=Expand[1-x^2]). Clearly f[x_] := f[x] creates a circular reference but x = x says that the final value of x is x. I don't think there is anything mysterious going on here. – brown.2179 – 2014-01-07T22:04:32.103

@brown.2179 But according to your logic, x:=x (or x[1]:=x[1] or x:=Identity@x) should be circular (and infinite), shouldn't it? But it is not. – István Zachar – 2014-01-07T22:52:03.113

It looks like you've got the "half-truth". The actions of Set and SetDelayed are not as different as you seem to think, except the obvious fact that Set evaluates the r.h.s. before creating the rule, while SetDelayed does not. As I mentioned before, there are more subtle differences, but both are used to create rules, and in particular, both can be used to create functions. You can try e.g. f[x] = f[x], with the same result, and also f[x] := f[x], also with the same result. – Leonid Shifrin – 2014-01-07T22:53:02.070

@brown.2179 This seems to be one of those cases where the info from the books or docs may mean less than extensive personal experience with the language - you may notice that a number of comments in this discussion were made by people who have very extensive practical experience with and deep understanding of the language, and all of them found this issue non-trivial and worthy of discussion. – Leonid Shifrin – 2014-01-07T22:56:15.570

@Leonid You guys can make this as complicated as you'd like. But, at the end of the day, the behavior you're seeing is consistent with the intended rule-of-thumb functionality as documented. An interpreter as flexible as Mathematica's is likely to have a few contradictions here and there. This behavior is not some special deep insight. – brown.2179 – 2014-01-08T16:26:37.210

3

@brown.2179 "This behavior is not some special deep insight" - well, you never know. Such contradictions sometimes lead to new insights and / or techniques. For example, someone not familiar with the Trott-Strzebonski in-place evaluation technique would probably think of such behavior as a contradiction, while it is a highly useful technique. Experienced users get used to trusting their instincts rather than just following the docs, so we need to understand the strange stuff - thus all this discussion.

– Leonid Shifrin – 2014-01-08T17:51:49.627