Why should I avoid the For loop in Mathematica?

97

58

Some people advise against the use of For loops in Mathematica. Why? Should I heed this advice? What is wrong with For? What should I use instead?

Szabolcs

Posted 2017-01-02T14:40:36.080

Reputation: 213 047

3This comes up so often, I wanted to have something to link to. As always on StackExchange, other answers are welcome. – Szabolcs – 2017-01-02T14:40:48.943

3As, still, someone not that fluent in Mathematica, and a longtime programmer (mostly C), this was the most useful question/answer I've read - and I have read a bunch of these! Thanks to both the questioner and the responder. – Jack Adams – 2017-01-03T21:45:13.300

3

Related (my own self Q&A): Alternatives to procedural loops and iterating over lists in Mathematica: "Explicit loops are often counterproductive in Mathematica, not only taking more keystrokes, but also more execution time. They are also, in my opinion, more prone to mistakes."

– Mr.Wizard – 2017-01-05T09:30:44.350

Answers

106

If you are new to Mathematica, and were directed to this post, first see if you can use Table to solve your problem.


I have often told people, especially beginners, to avoid using For in favour of Do. The following is my personal opinion on why using For is harmful when learning Mathematica. If you are a seasoned Mathematica user, you won't find much to learn here. My biggest argument against For is that it hinders learning by encouraging error-prone, hard-to-read, and slow code.

For mimics the syntax of the for loop of C-like languages. Many beginners coming from such languages will look for a "for loop" when they start using Mathematica. Unfortunately, For gives them lots of ways to shoot themselves in the foot, while providing virtually no benefits over alternatives such as Do. Settling on For also tends to delay beginners in discovering more Mathematica-like programming paradigms, such as list-based and functional programming (Table, Map, etc.).

I want to make it clear at the beginning that the following arguments are not about functional vs procedural programming. Functional programming is usually the better choice in Mathematica, but procedural programming is also clearly needed in many situations. I will simply argue that when we do need a procedural loop, For is nearly always the worst choice. Use Do or While instead.

Use Do instead of For

The typical use case of For is iterating over an integer range. Do will do the same thing better.

  • Do is more concise, thus both more readable and easier to write without mistakes. Compare the following:

    For[i=1, i <= n, i++, 
      f[i]; g[i]; h[i]
    ]
    
    Do[ f[i]; g[i]; h[i], {i, n} ]
    

    In For we need to use both commas (,) and semicolons (;) in a way that is almost, but not quite, the opposite of how they are used in C-like languages. This alone is a big source of beginner confusion and mistakes (possibly due to muscle memory). , and ; are visually similar so it is hard to spot the mistake.

  • For does not localize the iterator i. A safe For needs explicit localization:

    Module[{i},
      For[i=1, i <= n, i++, ...]
    ]
    

    A common mistake is to overwrite the value of a global i, possibly defined in an earlier input cell. At other times i is used as a symbolic variable elsewhere, and For will inconveniently assign a value to it.

    In Do, i is a local variable, so we do not need to worry about these things.

  • C-like languages typically use 0-based indexing. Mathematica uses 1-based indexing. for-loops are typically written to loop through 0..n-1 instead of 1..n, which is usually the more convenient range in Mathematica. Notice the differences between

    For[i=0, i < n, i++, ...]
    

    and

    For[i=1, i <= n, i++, ...]
    

    We must pay attention not only to the starting value of i, but also < vs <= in the second argument of For. Getting this wrong is a common mistake, and again it is hard to spot visually.

  • In C-like languages the for loop is often used to loop through the elements of an array. The literal translation to Mathematica looks like

    For[i=1, i <= n, i++,
      doSomething[array[[i]]]
    ]
    

    Do makes this much simpler and clearer:

    Do[doSomething[elem], {elem, array}]
    
  • Do makes it easy to use multiple iterators:

    Do[..., {i, n}, {j, m}]
    

    The same requires a nested For loop which doubles the readability problems.
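
The localization, element-iteration, and multiple-iterator points above can be checked directly. The following is a minimal sketch; the symbols i, j, elem and the sample list data are chosen only for illustration and are not part of the original examples.

ClearAll[i, j, elem];

(* For assigns to the global symbol i and leaves its final value behind *)
For[i = 1, i <= 3, i++, Null];
i   (* 4 *)

(* Do scopes its iterator, so the global i is untouched *)
ClearAll[i];
Do[Null, {i, 3}];
i   (* i, still an unassigned symbol *)

(* Iterating over list elements directly, without indexing *)
data = {2, 3, 5};   (* hypothetical sample list *)
Do[Print[elem^2], {elem, data}]   (* prints 4, 9, 25 *)

(* Two iterators in one Do; the first iterator is the outer loop *)
Do[Print[{i, j}], {i, 2}, {j, 2}]   (* prints {1, 1}, {1, 2}, {2, 1}, {2, 2} *)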

Transitioning to more Mathematica-like paradigms

A common beginner-written program that we see here on StackExchange collects values in a loop like this:

list = {};
For[i=1, i <= n, ++i,
  list = Append[list, i^2]
]

This is of course not only complicated, but also slow ($O(n^2)$ complexity instead of $O(n)$). The better way is to use Table:

Table[i^2, {i, n}]

Table and Do have analogous syntaxes and their documentation pages reference each other. Starting out with Do makes the transition to Table natural. Moving from Table to Map and other typical functional or vectorized (Range[n]^2) constructs is then only a small step. Settling on For as "the standard looping construct" leaves beginners stuck with bad habits.
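
As a small illustration of that progression, the three expressions below produce the same list of squares. The size n = 5 is an arbitrary choice for this sketch.

n = 5;   (* arbitrary illustrative size *)

Table[i^2, {i, n}]      (* {1, 4, 9, 16, 25} *)
Map[#^2 &, Range[n]]    (* {1, 4, 9, 16, 25} *)
Range[n]^2              (* {1, 4, 9, 16, 25} *)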

Another very common question on StackExchange is how to parallelize a For loop. There is no parallel for in Mathematica, but there is a ParallelDo and more importantly a ParallelTable. The answer is almost always: design the computation so that separate steps of the iteration do not access the same variable. In other words: just use Table.
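
A minimal sketch of that design, assuming a hypothetical sum-of-squares computation: rather than having every iteration update one shared accumulator, each step returns its own result and the results are combined afterwards.

(* Procedural version: every iteration modifies the same variable *)
total = 0;
Do[total += i^2, {i, 100}];
total   (* 338350 *)

(* Independent steps: each element is computed on its own, then combined *)
Total[Table[i^2, {i, 100}]]           (* 338350 *)

(* Because the steps are independent, Table can be swapped for ParallelTable *)
Total[ParallelTable[i^2, {i, 100}]]   (* 338350 *)

For a computation this small the parallel version is unlikely to be faster, since distributing the work has its own overhead. The point is only that the Table form needs no restructuring to be parallelized.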

More general versions of For

For is of course in some ways more flexible than Do. It can express a broader range of iteration schemes. If you need something like this, I suggest just using While instead.

When we see for, we usually expect either a simple iteration through an integer range or through an array. Doing something else, such as modifying the value of the iterator in the loop body, is unexpected and therefore confusing. Using While signals that anything can happen in the loop body, so the readers of the code will watch out for such things.
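
For instance, a loop whose counter does not advance in fixed steps reads naturally as a While. The sketch below uses a hypothetical doubling counter k, localized with Module; it is an illustration, not code from the original post.

Module[{k = 1},
  (* k is changed by the loop body itself, not by a fixed increment *)
  While[k < 1000, k *= 2];
  k
]
(* 1024 *)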

When is For appropriate?

There are some cases when For is useful. The main example is translating code from other languages. It is convenient to be able to translate analogous for loops, and not have to think about what may be broken by immediately translating to a Do or a Table (e.g. does the loop modify the iterator in the body?). Once the translated code works fine, it can be rewritten gradually.

There are existing questions on this, which also discuss other cases:

Summary

The problem with For is that it hinders learning and makes it very easy for beginners to introduce mistakes into their code.

If you are new to Mathematica, my advice is to forget that For exists, at least for a while. You can always accomplish the very same things with Do and While, so use them instead. Very often you will be able to replace Do with a Table or even a vectorized expression. This will help you learn to write effective Mathematica code faster.

If you are unsure about a use of For, then ask yourself: do I see a reason why For is clearly better here than Do or While? If not, don't use it. If yes, you may have found one of the rare good use cases.

Szabolcs

Posted 2017-01-02T14:40:36.080

Reputation: 213 047

1Range[10]^2 is superior to Table[i^2, {i, 10}] – Feyre – 2017-01-02T14:58:43.730

5@Feyre It is, in several ways. And the jump from Table to vectorized computation is quite straightforward—another reason to use Do, which then leads to Table, which then leads to the paradigm you show. But there is only so much space in one post. I wrote this up so I can link to it when people ask why they are told to avoid For. – Szabolcs – 2017-01-02T15:05:18.217

@Feyre I added your example. – Szabolcs – 2017-01-02T15:08:08.643

@Szabolcs, what a great answer, lucid and clean+1 – bobbym – 2017-01-02T15:52:14.627

+1 Great post, thanks for taking the time to put it together. I've added a link to this in the "Avoiding procedural loops" answer in the giant pitfalls question. – Simon Woods – 2017-01-02T16:12:52.217

2

+1, but one nit. I've never understood how Table is considered more in line with a "functional" paradigm than For. There are obvious utilities with using Table/Do over For, but other languages have used similar forms for years, usually expressed as ForEach, and C++ has gotten in on the act via the ranged for.

– rcollyer – 2017-01-02T16:37:03.100

2Avoid For, Do is better. Avoid Do, Table is better. Avoid Table, Range^2 is better. When is it good enough? – DepressedDaniel – 2017-01-02T18:45:54.617

3@DepressedDaniel when our computations become fast enough to solve the most basic problems in Ramsey Theory ;) let's start with $R(5,5)$ and work from there... – Brevan Ellefsen – 2017-01-02T19:05:33.137

2@rcollyer Table is effectively a map function, rather than a foreach, since it collects results - and of course map is the quintessential example of the functional paradigm. – 00dani – 2017-01-03T01:50:45.963

@00Dani Yes, Table can be implemented in terms of Map, but that is likely slower than simply constructing the list on the fly for most cases, e.g. Table[f[i], {i, 10}] would be Map[Block[{i = #}, f[i]]&, Range[10]] a two step operation. (Yes, there might be optimizations in there.) And, why is collecting the result automatically functional? Functional implies we can pass around functions as top-level objects which Array is more akin to. Table fits reasonably in functional and procedural programming. – rcollyer – 2017-01-03T04:02:39.357

@rcollyer You are right that Table may not be technically a functional construct, but it does have many of the same virtues and it fits better into the language. Typically each element of the table is computed independently, and without side effects. Typically, each evaluation done by Do does have a side effect (otherwise it would be pointless). This is what makes it possible to auto-compile and even auto-Parallelize Table relatively easily. – Szabolcs – 2017-01-03T11:27:46.307

4"For does not localize the iterator i" -- that alone for me is a sufficient argument to avoid For. All the rest are good points, but mostly about style / readability whereas this one actually breaks the functional pattern (I strive to have as few assignments as possible in general) and may lead to hard-to-find bugs (as opposed to simple typo's that Mathematica will tell you about). – CompuChip – 2017-01-03T13:17:16.263

@Szabolcs as I said, it's a nit, and not a large one. – rcollyer – 2017-01-03T14:37:22.270

I agree the principal reason one should default to functional programming in MMA is b/c that leverages the essence of its design. But I wonder if another reason procedural programming is problematic is that MMA doesn't offer some basic procedural functionality one finds in procedural languages. E.g., isn't the given example, list = {}; For[i=1, i <= n, ++i, list = Append[list, i^2]], O~n^2 because it's not an analog to procedural programming? Wouldn't a fairer MMA pseudocode analog to C, but one not avail. in MMA, be For[i = 1, i <= 10, ++i, list[i] = i^2], which would be O~n? – theorist – 2017-01-04T03:42:53.670

(continued). Yes, there is For[i = 1, i <= 10, ++i, list[[i]] = i^2], but that will only work if the list has already been initialized, so you can't use that code to initialize it. Another key issue that it's helpful to make explicit to beginners, to encourage them to avoid a procedural approach (except when necessary), is that MMA's procedural syntax isn't merely different from that of procedural languages (e.g., the commas vs. the semicolons), but in fact much more confusing to code (for nested loops of even moderate complexity) than that of procedural languages. – theorist – 2017-01-04T03:49:31.387

@theorist initialization in this case would be akin to allocating an array in another language, e.g. int list[5]; for c/c++, etc. This can be easily done by setting list = ConstantArray[0, imax] for the more complex operations. – rcollyer – 2017-01-05T14:31:10.060

@Szabolcs great post! Would it make sense to mention the Nest and Fold families in your "Transitioning..." section? The process of designing a function for Nest immediately addresses a lot of issues and allows arbitrary iteration techniques (even when parallelization is not possible). This recent question demonstrates how messy a simple For can be and how easily it can be simplified with NestList.

– BlacKow – 2017-01-05T14:38:29.423

@BlacKow My aim was really just to tell newcomers to avoid For, not to promote functional programming, or to show all possible alternatives. There are other posts for that, such as the one by Mr. Wizard. I think that For alone is causing too much harm, and if I can get people to at least look at the equivalent Do, then I am already happy that I started them down on a better path. The main reason why I wrote this up was so that I won't have to mention the same thing again and again in comments. I can just link here instead.

– Szabolcs – 2017-01-05T15:07:51.693

1I got an automatic flag on this post for excessive comments. It probably would be good to move the salient points into the answer. – Mr.Wizard – 2017-01-05T15:33:50.907

30

Illustration of the timings required to compute the squares i^2 from i=1 to i=10^n for n=1, 2, ..., 7 with the use of For, While, Do, Table, and Range.

for = Table[
  Module[{i},
   For[i = 1, i <= 10^n, i++, i^2] // AbsoluteTiming // First
   ]
  , {n, 1, 7}]

while = Table[
  Module[{i},
   i = 1; While[i <= 10^n, i^2; i++] // AbsoluteTiming // First
   ]
  , {n, 1, 7}]

do = Table[Do[i^2, {i, 10^n}] // AbsoluteTiming // First, {n, 1, 7}]

table = Table[
  Table[i^2, {i, 10^n}]; // AbsoluteTiming // First, {n, 1, 7}]

range = Table[Range[10^n]^2; // AbsoluteTiming // First, {n, 1, 7}]

(By the way, note how concise the Do, Table and Range versions are compared to the For and While versions.)

The timings for 10^7 squares (i.e., n=7):

Last /@ {for, while, do, table, range}

{7.32907, 8.23668, 2.44558, 0.132735, 0.036395}

And a plot (vertical axis in log-scale):

ListLogPlot[{for, while, do, table, range}, Frame -> True, 
 PlotRange -> All, Joined -> True, ImageSize -> 400, 
 FrameLabel -> {"n", "Log[AbsoluteTiming] (sec)"}, 
 PlotLabels -> {"For", "While", "Do", "Table", "Range"}]

(Plot not reproduced here: timings of For, While, Do, Table and Range against n, with a log-scale vertical axis.)

Do is about $3\times$ faster than For/While; for this particular application, one could (and should) employ Table/Range, which are two orders of magnitude faster than For.

corey979

Posted 2017-01-02T14:40:36.080

Reputation: 22 814

2Not sure that's a fair comparison, because: (1) Do, for example, throws away everything calculated (returning Null) whereas Table, for example, actually returns the list of squares; and (2) calculating i^2 is relatively trivial and misleadingly magnifies the effect of the looping construct used. – murray – 2017-01-02T16:51:09.613

@murray I've made it a community wiki; feel free to improve. – corey979 – 2017-01-02T16:53:20.737

1@murray In general, you are right: one of the things the benchmark shows is the performance of the looping construct itself, as i^2 is so fast. In more typical uses it is the speed of looping that is negligible compared to the loop body. But there's more here: Table is fast because of auto-compilation. Auto-compilation is possible because of the functional nature of Table: each element of the table is computed independently of the others and usually without side effects. – Szabolcs – 2017-01-03T11:32:03.700

1@murray In contrast, each evaluation done by Do will have side effects, otherwise it would be pointless. Usually, that side effect is changing the same global variable (e.g. accumulating a sum), which makes the evaluations non-independent. Thus Do does not auto-compile. Similarly, Table is immediately parallelizable while Do isn't. My original answer wasn't simply about performance. There's much more to it: readability, manageability, etc. But I think it is true that using For can significantly hinder performance. Of course this applies to Mathematica only, not other languages. – Szabolcs – 2017-01-03T11:32:08.213

2+1 for actual science, even if a microbenchmark-y one – None – 2017-01-03T14:40:04.537

For most beginners, the reason to avoid For is simpler: For requires you to set up the behavior of index variables in detail, specifying the start point, increment size and continuation test. Do and Table in their most paradigmatic forms favor a simple loop whose value begins at 1 and ends with an explicit final value. – Ralph Dratman – 2017-09-06T20:52:21.787

11

The functional paradigm, exemplified by this code:

Map[(#^2) &, Range[10^7]] // AbsoluteTiming

will usually result in the fastest execution because it takes advantage of the architecture of the machine. Both the CPU and the memory are optimized for sequential access, so when you pass a function over a list of data to transform that data, the code stays in one place, taking advantage of locality (no code-cache misses), and the data is accessed as one continuous stream of bytes. The above line of code takes 0.281 seconds to complete on my computer, while the line below ran for well over an hour and only produced a list 1,190,218 elements long:

out = {};
For[i = 1, i <= 10^7, ++i, AppendTo[out, i^2]] // AbsoluteTiming
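
The same comparison can be tried at a size that finishes quickly. This is only a sketch: the size 20000 is an arbitrary assumption, the loop variable is localized with Module, and timings depend on machine and version, so none are shown.

size = 20000;   (* arbitrary; small enough for the AppendTo loop to finish *)

(* Functional version; the trailing semicolon suppresses the large output *)
AbsoluteTiming[Map[#^2 &, Range[size]];]

(* Append-in-a-loop version; the list is rebuilt on each iteration *)
AbsoluteTiming[
  Module[{i, out = {}},
    For[i = 1, i <= size, ++i, AppendTo[out, i^2]];
  ]
]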

CElliott

Posted 2017-01-02T14:40:36.080

Reputation: 510

5The fastest way is in fact using array arithmetic (i.e. "vectorization"): Range[10^7]^2. Map needs to evaluate Mathematica code for each element, which is slower than handling the entire array with a low-level implementation. To be precise, Map will auto-compile #^2&, which gives a considerable speedup. But it is still not competitive with parallelized array arithmetic that makes use of SIMD instructions. – Szabolcs – 2017-04-05T16:59:08.873

2But generally you are correct. If this were not something that can be expressed with basic vector arithmetic, then Map would be a good way. It is only purely numerical operations on packed arrays that will benefit from vectorization. – Szabolcs – 2017-04-05T17:31:49.473

4

A problem that arose in a recent Q&A, which is solved by Outer (not yet mentioned here), was

  • How to iterate a function over the cartesian product of lists and store the values generated?

For the sake of illustration, take two lists:

a = {"a", "b", "c"};
b = {1, 2, 3, 4};

We want to apply a function f to each ordered pair of elements from a and b. A for-loop way would be as follows:

m = Length[a];
n = Length[b];
table = ConstantArray[0, {m, n}];
For[i = 1, i <= m, i++,
  For[j = 1, j <= n, j++,
   table[[i, j]] = f[a[[i]], b[[j]]]
   ]
  ];
table
(* see output below *)

The following does the same:

table = Outer[f, a, b]
(*
  {{f["a", 1], f["a", 2], f["a", 3], f["a", 4]},
   {f["b", 1], f["b", 2], f["b", 3], f["b", 4]},
   {f["c", 1], f["c", 2], f["c", 3], f["c", 4]}}
*)

For a product of three lists, use

Outer[f, a, b, c]

And so on.

The same can be done with Table:

table = Table[f[ai, bj], {ai, a}, {bj, b}] (* less efficient than Outer *)
table = Table[f[a[[i]], b[[j]]], {i, m}, {j, n}] (* much less efficient *)

Like Outer, Table can be extended to higher dimensional products.
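
As a quick check of the equivalence, the sketch below compares Outer with the element-wise Table form on the two lists from this answer. Here f is deliberately left undefined so the results stay symbolic, and ai, bj are just illustrative iterator names.

ClearAll[f];
a = {"a", "b", "c"};
b = {1, 2, 3, 4};

(* Both constructions give the same 3-by-4 nested list *)
Outer[f, a, b] === Table[f[ai, bj], {ai, a}, {bj, b}]   (* True *)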

Michael E2

Posted 2017-01-02T14:40:36.080

Reputation: 190 928