Performance difference between for and while loop in Python

Question

Python's documentation suggests that the for statement is actually syntactic sugar that hides the complexity of iterators and iterables. If this is true, the following two functions should be equivalent:

def for_loop(seq):
    for i in seq:
        i

and

def while_loop(seq):
    iseq = iter(seq)
    _loop = True
    while _loop:
        try:
            i = next(iseq)
        except StopIteration:
            _loop = False
        else:
            i

Notice that I'm keeping the body of the loop as simple as possible in order to focus on the performance of the for statement itself, which is why I'm avoiding calls to print (or similar functions).

Here are the results after measuring the performance of these functions in IPython:

In [43]: %timeit for_loop(range(1000))
22.9 µs ± 356 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [44]: %timeit while_loop(range(1000))
49.9 µs ± 825 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [45]: %timeit for_loop(range(100000))
2.63 ms ± 43.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [46]: %timeit while_loop(range(100000))
5.16 ms ± 69.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

The for statement is about twice as fast as the while loop (a somewhat smaller ratio of about 1.6 is observed when passing in a long list instead of a long range object). The ratio stays roughly constant over a range of values of len(seq). I also see differences in the bytecode of these functions when I disassemble them with the dis module.

To conclude: Python's documentation suggests that the for statement is effectively a wrapper around something like while_loop. Can a Pythoneer explain the performance difference, and in particular where it comes from (a CPython optimization, ...)?
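For readers who want to reproduce the bytecode comparison mentioned above, here is a minimal sketch using the dis module. The helper opnames is a hypothetical name introduced only for this illustration, and the exact instruction names printed depend on the CPython version; the one thing that should hold on CPython is that for_loop uses the dedicated FOR_ITER instruction while while_loop does not.

```python
import dis


def for_loop(seq):
    for i in seq:
        i


def while_loop(seq):
    iseq = iter(seq)
    _loop = True
    while _loop:
        try:
            i = next(iseq)
        except StopIteration:
            _loop = False
        else:
            i


def opnames(func):
    """Return the set of bytecode instruction names used by func (hypothetical helper)."""
    return {ins.opname for ins in dis.get_instructions(func)}


for_ops = opnames(for_loop)
while_ops = opnames(while_loop)

# Instructions that appear in only one of the two functions.
# The exact names vary between CPython versions.
print("only in for_loop:  ", sorted(for_ops - while_ops))
print("only in while_loop:", sorted(while_ops - for_ops))
```

Calling dis.dis(for_loop) and dis.dis(while_loop) prints the full listings if the set difference is not detailed enough.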

Python loops/iterators interpreted by CPython are very slow. If you want fast code, you should avoid them (e.g. by using packages like numpy, pandas, etc.), use AOT or JIT compilers (numba, PyPy, Cython, etc.), or implement the critical part of your algorithm in, for example, C or C++. Here, you are not measuring the performance of Python but that of the CPython interpreter. You could get completely different results with the PyPy JIT interpreter. Moreover, I think the pythonic way to write code is to not care about such micro-optimizations and to write clean, simple and well-designed code.
– Jérôme Richard, Commented May 18, 2020 at 10:39
Another similar example is that for i in itertools.count(): if i == n: break is faster than a conventional while loop: i = 0 while True: if i == n: break i += 1. Sorry for the bad formatting, but hopefully it is obvious.
– user6276743, Commented May 18, 2020 at 10:46
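Written out with normal formatting, the two loops from the comment above would look roughly like this. The function names count_with_for and count_with_while are made up for illustration, and both versions assume n is a non-negative integer (otherwise neither loop terminates).

```python
import itertools


def count_with_for(n):
    # The counter is advanced inside itertools.count, which CPython implements in C.
    for i in itertools.count():
        if i == n:
            break
    return i


def count_with_while(n):
    # The counter is advanced by Python bytecode on every iteration.
    i = 0
    while True:
        if i == n:
            break
        i += 1
    return i
```

Both functions return n; any speed difference between them can be checked with %timeit or the timeit module.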

Roy2012 · Accepted Answer · 2020-05-19 16:27:58Z


A couple of notes:

The fact that for is, from a functional perspective, just syntactic sugar for while does not mean that it has the same implementation under the hood. Different implementations may have very different performance.
If you look at the CPython implementation, you'll notice that for and while are indeed compiled differently. See, for example, the functions compiler_while and compiler_for in https://github.com/python/cpython/blob/ee40e4b8563e6e1bc2bfb267da5ffc9a2293318d/Python/compile.c
I suspect that for has some sort of optimization for ranges. When I timed your two functions with np.random.rand(1000000) instead of range(1000000), the gap went down from 3x on my laptop (24 ms vs. 73 ms) to about 50% (101 ms vs. 155 ms).
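The comparison in the last note can be repeated outside IPython with the standard timeit module. The sketch below is only an assumption about how one might set it up: best_time is a hypothetical helper, and a plain list stands in for the NumPy array so that the example has no third-party dependencies. Absolute numbers and ratios will differ between machines and CPython versions.

```python
import timeit


def for_loop(seq):
    for i in seq:
        i


def while_loop(seq):
    iseq = iter(seq)
    _loop = True
    while _loop:
        try:
            i = next(iseq)
        except StopIteration:
            _loop = False
        else:
            i


def best_time(func, seq, number=10, repeat=5):
    """Best observed time in seconds for one call of func(seq) (hypothetical helper)."""
    timer = timeit.Timer(lambda: func(seq))
    return min(timer.repeat(repeat=repeat, number=number)) / number


n = 100_000
inputs = {"range": range(n), "list": list(range(n))}

results = {}
for label, seq in inputs.items():
    t_for = best_time(for_loop, seq)
    t_while = best_time(while_loop, seq)
    results[label] = (t_for, t_while)
    print(f"{label:>5}: for {t_for * 1e3:.2f} ms, "
          f"while {t_while * 1e3:.2f} ms, ratio {t_while / t_for:.2f}")
```

Taking the minimum over several repeats is a common way to reduce noise from other processes; it does not explain the gap, it only makes the measurement easier to compare across input types.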
