Python's documentation suggests that the for statement is syntactic sugar that hides the complexity of iterators and iterables. If this is true, the following two functions should be equivalent:

def for_loop(seq):
    for i in seq:
        i

and

def while_loop(seq):
    iseq = iter(seq)
    _loop = True
    while _loop:
        try:
            i = next(iseq)
        except StopIteration:
            _loop = False
        else:
            i

Notice that I'm keeping the body of the loop as simple as possible in order to focus on the performance of the for statement, therefore I'm avoiding calling print (or similar functions).

Here are the results after measuring the performance of these functions in IPython:

In [43]: %timeit for_loop(range(1000))
22.9 µs ± 356 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [44]: %timeit while_loop(range(1000))
49.9 µs ± 825 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [45]: %timeit for_loop(range(100000))
2.63 ms ± 43.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [46]: %timeit while_loop(range(100000))
5.16 ms ± 69.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

The for statement is about twice as fast as the while loop (a somewhat smaller factor of 1.6 is observed when passing in a long list instead of a long range object). The ratio stays roughly constant across a range of values of len(seq). I also observed differences in the bytecode of these functions when I disassembled them using the dis module.
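A minimal sketch of that bytecode comparison, assuming CPython. The helper opnames is a hypothetical name introduced here, and the exact opcode names differ between CPython versions, so the printed sets are not shown:

```python
import dis


def for_loop(seq):
    for i in seq:
        i


def while_loop(seq):
    iseq = iter(seq)
    _loop = True
    while _loop:
        try:
            i = next(iseq)
        except StopIteration:
            _loop = False
        else:
            i


def opnames(func):
    """Return the set of opcode names appearing in func's bytecode (hypothetical helper)."""
    return {ins.opname for ins in dis.get_instructions(func)}


for_ops = opnames(for_loop)
while_ops = opnames(while_loop)

# Opcodes that appear in one function but not the other.
print("only in for_loop:  ", sorted(for_ops - while_ops))
print("only in while_loop:", sorted(while_ops - for_ops))
```

On CPython, for_loop should contain the dedicated FOR_ITER opcode, whereas while_loop has to look up and call next on every iteration and handle StopIteration itself.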

To conclude: Python's documentation suggests that a for statement is effectively a cover for something like while_loop. Can some Pythoneer explain the performance difference, and in particular its source (a CPython optimization, ...)?

  • What you have implemented in Python is implemented in C. Commented May 18, 2020 at 10:30
  • Python loops/iterators interpreted by CPython are very slow. If you want fast code, you should avoid them (e.g. by using packages like numpy or pandas), use AOT or JIT compilers (numba, PyPy, Cython, etc.), or implement the critical part of your algorithm in, for example, C or C++. Here, you are not measuring the performance of Python but that of the CPython interpreter. You could get completely different results with the PyPy JIT interpreter. Moreover, I think the Pythonic way to write code is not to care about such micro-optimizations and to write clean, simple and well-designed code. Commented May 18, 2020 at 10:39
  • Another similar example is that for i in itertools.count(): if i == n: break is faster than a conventional while loop: i = 0 while True: if i == n: break i += 1. Sorry for the bad formatting, but hopefully it is obvious. Commented May 18, 2020 at 10:46
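For readability, the two loops from the last comment can be laid out as follows. This is only a sketch: the function names count_for and count_while and the return statements are additions made here so the loops can be called and compared, and no timing claim is made beyond what the comment states:

```python
import itertools


def count_for(n):
    # for loop driven by an unbounded iterator, stopped with break
    for i in itertools.count():
        if i == n:
            break
    return i


def count_while(n):
    # conventional while loop with a manually incremented counter
    i = 0
    while True:
        if i == n:
            break
        i += 1
    return i
```

Both functions stop at the same value of i, so they can be passed to %timeit or timeit with the same n.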

1 Answer

A couple of notes:

  • The fact that for is, from a functional perspective, just syntactic sugar for while does not mean that it has the same implementation under the hood. Different implementations may have very different performance.
  • If you look at the CPython implementation, you'll notice that for and while do indeed have different implementations. See, for example, the functions compiler_while and compiler_for in https://github.com/python/cpython/blob/ee40e4b8563e6e1bc2bfb267da5ffc9a2293318d/Python/compile.c
  • I suspect that for has some sort of optimization for ranges. When I timed your two functions with np.random.rand(1000000) instead of range(1000000), the gap went down from 3x on my laptop (24 ms vs. 73 ms) to about 50% (101 ms vs. 155 ms).
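A comparison like the one in the last bullet can be rerun outside IPython with the standard timeit module. The sketch below is an assumption-laden variant: it uses a plain list instead of the NumPy array to avoid the dependency, compare is a hypothetical helper name, and the numbers will differ by machine and Python version:

```python
import timeit


def for_loop(seq):
    for i in seq:
        i


def while_loop(seq):
    iseq = iter(seq)
    _loop = True
    while _loop:
        try:
            i = next(iseq)
        except StopIteration:
            _loop = False
        else:
            i


def compare(seq, number=20):
    """Time both loops over seq; returns (for_seconds, while_seconds). Hypothetical helper."""
    t_for = timeit.timeit(lambda: for_loop(seq), number=number)
    t_while = timeit.timeit(lambda: while_loop(seq), number=number)
    return t_for, t_while


# Same length, different iterable types: a range object and a list.
for label, seq in [("range", range(100_000)), ("list", list(range(100_000)))]:
    t_for, t_while = compare(seq)
    print(f"{label}: for={t_for:.4f}s while={t_while:.4f}s ratio={t_while / t_for:.2f}")
```

If the ratio changes noticeably between the two iterable types, part of the gap comes from the cost of producing each item rather than from the loop construct itself.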