Skip main navigation

Why are Python programmes slow?

Python is an interpreted language that is very flexible and dynamic, but the flexibility comes with a price
© CC-BY-NC-SA 4.0 by CSC - IT Center for Science Ltd.

Python is very flexible and dynamic language, but the flexibility comes with a price.

Computer programmes are nowadays practically always written in a high-level human readable programming language and then translated to the actual machine instructions that a processor understands.

There are two main approaches for this translation:

  1. For compiled programming languages, the translation is done by a compiler before the execution of the program
  2. For interpreted languages, the translation is done by an interpreter during the execution of the program

Compiled languages are typically more efficient, but the behaviour of the programme during runtime is more static than with interpreted languages. The compilation step can also be time consuming, so the software cannot always be tested as rapidly during development as with interpreted languages.

Python is an interpreted language, and many features that make development rapid with Python are a result of that, with the price of reduced performance in some cases.

What is dynamic typing?

Python is a very dynamic language. As variables get type only during the runtime as values (Python objects) are assigned to them, it is more difficult for the interpreter to optimise the execution (in comparison, a compiler can make extensive analysis and optimisation before the execution).

Even though, in recent years, there has been a lot of progress in just-in-time (JIT) compilation techniques that allow programmes to be optimised at runtime, the inherent, very dynamic nature of the Python programming language remains one of its main performance bottlenecks.

Are its data structures flexible?

The built-in data structures of Python, such as lists and dictionaries, are very flexible, but they are also very generic, which makes them not so well suited for extensive numerical computations.

Actually, the implementation of the data structures (e.g. in the standard CPython interpreter) is often quite efficient when one needs to process different types of data. However, when one is processing only a single type of data, for example only floating point numbers, there is a lot of unnecessary overhead due to the generic nature of these data structures.

What is parallel processing?

The performance of a single CPU core has stagnated over the last ten years, and as such most of the speed-up in modern CPUs is coming from using multiple CPU cores, i.e. parallel processing.

Parallel processing is normally based either on multiple threads or multiple processes. Unfortunately, the memory management of the standard CPython interpreter is not thread-safe, and it uses something called Global Interpreter Lock (GIL) to safeguard memory integrity.

In practice, this limits the benefits of multiple threads only to some special situations (e.g. I/O). Fortunately, parallel processing with multiple processes is relatively straightforward also with Python.

In summary, the flexibility and dynamic nature of Python, which enhances the programmer productivity greatly, is also the main cause for the performance problems. Flexibility comes with a price! Fortunately, many of the bottlenecks can be circumvented.

© CC-BY-NC-SA 4.0 by CSC - IT Center for Science Ltd.
This article is from the free online

Python in High Performance Computing

Created by
FutureLearn - Learning For Life

Reach your personal and professional goals

Unlock access to hundreds of expert online courses and degrees from top universities and educators to gain accredited qualifications and professional CV-building certificates.

Join over 18 million learners to launch, switch or build upon your career, all at your own pace, across a wide range of topic areas.

Start Learning now