£199.99 £139.99 for one year of Unlimited learning. Offer ends on 14 November 2022 at 23:59 (UTC). T&Cs apply

Find out more
Avoiding function call overheads
Skip main navigation

Avoiding function call overheads

Function calls in Python are relatively expensive compared to some other languages. In this article we show how Cython can reduce these overheads.
© CC-BY-NC-SA 4.0 by CSC - IT Center for Science Ltd.

Function calls in Python are relatively expensive in comparison to for example C. Cython allows one to reduce this overhead significantly.

Main reason for the large function call overhead in Python is the “boxing” and “unboxing” of function call arguments and return values. Once again, dynamic typing makes programming more flexible but with a cost in performance.

One can investigate the function call overhead in Python with the following example. Let’s make a simple code that does not perform much else than calls a function in a loop:

def inner(i):
return i+1

def outer_1(n):
x = 0
for i in range(n):
x = inner(x)

We can create also a version which does not have the function call:

def outer_2(n):
x = 0
for i in range(n):
x = x + 1

If one measures the performance of the two outer functions, one observes
that the one with the function call in the loop is typically 3-4 times slower.

Using pure C functions

If a function is used only within a Cython module, one can get rid of a large
part of Python’s function call overhead by declaring the function as a pure C
function, once again using the cdef keyword:

cdef int add (int x, int y):
cdef int result
result = x + y
return result

When a function is defined as a pure C function, it can be called only from
the corresponding Cython module, but not from a Python code. If a function is
called both from Cython and Python, Cython can generate an additional Python
wrapper by declaring the function with cpdef instead of cdef:

cpdef int add (int x, int y):
cdef int result
result = x + y
return result

This adds some overhead, so if the function is not called from Python it is
better to use just cdef.

Mandelbrot kernel as pure C function

The kernel function is called only within the mandelbrot.pyx module,
so we can make it a pure C function:

cdef int kernel(zr, zi, cr, ci, lim, cutoff):
''' Computes the number of iterations `n` such that
|z_n| > `lim`, where `z_n = z_{n-1}**2 + c`.

count = 0
while ((zr*zr + zi*zi) < (lim*lim)) and count < cutoff:
zr, zi = zr * zr - zi * zi + cr, 2 * zr * zi + ci
count += 1
return count

When continuing with the performance investigation, we obtain the following results:

  • Pure Python: 0.57 s
  • Static type declarations in the kernel: 14 ms
  • Kernel as C-function: 9.6 ms

In this case the speed-up is not so drastic, but still a respectable 1.5.

© CC-BY-NC-SA 4.0 by CSC - IT Center for Science Ltd.
This article is from the free online

Python in High Performance Computing

Created by
FutureLearn - Learning For Life

Our purpose is to transform access to education.

We offer a diverse selection of courses from leading universities and cultural institutions from around the world. These are delivered one step at a time, and are accessible on mobile, tablet and desktop, so you can fit learning around your life.

We believe learning should be an enjoyable, social experience, so our courses offer the opportunity to discuss what you’re learning with others as you go, helping you make fresh discoveries and form new ideas.
You can unlock new opportunities with unlimited access to hundreds of online short courses for a year by subscribing to our Unlimited package. Build your knowledge with top universities and organisations.

Learn more about how FutureLearn is transforming access to education