Temporary arrays

In this article we show how NumPy uses temporary arrays to evaluate
expressions.

In complex expressions, NumPy stores intermediate values in temporary arrays.

This means that memory consumption can be higher than expected. Consider the following example:

import numpy
a = numpy.random.random((1024, 1024, 50))
b = numpy.random.random((1024, 1024, 50))

c = 2.0 * a - 4.5 * b

To evaluate the last line, NumPy creates two temporary arrays to store the
intermediate results (2.0 * a and 4.5 * b). Each array in this example holds
1024 x 1024 x 50 double-precision values, about 400 MiB, so with very large
arrays the temporaries can easily lead to unexpected out-of-memory errors.
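
To make the memory cost concrete, the following sketch estimates the footprint of the example without allocating anything large. It takes the count of two temporaries from the text at face value; the variable names are illustrative only.

```python
import math
import numpy

shape = (1024, 1024, 50)

# numpy.random.random returns float64 values (8 bytes each)
itemsize = numpy.dtype(numpy.float64).itemsize

array_bytes = math.prod(shape) * itemsize
array_mib = array_bytes / 2**20

# Assumption from the text: a, b, the result c and two temporaries
# are all alive at the same time
peak_estimate_mib = 5 * array_mib

print(f"one array:     {array_mib:.0f} MiB")
print(f"peak estimate: {peak_estimate_mib:.0f} MiB")
```

The estimate is an upper bound for this expression under that assumption; the actual peak depends on how NumPy schedules and reuses the temporaries.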

This content is taken from the Partnership for Advanced Computing in Europe (PRACE) online course Python in High Performance Computing.

Luckily, NumPy is smart enough to reuse temporary arrays when possible. Thus,
even if the sum has additional terms, only two temporary arrays are needed:

c = 2.0 * a - 4.5 * b + numpy.sin(a) + numpy.cos(b)

Now, if one adds some (unnecessary) parentheses, the situation changes and
three temporary arrays are needed:

c = 2.0 * a - 4.5 * b + (numpy.sin(a) + numpy.cos(b))

To alleviate this, we could either remove the unnecessary parentheses or
move the parenthesized group to be the first term in the addition, which
allows for better reuse of the temporary arrays:

c = (numpy.sin(a) + numpy.cos(b)) + 2.0 * a - 4.5 * b
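
The exact number of temporaries can differ between NumPy versions, so it may be worth measuring on your own system. The sketch below uses Python's standard tracemalloc module to compare the peak memory of the flat and the parenthesized expression. It assumes that NumPy's array allocations are visible to tracemalloc, which is the case for recent NumPy releases; peak_bytes is a hypothetical helper written for this illustration, and the arrays are kept small so that the example runs quickly.

```python
import tracemalloc
import numpy


def peak_bytes(expression, a, b):
    """Hypothetical helper: peak traced memory while evaluating expression(a, b)."""
    tracemalloc.start()
    result = expression(a, b)  # keep the result alive until the peak is read
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak


rng = numpy.random.default_rng(0)
a = rng.random((256, 256, 8))  # 4 MiB per array
b = rng.random((256, 256, 8))

flat = peak_bytes(
    lambda a, b: 2.0 * a - 4.5 * b + numpy.sin(a) + numpy.cos(b), a, b)
grouped = peak_bytes(
    lambda a, b: 2.0 * a - 4.5 * b + (numpy.sin(a) + numpy.cos(b)), a, b)

print(f"one array: {a.nbytes / 2**20:.1f} MiB")
print(f"flat:      {flat / 2**20:.1f} MiB peak")
print(f"grouped:   {grouped / 2**20:.1f} MiB peak")
```

Dividing each peak by the size of one array gives a rough count of how many full-size arrays, including the result, were alive at once.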

Sometimes it is hard to see how many temporary arrays are needed. If one
wants to conserve memory (when working with very large arrays), it is
usually a good idea to apply the operations to an existing array one by one
instead:

c = 2.0 * a
c -= 4.5 * b
c += numpy.sin(a)
c += numpy.cos(b)
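
Note that each right-hand side above (for example 4.5 * b) is itself still evaluated into a temporary before it is added to c. As a possible further step, not part of the original example, the sketch below routes every intermediate result through a single preallocated scratch array using the out argument of NumPy's universal functions. The name tmp and the small array shapes are choices made for this illustration.

```python
import numpy

rng = numpy.random.default_rng(0)
a = rng.random((64, 64, 5))
b = rng.random((64, 64, 5))

c = numpy.multiply(a, 2.0)   # allocates the result array once
tmp = numpy.empty_like(a)    # single scratch buffer, reused below

numpy.multiply(b, 4.5, out=tmp)  # tmp = 4.5 * b, no new allocation
c -= tmp
numpy.sin(a, out=tmp)            # tmp = sin(a)
c += tmp
numpy.cos(b, out=tmp)            # tmp = cos(b)
c += tmp

# Check against the straightforward one-line expression
reference = 2.0 * a - 4.5 * b + numpy.sin(a) + numpy.cos(b)
matches = numpy.allclose(c, reference)
print(matches)
```

With this pattern only two full-size arrays beyond the inputs exist at any time (c and tmp), at the cost of less readable code.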

Broadcasting and temporary arrays

Broadcasting can also lead to unexpected temporary arrays. For example,
consider calculating the pairwise distances between M points in three
dimensions.

The input data is an M x 3 array and the output is an M x M array containing
the distances between points i and j.

import numpy as np

X = np.random.random((1000, 3))
D = np.sqrt(((X[:, np.newaxis, :] - X) ** 2).sum(axis=-1))
#            ^^^^^^^^^^^^^^^^^^^^^^^^^
#            temporary 1000x1000x3 array
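
One way to avoid the M x M x 3 temporary, sketched here as an alternative rather than as the course's own solution, is to loop over the three coordinates and accumulate the squared differences into M x M arrays. The names diff and D_loop and the smaller M are assumptions made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((200, 3))
M = X.shape[0]

# Broadcasting version from the text: builds an M x M x 3 temporary
D_broadcast = np.sqrt(((X[:, np.newaxis, :] - X) ** 2).sum(axis=-1))

# Loop version: only M x M arrays are ever allocated
D_loop = np.zeros((M, M))
diff = np.empty((M, M))
for k in range(X.shape[1]):
    # (M, 1) minus (1, M) broadcasts to (M, M), written directly into diff
    np.subtract(X[:, k, np.newaxis], X[np.newaxis, :, k], out=diff)
    np.square(diff, out=diff)
    D_loop += diff
np.sqrt(D_loop, out=D_loop)

same = np.allclose(D_loop, D_broadcast)
print(same)
```

The loop runs only three times, so the overhead of the Python-level iteration is small compared with the array operations inside it.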
