NumPy: Essential operations

Learn about the essential operations that NumPy has to offer.

For the purposes of data analysis using Python, you need to understand the following essential operations that NumPy has to offer.

Vectorisation

NumPy arrays provide vectorised mathematical operations, so it helps to understand what vectorisation means. The term has its roots in vector mathematics, but from a programming perspective it means that when data is stored in NumPy arrays, you can express batch operations on that data without writing any loops. You write the mathematical operation as if you were working with scalar values, and NumPy applies it to every element.

The following code snippets illustrate vectorisation: without writing a loop, we add all the elements of two ndarrays using the ordinary addition operator. Start with this code snippet:

import numpy as np

a1 = np.array([[1,2,3], [4,5,6]])
a2 = np.array([[6,7,8], [9,10,11]])

Followed by:

a1, a1.shape, a1.ndim

Output:

(array([[1, 2, 3],
       [4, 5, 6]]), (2, 3), 2)

Next snippet:

a2, a2.shape, a2.ndim

Output:

(array([[ 6,  7,  8],
       [ 9, 10, 11]]), (2, 3), 2)

Next snippet:

a_sum = a1 + a2
a_sum, a_sum.shape, a_sum.ndim

Output:

(array([[ 7,  9, 11],
       [13, 15, 17]]), (2, 3), 2)
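To make the contrast concrete, here is a minimal sketch that computes the same sum twice: once with an explicit nested loop and once with the vectorised expression. The names loop_sum and vector_sum are illustrative and do not appear in the course material.

```python
import numpy as np

a1 = np.array([[1, 2, 3], [4, 5, 6]])
a2 = np.array([[6, 7, 8], [9, 10, 11]])

# Explicit-loop version: visit every (row, column) position by hand.
loop_sum = np.empty_like(a1)
for i in range(a1.shape[0]):
    for j in range(a1.shape[1]):
        loop_sum[i, j] = a1[i, j] + a2[i, j]

# Vectorised version: one expression, no Python-level loop.
vector_sum = a1 + a2

# Both approaches give the same result.
print(np.array_equal(loop_sum, vector_sum))  # True
```

The vectorised form is shorter, and on large arrays it is typically much faster because the element-by-element work happens inside NumPy rather than in Python.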

Indexing and slicing

We touched on indexing and slicing when we explored various foundational data structures provided within Python.

In summary, indexing is used to access a particular element in the sequence, while slicing is used to access subsets of a sequence (more than one element).

Both operations use square-bracket notation, [ ], with different index parameters passed inside the brackets. We will explore indexing and slicing in the context of NumPy arrays later in this section.

The same principle of indexing and slicing applies to the NumPy Arrays as well, but with some subtle differences.

  • Indexing: By providing the index in square brackets, you can access the value stored at that position.
  • Slicing: By providing the slicing indexes, you can access the subset of the data stored between those indices.
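One of the subtle differences from plain Python sequences can be sketched as follows. This is an illustrative comparison, not part of the course material: the names nested and arr are assumptions, and the data simply mirrors the summed array from the vectorisation example.

```python
import numpy as np

nested = [[7, 9, 11], [13, 15, 17]]  # plain Python list of lists
arr = np.array(nested)               # the same data as a NumPy array

# A nested list needs one pair of brackets per dimension.
from_list = nested[0][2]

# A NumPy array accepts all indices in a single pair of brackets.
from_array = arr[0, 2]

# Selecting a whole column is a direct slice on an array ...
first_column = arr[:, 0]
# ... whereas a list of lists needs a comprehension.
first_column_list = [row[0] for row in nested]

print(from_list, from_array)                     # 11 11
print(first_column.tolist(), first_column_list)  # [7, 13] [7, 13]
```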

Indexing examples

Accessing the element in the first row and third column, and the element in the second row and first column.

Code:

a_sum

Output:

array([[ 7,  9, 11],
       [13, 15, 17]])

Code:

# Accessing the element in the first row (index 0) and third column (index 2), i.e. 11
item1 = a_sum[0,2]
# Accessing the element in the second row (index 1) and first column (index 0), i.e. 13
item2 = a_sum[1,0]
item1, item2

Output:

(11, 13)

Slicing example

The code snippet creates a two-dimensional ndarray with three rows and four columns. Using slicing, we then access two subsets of the data:

  • First two columns of all rows. Slice parameter: [:,0:2]
  • First and third rows only, and second and fourth columns only. Slice parameter: [0::2, 1::2]

Code:

n1 = np.array([[11,12,13,14], [15,16,17,18], [13,14,15,16]])
n1

Output:

array([[11, 12, 13, 14],
       [15, 16, 17, 18],
       [13, 14, 15, 16]])

Code:

n1.ndim, n1.shape

Output:

(2, (3, 4))

Code:

slice_data = n1[:,0:2]
slice_data

Output:

array([[11, 12],
       [15, 16],
       [13, 14]])

Code:

slice_data1 = n1[0::2, 1::2]
slice_data1

Output:

array([[12, 14],
       [14, 16]])
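The slice parameters follow the start:stop:step pattern, with omitted parts falling back to defaults. As a rough sketch, the same selection can be written with Python's built-in slice objects; the names rows and cols are illustrative only.

```python
import numpy as np

n1 = np.array([[11, 12, 13, 14], [15, 16, 17, 18], [13, 14, 15, 16]])

# 0::2 means start at index 0, run to the end, step by 2 -> rows 0 and 2.
rows = slice(0, None, 2)
# 1::2 means start at index 1, run to the end, step by 2 -> columns 1 and 3.
cols = slice(1, None, 2)

# Indexing with slice objects is equivalent to the colon notation.
same = np.array_equal(n1[rows, cols], n1[0::2, 1::2])
print(same)                     # True
print(n1[rows, cols].tolist())  # [[12, 14], [14, 16]]
```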

Broadcasting

It’s important to understand one characteristic of slicing NumPy arrays first. Array slices are views on the original array; that is, a slicing operation does not copy the underlying data.

We can see this in the slicing example in the previous code snippet, where the variables slice_data and slice_data1 simply refer to data stored in the original NumPy array, n1.

Code:

n1

Output:

array([[11, 12, 13, 14],
       [15, 16, 17, 18],
       [13, 14, 15, 16]])
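One way to check whether a slice is a view is np.shares_memory, which reports whether two arrays overlap in memory. The sketch below is illustrative rather than part of the course material: it builds its own copy of the array, and the names view and independent are assumptions.

```python
import numpy as np

n1 = np.array([[11, 12, 13, 14], [15, 16, 17, 18], [13, 14, 15, 16]])

view = n1[:, 0:2]                # basic slice: a view on n1's data
independent = n1[:, 0:2].copy()  # explicit copy: owns its own data

print(np.shares_memory(n1, view))         # True
print(np.shares_memory(n1, independent))  # False

# Writing through the view changes the original array ...
view[0, 0] = 99
print(n1[0, 0])           # 99
# ... while the copy keeps the old value.
print(independent[0, 0])  # 11
```

Calling .copy() is the usual way to get a slice that can be modified without touching the source array.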

This means that if we change the data through an array slice, the original source data also changes. Assigning a single value to a slice also relies on broadcasting, the method that NumPy uses to allow arithmetic and assignment between arrays (or an array and a scalar) of different shapes or sizes: the value is stretched to fill the whole selected region.

The code snippet demonstrates both behaviours in practice. Here, we change the second and fourth columns of the first and third rows by selecting them with a slice and assigning them the fixed value 50.

Code:

n1[0::2,1::2] = 50
n1

Output:

array([[11, 50, 13, 50],
       [15, 16, 17, 18],
       [13, 50, 15, 50]])

Broadcasting is widely used in data analytics applications and in the programs that underpin them, so it’s a very important concept to master. It is used extensively in data manipulation, and we’ll cover it more in the next course.
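As a small preview, here is a hedged sketch of broadcasting in arithmetic between operands of different shapes. The array row and the result names added and scaled are illustrative assumptions, not taken from the course.

```python
import numpy as np

a1 = np.array([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)
row = np.array([10, 20, 30])           # shape (3,)

# The one-dimensional row is stretched across both rows of a1.
added = a1 + row
# A scalar is stretched across every element.
scaled = a1 * 2

print(added.tolist())   # [[11, 22, 33], [14, 25, 36]]
print(scaled.tolist())  # [[2, 4, 6], [8, 10, 12]]
```

In both cases NumPy behaves as if the smaller operand had been repeated to match the larger one, without actually making those copies.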

This article is from the free online

Python Packages: NumPy and Pandas Dataframe

Created by
FutureLearn - Learning For Life
