How do you reshape a data set?

Data analysts must possess the ability and tools to look at data from different layouts and orientations in order to draw solid insights.

To begin with, let us define the ‘shape’ of a data set. The shape of a data set is the way its data is arranged into rows and columns, and reshaping is the rearrangement of that data without altering its content. Reshaping is a frequent, and often cumbersome, task in data manipulation and analysis.
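As a rough illustration of this definition, the sketch below rearranges a small table with 2 rows and 3 columns into one with 3 rows and 2 columns. It assumes pandas and NumPy are installed; the names wide and tall are made up for this example, and transposing is used only because it is one of the simplest ways to change a layout.

```python
import numpy as np
import pandas as pd

# A small table with 2 rows and 3 columns.
wide = pd.DataFrame(np.arange(6).reshape((2, 3)),
                    index=['Victoria', 'NSW'],
                    columns=['one', 'two', 'three'])

# Transposing swaps rows and columns: one simple way to reshape.
tall = wide.T

print(wide.shape)  # (rows, columns) before reshaping
print(tall.shape)  # (rows, columns) after reshaping

# The content is unchanged: both layouts hold the same six values.
same_values = sorted(wide.to_numpy().ravel()) == sorted(tall.to_numpy().ravel())
print(same_values)
```

The shape attribute reports the layout as (rows, columns), so comparing it before and after is a quick way to see what a reshaping step did.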

Reshaping data sets in Python

Python’s pandas library offers several methods for reshaping data sets. Let’s explore two of them.

    • stack(): reshapes the DataFrame into stacked form, that is, it pivots the innermost column index into the innermost row index.
    • unstack(): reshapes the data by pivoting the innermost row index back into the innermost column index.

Here is the syntax for both functions:

DataFrame.stack(level=-1, dropna=True)
DataFrame.unstack(level=-1, fill_value=None)
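To make the level and fill_value parameters more concrete, here is a minimal sketch. It assumes a small, made-up Series (s) that has no entry for the pair (NSW, two), so unstacking leaves a gap in the table. Because the handling of dropna in stack() differs between pandas versions, the sketch sticks to unstack().

```python
import pandas as pd

# Hypothetical stacked data: there is no value for ('NSW', 'two').
idx = pd.MultiIndex.from_tuples(
    [('Victoria', 'one'), ('Victoria', 'two'), ('NSW', 'one')],
    names=['state', 'number'],
)
s = pd.Series([0, 1, 3], index=idx)

# Default call: level=-1 (innermost level), fill_value=None.
# The missing combination shows up as NaN.
wide_nan = s.unstack()
print(wide_nan)

# Same level, but the gap is filled with 0 instead of NaN.
wide_zero = s.unstack(level=-1, fill_value=0)
print(wide_zero)
```

Choosing a fill_value that matches the data (such as 0 for counts) avoids introducing NaN into the unstacked table.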

stack() function

Let’s try these operations with some examples. Use these code snippets:

First, create a dummy DataFrame.

Code:

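import numpy as np
import pandas as pd

# Build a 2x3 DataFrame with a named row index and a named column index
data = pd.DataFrame(np.arange(6).reshape((2, 3)),
                    index=pd.Index(['Victoria', 'NSW'], name='state'),
                    columns=pd.Index(['one', 'two', 'three'], name='number'))
data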

Output:

Graphic shows a table. The row index is labelled state, with rows Victoria and NSW, and the column index is labelled number, with columns one, two, three. The row for Victoria reads 0, 1, 2. The row for NSW reads 3, 4, 5.

Next, we use the stack() function to pivot the columns into rows.

Code:

data_stack = data.stack()
data_stack

Output:

Graphic shows the stacked output, indexed by state and number. The rows read: Victoria, one, 0; Victoria, two, 1; Victoria, three, 2; NSW, one, 3; NSW, two, 4; NSW, three, 5. The final line reads "dtype: int32" (the integer width shown depends on your platform).

You can see that:

    • the operation converted the columns to row labels
    • the operation converted the DataFrame to a Series.

You can confirm these changes with this code:

Code:

type(data_stack)

Output:

pandas.core.series.Series

Code:

data_stack.index

Output:

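MultiIndex(levels=[['Victoria', 'NSW'], ['one', 'two', 'three']],
           labels=[[0, 0, 0, 1, 1, 1], [0, 1, 2, 0, 1, 2]],
           names=['state', 'number'])

This is how older pandas releases display the index. Newer releases list the same (state, number) pairs as tuples.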
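Because the column labels are now part of the row index, values in the stacked Series are looked up by state, or by state and number together. The sketch below rebuilds the same DataFrame so that it runs on its own; the names nsw_values and nsw_two are made up for this example.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(np.arange(6).reshape((2, 3)),
                    index=pd.Index(['Victoria', 'NSW'], name='state'),
                    columns=pd.Index(['one', 'two', 'three'], name='number'))
data_stack = data.stack()

# Outer label only: returns the three values for that state.
nsw_values = data_stack.loc['NSW']
print(nsw_values)

# Outer and inner label together: returns a single value.
nsw_two = data_stack.loc[('NSW', 'two')]
print(nsw_two)
```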

unstack() function

You can reshape stacked data back into its unstacked DataFrame format with the unstack() function.

Try this code:

Code:

data = data_stack.unstack()
data

Output:

Graphic shows a table. The row index is labelled state, with rows Victoria and NSW, and the column index is labelled number, with columns one, two, three. The row for Victoria reads 0, 1, 2. The row for NSW reads 3, 4, 5.
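If you would rather check the round trip in code than compare the two tables by eye, a sketch along these lines may help. It rebuilds the example DataFrame so that it runs on its own, and the name restored is made up for this example.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(np.arange(6).reshape((2, 3)),
                    index=pd.Index(['Victoria', 'NSW'], name='state'),
                    columns=pd.Index(['one', 'two', 'three'], name='number'))

# Stack the columns into the row index, then unstack them again.
data_stack = data.stack()
restored = data_stack.unstack()

# equals() compares labels and values of the two DataFrames.
print(restored.equals(data))
```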

By default, the innermost level is unstacked. In our example, that is the number level. However, you can unstack a different level by passing a level number or name to the unstack() method.

For example, try this code that unstacks data_stack at the level of state, rather than number:

Code:

data_state = data_stack.unstack('state')
data_state

Output:

Graphic shows a table. The row index is labelled number, with rows one, two, three, and the column index is labelled state, with columns Victoria and NSW. Row one reads: 0, 3. Row two reads: 1, 4. Row three reads: 2, 5.
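The level can be given by position as well as by name. As a small sketch, assuming the same example data (rebuilt here so the code runs on its own), level 0 is the outer state level, so both calls below should produce the same table. The names by_name and by_number are made up for this example.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(np.arange(6).reshape((2, 3)),
                    index=pd.Index(['Victoria', 'NSW'], name='state'),
                    columns=pd.Index(['one', 'two', 'three'], name='number'))
data_stack = data.stack()

# Unstack the outer level, first by name and then by position.
by_name = data_stack.unstack('state')
by_number = data_stack.unstack(0)

# Both calls pivot the state level into the columns.
print(by_name.equals(by_number))
print(by_name)
```

Using the level name tends to be easier to read, while the position is handy when the index levels are unnamed.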

This article is from the free online

Introduction to Data Analytics with Python

Created by
FutureLearn - Learning For Life
