How do you reshape a data set?

To begin with, let us define the ‘shape’ of a data set: the way its data is arranged into rows and columns. Reshaping rearranges that layout without altering the content of the data set. It is one of the most frequent, and often most tedious, tasks in data manipulation and analysis. Data analysts need the skills and tools to view data in different layouts and orientations in order to draw sound insights.
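To make the definition concrete, here is a minimal sketch, assuming pandas and NumPy are installed. It uses a transpose, probably the simplest reshape there is, on a small made-up table (the labels are only illustrative) to show that the layout changes while the values do not.

```python
import numpy as np
import pandas as pd

# Illustrative table: 2 rows (states) by 3 columns (numbers).
wide = pd.DataFrame(
    np.arange(6).reshape((2, 3)),
    index=["Victoria", "NSW"],
    columns=["one", "two", "three"],
)

# Transposing swaps rows and columns: a reshape in its simplest form.
tall = wide.T

print(wide.shape)  # (rows, columns) before the reshape
print(tall.shape)  # (rows, columns) after the reshape

# The same value is still reachable, only under swapped labels.
same_cell = wide.loc["NSW", "two"] == tall.loc["two", "NSW"]

# The collection of values itself is unchanged by the reshape.
same_values = sorted(wide.to_numpy().ravel()) == sorted(tall.to_numpy().ravel())
print(same_cell, same_values)
```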

Reshaping data sets in Python

The pandas library for Python offers several methods for reshaping data sets. Let’s explore two of them.
• stack(): reshapes the DataFrame into stacked form by pivoting the innermost column level into the innermost row level.
• unstack(): reshapes the DataFrame by pivoting the innermost row level back into the innermost column level.
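What "innermost" means is easiest to see when the columns have more than one level. The following is a sketch under assumptions: the year and stat level names and the values are hypothetical, chosen only for illustration. Calling stack() with no arguments moves only the inner column level (stat) into the rows and leaves the outer one (year) as columns.

```python
import numpy as np
import pandas as pd

# Hypothetical two-level columns: 'year' is the outer level, 'stat' the inner one.
columns = pd.MultiIndex.from_product(
    [["2020", "2021"], ["min", "max"]], names=["year", "stat"]
)
wide = pd.DataFrame(
    np.arange(8).reshape((2, 4)),
    index=pd.Index(["Victoria", "NSW"], name="state"),
    columns=columns,
)

# stack() with no arguments pivots only the innermost column level ('stat').
stacked = wide.stack()

print(list(stacked.index.names))  # row levels after stacking
print(stacked.columns.name)       # the column level that remains
print(stacked)
```

Recent pandas releases may emit a FutureWarning for this call because the internal implementation of stack() is being replaced; the reshaping shown here is the same either way.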
Here is the syntax for both functions:
DataFrame.stack(level=-1, dropna=True)
DataFrame.unstack(level=-1, fill_value=None)
In both, level selects the level to pivot; the default of -1 means the innermost level.
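The fill_value parameter matters when the stacked data does not contain every combination of labels. The sketch below builds a small Series by hand (the values are made up) with one pair deliberately missing, then unstacks it with and without fill_value.

```python
import pandas as pd

# Hand-built stacked Series; the (NSW, two) pair is deliberately absent.
index = pd.MultiIndex.from_tuples(
    [("Victoria", "one"), ("Victoria", "two"), ("NSW", "one")],
    names=["state", "number"],
)
stacked = pd.Series([1, 2, 3], index=index)

# Default: the missing combination shows up as NaN.
with_nan = stacked.unstack()

# fill_value substitutes a value of your choice for the missing combination.
with_zero = stacked.unstack(fill_value=0)

print(with_nan)
print(with_zero)
```

Whether 0 is a sensible stand-in depends on the data; for counts it usually is, for measurements it may hide the fact that a value was never recorded.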

stack() function

Let’s try these operations with some examples. Use these code snippets:
First, create a dummy DataFrame.
Code:
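import numpy as np
import pandas as pd
# 2 x 3 frame with a named row index ('state') and a named column index ('number')
data = pd.DataFrame(np.arange(6).reshape((2, 3)),
                    index=pd.Index(['Victoria', 'NSW'], name='state'),
                    columns=pd.Index(['one', 'two', 'three'], name='number'))
data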
Output:
number    one  two  three
state
Victoria    0    1      2
NSW         3    4      5
Next, use the stack() function to pivot the columns into rows.
Code:
data_stack = data.stack()
data_stack
Output:
state     number
Victoria  one       0
          two       1
          three     2
NSW       one       3
          two       4
          three     5
dtype: int64
You can see that:
• the operation converted the columns to row labels
• the operation converted the DataFrame to a Series.
You can confirm these changes with this code:
Code:
type(data_stack)
Output:
pandas.core.series.Series
Code:
data_stack.index
Output:
MultiIndex(levels=[['Victoria', 'NSW'], ['one', 'two', 'three']],
           labels=[[0, 0, 0, 1, 1, 1], [0, 1, 2, 0, 1, 2]],
           names=['state', 'number'])
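Because the stacked Series carries a two-level index, each value is addressed by a (state, number) pair. The following sketch rebuilds the same example data so that it runs on its own and shows three common ways to select from it; the variable names nsw_two, victoria and all_twos are just illustrative.

```python
import numpy as np
import pandas as pd

# Same example data as above, rebuilt so the snippet is self-contained.
data = pd.DataFrame(
    np.arange(6).reshape((2, 3)),
    index=pd.Index(["Victoria", "NSW"], name="state"),
    columns=pd.Index(["one", "two", "three"], name="number"),
)
data_stack = data.stack()

# A full (state, number) pair addresses a single value.
nsw_two = data_stack.loc[("NSW", "two")]

# A label from the outer level returns every number for that state.
victoria = data_stack.loc["Victoria"]

# xs() selects by a label on an inner level instead.
all_twos = data_stack.xs("two", level="number")

print(nsw_two)
print(victoria)
print(all_twos)
```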

unstack() function

You can reshape stacked data back to its unstacked layout with the unstack() function.
Try this code:
Code:
data = data_stack.unstack()
data
Output:
number    one  two  three
state
Victoria    0    1      2
NSW         3    4      5
By default, the innermost level is unstacked; in our example, that is number. However, you can unstack a different level by passing its level number or name as a parameter to the unstack() method.
For example, try this code that unstacks data_stack at the level of state, rather than number:
Code:
data_state = data_stack.unstack('state')
data_state

Output:
state   Victoria  NSW
number
one            0    3
two            1    4
three          2    5
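As a quick check on the behaviour described in this section, the sketch below rebuilds the example data so that it runs on its own, then compares unstacking by level name with unstacking by level position. It also checks that a plain stack() followed by unstack() gets back to the starting table. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(
    np.arange(6).reshape((2, 3)),
    index=pd.Index(["Victoria", "NSW"], name="state"),
    columns=pd.Index(["one", "two", "three"], name="number"),
)
data_stack = data.stack()

# 'state' is level 0 of the stacked index, so name and position are interchangeable.
by_name = data_stack.unstack("state")
by_position = data_stack.unstack(0)
same_result = by_name.equals(by_position)

# Unstacking the outer level turns states into columns, i.e. the transposed table.
matches_transpose = by_name.equals(data.T)

# Unstacking the default (innermost) level undoes the stack.
round_trip = data_stack.unstack().equals(data)

print(same_result, matches_transpose, round_trip)
```

Passing the level name is usually the more readable choice, since it keeps working if the order of the index levels changes.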