Skip main navigation

New offer! Get 30% off one whole year of Unlimited learning. Subscribe for just £249.99 £174.99. New subscribers only. T&Cs apply

Find out more

Merge() parameters

Merge () parameters

Now, let’s explore some examples using various parameters in the merge function.

Example 1: Inner joins

Look at these dataframes:

Graphic shows example of “Python merge inner”. There is a table for “left”, “right”, and “Result”. “left” chart x-axis from left to right reads: Key 1, Key 2, A, B. Y-axis from top to bottom reads: 0, 1, 2, 3. “right” chart x-axis from left to right reads: Key 1, Key 2, C, D. Y-axis from top to bottom reads: 0, 1, 2, 3. “Result” chart x-axis from left to right reads: Key 1, Key 2, A, B, C, D. Y-axis from top to bottom reads: 0, 1, 2. Click to enlarge

Notice that both the left and right dataframes have the same key columns with the same values. If you do an inner join on the key column, you will get a Dataframe that has columns A and B from the left dataframe, and columns C and D from the right dataframe.

Try this code snippet in your Jupyter Notebook to see the process:

Code:


#Create DataFrames
left = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3'],
'A': ['A0', 'A1', 'A2', 'A3'],
'B': ['B0', 'B1', 'B2', 'B3']})

right = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3'],
'C': ['C0', 'C1', 'C2', 'C3'],
'D': ['D0', 'D1', 'D2', 'D3']})

#Merge DataFrame, perform Inner Join
pd.merge(left,right, on='key')

Output:

Graphic shows a table. Y-axis labels from top to bottom reads: 0-3. X-axis from left to right reads: Key, A, B, C, D. The cells across row 0 read: K0, A0, B0, C0, D0. The cells across row 1 read: K1, A1, B1, C1, D1. The cells across row 2 read: K2, A2, B2, C2, D2. The cells across row 3 read: K3, A3, B3, C3, D3.
Click to enlarge

Example 2: Inner join with multi keys

Take a look at these dataframes:

Graphic shows example of “Python merge right join”. There is a table for “left”, “right”, and “Result”. “left” chart x-axis from left to right reads: Key 1, Key 2, A, B. Y-axis from top to bottom reads: 0, 1, 2, 3. “right” chart x-axis from left to right reads: Key 1, Key 2, C, D. Y-axis from top to bottom reads: 0, 1, 2, 3. “Result” chart x-axis from left to right reads: Key 1, Key 2, A, B, C, D. Y-axis from top to bottom reads: 0, 1, 2.
Click to enlarge

Notice that:

  • Both original dataframes have columns ‘key1’ and ‘key2’. If we perform an inner join on the two keys, the operation will use the combination of keys.

  • The key pair (K0, K0) occurs in both original dataframes once, so it will be in the resulting dataframe once and the columns will be merged.

  • The key pairs (K0, K1 and K2, K1) are not in the right-hand dataframe, so they won’t be in the resulting dataframe.

  • Key pair (K1, K0), from the left-hand dataframe, occurs twice in the right-hand dataframe, so we will get two rows with this key in the resulting dataframe.

Try out the process with this code snippet:

Code:


left = pd.DataFrame({'key1': ['K0', 'K0', 'K1', 'K2'],
'key2': ['K0', 'K1', 'K0', 'K1'],
'A': ['A0', 'A1', 'A2', 'A3'],
'B': ['B0', 'B1', 'B2', 'B3']})

right = pd.DataFrame({'key1': ['K0', 'K1', 'K1', 'K2'],
'key2': ['K0', 'K0', 'K0', 'K0'],
'C': ['C0', 'C1', 'C2', 'C3'],
'D': ['D0', 'D1', 'D2', 'D3']})

result = pd.merge(left,right, on=['key1','key2'])
result

Output:

Graphic shows a table. Y-axis labels from top to bottom reads: 0-1. X-axis from left to right reads: Key 1, Key 2, A, B, C, D. The cells across row 0 read: K0, K0, A0, B0, C0, D0. The cells across row 1 read: K1, K0, A2, B2, C1, D1. The cells across row 2 read: K1, K0, A2, B2, C2, D2.
Click to enlarge

The how parameter determines how the join operation occurs (ie, which keys to include in the resulting table). If a key combination does not appear in either the left or right tables, the values in the joined table will be NA.

This table summarises join operations:

Merge method Equivalent SQL operation Description
left LEFT OUTER JOIN Use keys from the left frame only
right RIGHT OUTER JOIN Use keys from the right frame only
outer FULL OUTER JOIN Use a union of keys from both frames
inner INNER JOIN Use an intersection of keys from both frames

Example 3: Left join

This diagram illustrates a left join:

Graphic shows example of “Python merge left join”. There is a table for “left”, “right”, and “Result”. “left” chart x-axis from left to right reads: Key 1, Key 2, A, B. Y-axis from top to bottom reads: 0, 1, 2, 3. “right” chart x-axis from left to right reads: Key 1, Key 2, C, D. Y-axis from top to bottom reads: 0, 1, 2, 3. “Result” chart x-axis from left to right reads: Key 1, Key 2, A, B, C, D. Y-axis from top to bottom reads: 0, 1, 2, 3, 4.
Click to enlarge

Notice that:

  • All the key pairs from the left-hand dataframe are available

  • There are NA values in columns C and D if the key pair is not available in the right-hand dataframe

  • The key pair (K1, K0) appears twice, because it appears twice in the right-hand dataframe (with different values in columns C and D).

Let’s use code to implement this join:

Code:


result=pd.merge(left,right, how='left', on=['key1','key2'])
result

Output:

Graphic shows a table. Y axis labels reads 0-4, and the X axis across the top reads key1, key2, A, B, C, D. The cells across row 0 reads: K0, K0, A0, B0, C0, D0. The cells across row 1 reads: K0, K1, A1, B1, NAN, NAN. The cells across row 2 reads: K1, K0, A2, B2, C1, D1. The cells across row 3 reads: K1, K0, A2, B2, C2. D1. The cells across row 4 reads: K2, K1, A3, B3, NAN, NAN.
Click to enlarge

Example 4: Right join

Here is an example of a right join:

Graphic shows example of “Python merge right join”. There is a table for “left”, “right”, and “Result”. “left” chart x-axis from left to right reads: Key 1, Key 2, A, B. Y-axis from top to bottom reads: 0, 1, 2, 3. “right” chart x-axis from left to right reads: Key 1, Key 2, C, D. Y-axis from top to bottom reads: 0, 1, 2, 3. “Result” chart x-axis from left to right reads: Key 1, Key 2, A, B, C, D. Y-axis from top to bottom reads: 0, 1, 2, 3.
Click to enlarge

As you can see, all the key pairs from the right-hand dataframe are available. Columns A and B have NA values where the key pair is not available in the right-hand dataframe.

Here is some code to implement the right join:

Code:


result=pd.merge(left,right, how='right', on=['key1','key2'])
result

What output did you get?

What output did you receive on the Jupyter notebook? Share in the Comments below.

Example 5: Outer join

Look at this diagram:

Graphic shows example of “Python merge outer”. There is a table for “left”, “right”, and “Result”. “left” chart x-axis from left to right reads: Key 1, Key 2, A, B. Y-axis from top to bottom reads: 0, 1, 2, 3. “right” chart x-axis from left to right reads: Key 1, Key 2, C, D. Y-axis from top to bottom reads: 0, 1, 2, 3. “Result” chart x-axis from left to right reads: Key 1, Key 2, A, B, C, D. Y-axis from top to bottom reads: 0, 1, 2, 3, 4, 5.
Click to enlarge

Notice that all the key pairs from the left- as well as right-hand dataframe are present in the result. The NaN values in the columns correspond to key pairs that weren’t present in one of the original dataframes.

This code snippet demonstrates the operation:

Code:


result=pd.merge(left,right, how='outer', on=['key1','key2'])
result

Output:

Graphic shows a table. Y axis labels reads 0-5, and the X axis across the top reads key1, key2, A, B, C, D. The cells across row 0 reads: K0, K0, A0, B0, C0, D0. The cells across row 1 reads: K0, K1, A1, B1, NAN, NAN. The cells across row 2 reads: K1, K0, A2, B2, C1, D1. The cells across row 3 reads: K1, K0, A2, B2, C2. D2. The cells across row 4 reads: K2, K0, NAN, NAN, C3, D3. Click to enlarge

You have already seen examples of inner joins in Examples 1 and 2, which is the default behaviour of merge(). There are many more possible variations of the merge() and join().

Go to: Merge, join, concatenate and compare [1]

Reference

1. Pandas User Guide: Merge, join, concatenate and compare. Pandas; [date unknown]. Available from: https://pandas.pydata.org/pandas-docs/stable/user_guide/merging.html

This article is from the free online

Data Wrangling and Ingestion using Python

Created by
FutureLearn - Learning For Life

Reach your personal and professional goals

Unlock access to hundreds of expert online courses and degrees from top universities and educators to gain accredited qualifications and professional CV-building certificates.

Join over 18 million learners to launch, switch or build upon your career, all at your own pace, across a wide range of topic areas.

Start Learning now