A function machine labelled 'mergesort'. It has a tube for an input coming in from the left side of the image into the left of the machine, and an output going out from the right. Another tube loops around from the right output side to the left input side

Exploring Merge Sorts

Implementing merge sort using recursion

Now that you’ve seen how merge sort works, we will create a couple of functions which will work together to perform a merge sort.

As you saw in the video, a key aspect of the merge sort algorithm is the ability to merge two sorted lists into a single sorted list. We can start by creating a function to do just this.

def merge(list_a, list_b):

Within this function, let’s create an empty list, into which the two other lists will be merged.

def merge(list_a, list_b):
  list_sorted = []

Now we’re going to start comparing the heads (the initial items) of list_a and list_b, and moving the smallest value into list_sorted. We want to keep doing this until both lists are empty.

So for example, given these lists:

list_a = [3,7,8]
list_b = [2,4,9]
list_sorted = []

The first time around, we would want the function to compare the values 3 and 2. As 2 is smaller, it gets moved into list_sorted, so the lists will now look like this:

list_a = [3,7,8]
list_b = [4,9]
list_sorted = [2]

Then 3 and 4 are compared, and 3 is smaller so it is appended to list_sorted:

list_a = [7,8]
list_b = [4,9]
list_sorted = [2,3]

This is repeated until either of the lists is empty.

How will we implement this? Let’s start with a loop, checking that both loops are not empty.

def merge(list_a, list_b):
  list_sorted = []
  while list_a != [] and list_b != []:

Now compare the heads of each list:

def merge(list_a, list_b):
  list_sorted = []
  while list_a != [] and list_b != []:
    if list_a[0] < list_b[0]:

If the item in list_a is the smaller of the two values, then we can pop it out of the list and into list_sorted.

def merge(list_a, list_b):
  list_sorted = []
  while list_a != [] and list_b != []:
    if list_a[0] > list_b[0]:
      list_sorted.append(list_a.pop(0))

If the item is larger, then we need to pop the item out of list_b and into list_sorted instead.

def merge(list_a, list_b):
  list_sorted = []
  while list_a != [] and list_b != []:
    if list_a[0] > list_b[0]:
      list_sorted.append(list_a.pop(0))
    else:
      list_sorted.append(list_b.pop(0))

If one of the lists becomes empty, then what remains in the other list (which is already sorted) can be added to the end of list_sorted, and then list_sorted can be returned.

def merge(list_a, list_b):
    list_sorted = []
    while list_a != [] and list_b != []:
        if list_a[0] < list_b[0]:
            list_sorted.append(list_a.pop(0))
        else:
            list_sorted.append(list_b.pop(0))
    if len(list_a) < 1:
        list_sorted += list_b
    else:
        list_sorted += list_a
    return list_sorted

Let’s try out the merge function. Run your code, and then pass the function a couple of sorted lists.

>>> merge([3,7,8],[2,4,9])
[2, 3, 4, 7, 8, 9]
>>>

Now that we can merge lists together, let’s tackle the other part of the algorithm. We need to split the large unsorted list into two, then do the same again and again and again, until we have a bunch of lists each with only one item. Then we can use the merge function to keep combining the lists together.

So for the following list, it would look like this:

[4,7,3,1,9,3,1]
    /     \
[4,7,3][1,9,3,1]
  /  \    /   \
[4][7,3][1,9][3,1]
 /  / \   / \   / \
[4][7][3][1][9][3][1]

Whenever you face a problem in programming where the solution seems to break down into smaller and repeating tasks, it’s probably a good idea to use a recursive strategy. Recursion is when a function is called by itself!

Let’s have a look at how this works.

We start with what is called the base case. this is when the algorithm will come to an end.

def merge_sort(unsorted):
    if len(unsorted) <= 1:
      return unsorted

So here we are returning any list that is either empty or contains only one item, as by definition these lists must be sorted.

In all other cases there are at least two elements in the list, so next we want to split the list into two smaller lists. To do this, we can find the middle point in the list and then take slices from the first element up to the middle, and the middle element to the end.

def merge_sort(unsorted):
  if len(unsorted) <= 1:
      return unsorted
  else:
    middle = int(len(unsorted)) / 2 # rounding down if the list has an odd number of items
    front = unsorted[:middle]
    back = unsorted[middle:]

Now the lists have been split. Next they need splitting again, unless they are only one item long. So the merge_sort function is used again, this time on these split lists.

def merge_sort(unsorted):
  if len(unsorted) <= 1:
      return unsorted
  else:
    middle = int(len(unsorted) / 2)
    front = merge_sort(unsorted[:middle])
    back = merge_sort(unsorted[middle:])

This will keep splitting the lists until each list is either one or zero items in length.

At this point you want to merge the lists back together, and to do this you can use the merge function you have already written.

def merge_sort(unsorted):
  if len(unsorted) <= 1:
      return unsorted
  else:
    middle = int(len(unsorted) / 2)
    front = merge_sort(unsorted[:middle])
    back = merge_sort(unsorted[middle:])
  return merge(front, back)

This function returns a sorted list each time it is called. This may result in additional calls of the merge_sort function with parts of the list. Each call of the function returns a sorted list until the uppermost one is complete. For example, where the merge_sort function is called in the line front = merge_sort(unsorted[:middle]), then several other calls are made until eventually a sorted list is returned and stored in the front variable. The function would then move onto running a merge sort on the second half of the longer list.

So to summarise, this function keeps splitting a list until there are a set of lists each 1 or 0 items in size. It then repeatedly merges those lists together, but in order.

If you want to try and get a clearer picture of what the code is doing, then it’s worth adding a few print statements. Here’s a slightly edited version of the merge sort program that you can use. You might notice the introduction of the // operator, this is known as floor division and carries out the division but only retains the integer part. This is functionally the same as int(len(unsorted) / 2), just shorter.

def merge(list_a, list_b):
    list_c = []
    while len(list_a) > 0  and len(list_b) > 0:
        if list_a[0] < list_b[0]:
            list_c.append(list_a.pop(0))
        else:
            list_c.append(list_b.pop(0))
    if list_a == []:
        list_c += list_b
    else:
        list_c += list_a
    print("merged list is", list_c)
    return list_c


def merge_sort(unsorted):
    if len(unsorted) < 2:
        return unsorted
    else:
        middle = len(unsorted) // 2
        front = unsorted[:middle]
        back = unsorted[middle:]
        print("splits are", front, back)
        front = merge_sort(front)
        back = merge_sort(back)
    return merge(front, back)


my_list = [5,4,3,2,1]
merge_sort(my_list)

The output show the order in which lists are split into sublists and then merged. You should get something like the following:

splits are [5, 4] [3, 2, 1]
splits are [5] [4]
merged list is [4, 5]
splits are [3] [2, 1]
splits are [2] [1]
merged list is [1, 2]
merged list is [1, 2, 3]
merged list is [1, 2, 3, 4, 5]

Share this article:

This article is from the free online course:

Programming 102: Think Like a Computer Scientist

Raspberry Pi Foundation