Skip main navigation

New offer! Get 30% off your first 2 months of Unlimited Monthly. Start your subscription for just £35.99 £24.99. New subscribers only T&Cs apply

Find out more

Big O notation

Big O notation is a way to describe how the time complexity/efficiency of an algorithm scales with the size of the data that it is processing.
Hi, and welcome back to the course. So now, we’re going to change subjects a little bit. And we want to talk about something called time complexity, or time efficiency, and also, Big O rotation. So Big O notation is just the notation you use to describe the time complexity of an algorithm on a data structure. And if you were taking a computer science course, or getting a degree in computer science, this would be a big topic, and you would discuss the different algorithms and different data structures in a more general sense, meaning, you wouldn’t just be focused on Python. You would also talk about data structures that are maybe implemented in other languages differently than they are in Python.
And so it would be more of a high level discussion about the different algorithms and data structures. And that’s great. That’s very good. But in this course, we’re a little more focused on Python. So what we’re going to do is introduce the idea of Big O notation with some examples. And then we will very quickly jump into Python to discuss how the time complexity of the specific Python data structures, like a list, or a site, or a dictionary. So the first thing we want to do is talk about the Big O notation. What is it? And why do we care?
So to introduce it, let’s go over two examples. These are just imaginary examples. In the first example, I give you a dictionary with thousand words in it. But the words are not alphabetical. In fact, they’re random. And then I ask you to look up the definition of a word in the dictionary. So how would you have to do this? You don’t know where the word is in the dictionary. It’s not alphabetical. It’s completely random. So you would have to just start with the first word and go through every single word until you found the word, and then you could give me the definition.
Or you go through the whole dictionary, and tell me that this word is not in the dictionary. So if it’s 1,000 word dictionary, it would take you 1,000 steps, right? You have to look at each word, and see is it the word or not. If I were to give you a million word dictionary, it would take a million steps. In the worst case scenario, it would take a million steps. So worst case scenario is that the word you’re looking for is the very last word in the dictionary, or it’s not even in there, in which case, you would still go through all the words.
And oftentimes, when we talk about time complexity of algorithms, we will often talk about the worst case. So in this example, the algorithm is you just looking at each word in the dictionary. And if it was one million word dictionary, it would take a million steps, or the worst case would be a million steps. So in general, you could say if the dictionary has n words, where n is just a placeholder for any number. Then the algorithm would take n steps, right? So 100 words, 100 steps, one million words, one million steps. ten million words, ten million steps. OK, now, let’s go to another example, the much better example. You have a dictionary. I give you a dictionary.
And this time, it’s alphabetically ordered. And it also has those nice tabs on the side. So you can flip really quickly to the section you want. In fact, it has the tabs that include two letters. Meaning, it will have an AA tab, an AB tab, an AC tab. So if the first two letters of the word start with AA, you can flip almost straight to it. And we’ll also say that you’re really good at judging which page you need to flip to. So you use the tabs. And if it’s in the middle between two tabs, you’re really good at flipping right to the middle. OK. So now, I ask you to look up a word.
And how many steps does it take you? Well, it takes about one step, because you can flip right to the word. It’s super easy. You don’t have to go through every single word. I ask you to look up zebra. And you go to the ZE tab. You flip right to it. And now, if I give you a word with one million - or sorry - I give you a dictionary with one million words in it, it still takes you just one step. If I give you a dictionary with ten million words in it, it’s still one step, because it has all the tabs, and you’re really good at flipping to the right page.
So there’s a huge difference between these two examples, right? One is horribly slow. And not only that, but it scales. The number of steps you have to take scales as the number of words in the dictionary grows, right? In the second example, it doesn’t scale. You could give me 100 million word dictionary, and it’s still about one step. Now, I know in real life, right, you flip to where you think the page is, and then you probably have to flip a couple of pages to the left or right. But still, it’s essentially the same amount of time no matter how big the dictionary is. And that difference is what we want to describe. Because it’s a huge difference.
So how would we describe that? We would use Big O notation.
So Big O notation - I’m going to highlight some text down here - how do you write Big O notation? And first, I’m going to tell you how to write it. And then I’ll back out like why we write it like this. I think it’ll make a little more sense. But first, we’ll just go right to how you would write Big O, what is this. Well, you write a capital O, and that O is short for order. And then you write in parentheses the order of n that the algorithm scales with where n is the size of the data that you’re running the algorithm on.
So in our first example, actually, before I say that, when I say the order of n, that means that there’s some nonlinear function of n that describes how the algorithm, how the steps in the algorithm’s scale with n. I know that can sound confusing. But let’s consider the power of n. So the power of n is non-linear. It’s a nonlinear function, power of n. And so let’s just go through some of the powers of n right down here. So we have n raised to the zeroth power, which is just one. So any number raised to the zeroth power is one. We have n raised to the first power, which is just n right? N to the first power is N.
Any number raised to the first power is n. N to the second power is n squared. N to the third power is n cubed, and so on and so forth. So if we were just going to describe these two examples as a power of n, how would we describe them? And it’s really straightforward. It might be confusing to talk about powers, but these examples are really straightforward. An example of one, we already said that it takes the same amount of steps as the words in the dictionary. So if there’s n words in a dictionary, it takes you n steps. And again, in the worst case scenario where we have to go through every single word.
So the power of n that it scales with, or I should say the order of n that it scales with is just n, or n to the first power. But the order is just n. So if we were going to write the Big O notation, it would just be Big O, and then in parentheses, n. So the order of n that it scales with is just n. It’s not n squared. It’s not n cubed, right? It’s not anything else. It’s just n. Now, the second example is one step no matter how big the dictionary gets. So it just one step. So you write O of one. And one is the same as n the zeroth power, right?
If you wanted to be more complicated, you could write O to the n raised to the zeroth power. And that means that no matter how big m gets, you just raise it to the zeroth power, and it’s a one. So it doesn’t scale with n at all. Its constant time. Another way to say the Big O one is just constant time. The algorithm takes the same amount of time no matter how big the data gets.
So that was a quick and dirty introduction to Big O notation. And there are other. So actually, we’ll see it in a coming lecture. We’ll see the other kind of common orders. But O one, order one is common. And O of n is common. And O n squared are also common. So now, in the coming lectures, we’ll now look at some more examples, so we can really feel comfortable about Big O notation.

Big O notation is a way to describe how the time complexity/efficiency of an algorithm (or function) scales with the size of the data that it is processing.

In this video, you will be introduced to the idea of Big O notation, and you will also follow along with some examples to explain this.

Follow along

The file used in this video is Introducing Time Efficiency and Big O Notation.ipynb. Please download this file from the Downloads section below.
Make sure you are able to access it, in order to follow along with the video.

Join the discussion

At the beginning of this video, the question was: What is Big O notation, and why should we care about it? After watching this video, what are your thoughts on this?
Use the Comments section below and let us know your thoughts. Try to respond to at least one other post and once you are happy with your contribution, click the Mark as complete button to check the step off, then you can move to the next step.

In the next step, we will look at more examples that will illustrate why thinking of time complexity can be so important.

This article is from the free online

Intermediate Python

Created by
FutureLearn - Learning For Life

Reach your personal and professional goals

Unlock access to hundreds of expert online courses and degrees from top universities and educators to gain accredited qualifications and professional CV-building certificates.

Join over 18 million learners to launch, switch or build upon your career, all at your own pace, across a wide range of topic areas.

Start Learning now