

Next we’re going to talk about a few different ways to calculate an average. Now we’ve already talked about the mean, and we’ll talk about it again, but there are other ways you can describe what a typical score or an average score is doing. So let’s talk through a few of those, and we’re going to start with the mode. The mode is simply the most common score. And when you think about it, that’s actually not the most helpful thing to use.
This is a distribution of age scores from one of my online studies, and we clearly see here the most common score is just above 30, but you can also imagine that that doesn’t really represent what the majority of scores are doing. In fact, the middle of the data seems to be closer to around 35, maybe even up toward 40. You can also imagine that if just a few scores shifted around, the mode could shift around a little bit. After all, those bars are of different heights, and it wouldn’t take a lot of scores shifting around from maybe one sample to the next, or one set of data to the next, for the mode to jump around quite a bit.
We call this sampling stability, and the mode has terrible sampling stability. From one data set to the next it tends to be unstable. The last thing I want to say about the mode is that it ignores a lot of data. It really doesn’t pay attention to what any of the other scores are doing, it just tells you what the most common score is. And so it’s not particularly useful. In fact, I would avoid using it in situations like this. So then why use it at all? Well, in a lot of cases you don’t have another option. Here on the screen we have categorical data: counts of male and female participants for a couple of different samples.
There’s no way to calculate a mean or a median for a categorical variable like this. It just is what it is. I don’t have any other option, so I have to use the mode in this example. But I just want to point out how weak this measure really is. Here are two different scenarios, and the mode in both samples is male. On the left, male clearly represents 80% of the sample. On the right, it represents just over 50%. But male is the most common category, so it’s the mode in both cases. So a cautionary note for you against relying too heavily on the mode.
We don’t really have another option when we have categories, but it’s not the most useful or informative statistic that we could calculate.
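The two-sample comparison above can be sketched in a few lines of Python. This is only an illustration: the sample sizes (100 people each, with 80/20 and 51/49 splits) and the helper name `mode_with_share` are assumptions chosen to match the "80%" and "just over 50%" figures in the video, not the actual study data.

```python
from collections import Counter


def mode_with_share(values):
    """Return the most common value and the fraction of the sample it covers."""
    counts = Counter(values)
    value, count = counts.most_common(1)[0]
    return value, count / len(values)


# Hypothetical samples: same mode, very different proportions.
sample_a = ["male"] * 80 + ["female"] * 20
sample_b = ["male"] * 51 + ["female"] * 49

print(mode_with_share(sample_a))  # ('male', 0.8)
print(mode_with_share(sample_b))  # ('male', 0.51)
```

Reporting the share alongside the mode makes it clear how much of the sample the mode actually describes, which the mode on its own does not reveal.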

Lesson 1: Mode, Mean, Median

In this lesson, we’ll look at the three M’s: mode, mean, and median. They each measure the “center” of a data set in a slightly different way, and the usefulness of each of these summary statistics depends on the type of data we’re dealing with.
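As a rough illustration of how the three measures can disagree, here is a minimal Python sketch using the standard library's `statistics` module. The values in `scores` are made up for this example (they are not the course data set) and include one large value to create some skew.

```python
import statistics

# Hypothetical, right-skewed data: one large value pulls the mean upward.
scores = [1, 1, 2, 2, 2, 3, 3, 4, 5, 12]

mean_value = statistics.mean(scores)      # sum of the values divided by the count
median_value = statistics.median(scores)  # middle of the sorted values
mode_value = statistics.mode(scores)      # most frequent value

print(mean_value)    # 3.5
print(median_value)  # 2.5
print(mode_value)    # 2
```

In this made-up sample the single value of 12 raises the mean above the median, while the mode simply reports the value that occurs most often.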

Lab: Mean, Median, Mode

The mean, median, and mode each use different logic to measure the “center” of a data set. The usefulness of each measure depends on the type of data and the level of skew. In this lab, we’ll use Excel to estimate the mean, median, and mode of the coffee data we used in the last module, and we’ll see which measure of center gives us the most useful information about this particular data set.

The lab instructions can be downloaded as a PDF file here.

The data set for this lab can be viewed here. From the link, copy and paste all the data into a new worksheet in Excel Online.

(Note: This lab uses the same data set you used in the labs for the previous module.)

This article is from the free online course Essential Mathematics for Data Analysis in Microsoft Excel

Created by
FutureLearn - Learning For Life
