Histogram Overview

Now I want to talk about the histogram, which is the single most useful graph you’ll ever make in data analysis. It’s one graph to represent one variable. So let’s dive in. Imagine I’ve got data on the number of cups of coffee a number of people consume in a day. Here I’ve got that data on the screen. If I wanted to understand this data, I could eyeball the data. I could just look at it, try to draw some insights. It’s pretty confusing and laborious. I mean, it looks like I’ve got a bunch of 1s. But I would have to really effortfully think about this. And imagine if I had 1,000 rows or 10,000 rows. It’s just not doable.
In a histogram, I simply make a graph of these numbers. I have all of the different numbers of cups of coffee plotted on the x-axis. And then the height of the bar tells you how many people have that score. For instance, you can easily see in this graph that 1 is the most common number of cups consumed. More people in this data set consume 1 cup of coffee than any other amount. Then 2 is the second most common score. And you see that represented in the bar as the second highest bar. So it is really just an easy way to visualise where all the scores are. In fact, this contains all the information. It tells me how many scores I have of each number.
It’s just sorted and visualised in a way that’s easy to read. One useful feature of histograms, which I haven’t shown you here but you’ll get to see in the labs, is that if you’ve got a large range on that x-axis, sometimes we won’t make the bars individual numbers, but groups. Like I might have a range from, say, 0 to 2 and then a range from 2 to 4 to represent those bars. It just makes it a little easier if I’ve got a large range of levels for my x-axis. But we’ll get into that in the labs. Either way, it is the most useful graph you’ll ever make to quickly look at what your variable is doing without any calculation necessary.
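
The counting behind a histogram can be sketched in a few lines of Python. This is only an illustration, not the method used in the labs (which work in Excel): the `cups` values are made up rather than taken from the course data set, and the helper names `frequency_table`, `binned_counts` and `print_bars` are hypothetical. The grouped version assumes each range includes its lower edge and excludes its upper edge, so a value of 2 falls in the 2 to 4 group; other tools may place a boundary value in the lower group instead.

```python
from collections import Counter

# Hypothetical data: cups of coffee per day for 12 people (not the course data set).
cups = [1, 2, 1, 0, 3, 1, 2, 1, 4, 2, 1, 0]


def frequency_table(values):
    """Count how many times each distinct value occurs, sorted by value."""
    return dict(sorted(Counter(values).items()))


def binned_counts(values, width):
    """Group values into ranges [start, start + width) and count each range.

    Assumption: the lower edge is included and the upper edge excluded.
    """
    counts = Counter((v // width) * width for v in values)
    return {(start, start + width): n for start, n in sorted(counts.items())}


def print_bars(table):
    """Print one row of '#' per entry; the row length stands in for bar height."""
    for label, n in table.items():
        print(f"{label!s:>8} | {'#' * n} ({n})")


per_value = frequency_table(cups)         # one bar per number of cups
per_range = binned_counts(cups, width=2)  # one bar per range of cups

print_bars(per_value)
print()
print_bars(per_range)
```

With this made-up data, the longest row in the first chart is the one for 1 cup, which is the same reading the transcript takes from the tallest bar. The second chart holds the same 12 observations, only grouped into wider ranges.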

Lesson 1: Histogram Overview and Normal Distribution

This lesson gives some hands-on experience with our first data visualization: the histogram. It’s much easier to understand what our data are saying when we have a visual aid to represent that data.

Lab: Histograms

With large sets of data, having a visual aid is essential if we want to understand what our data are saying. In this lab, we’ll make two different histograms based on data about coffee consumption and coffee temperature. If you’re a coffee drinker, it’s highly recommended that you have a fresh pot handy while doing these exercises.

The lab instructions can be downloaded as a PDF file here.

The data set for this lab can be viewed here. From the link, copy and paste all the data into a new worksheet in Excel Online.

This article is from the free online course Essential Mathematics for Data Analysis in Microsoft Excel, created by FutureLearn.
