Tidy Data

And welcome. In this video, I want to walk through the anatomy of a data set. Data come in all shapes and sizes and formats. But I want to walk through what’s called a “tidy” data set, which is essentially a spreadsheet of data. And almost all data that we might analyse is going to need to be in this format before we can process it. So let’s go ahead and take a look at a tidy data set and see how it works. Here’s an example of a tidy data set. I have columns, and I have rows. So what do these represent? Well, in a tidy data set, columns represent things that you have information on.
So for example, I have a column representing how many cups of coffee someone might drink in a day. I have a column representing the preference of coffee type, whether someone likes their coffee black or not, temperature, et cetera, the coffee facts. In addition, rows represent people. So in this case, in a tidy data set, here I have a box around person 4. I can see specific information about person 4. For instance, they tend to consume about 2 cups of coffee per day. Their preference would be a latte. If they’re drinking drip, they don’t like anything in it. And their default temperature is going to be about 176 degrees. So this is how we read a tidy data set.
So just as a quiz, go ahead and pause the video. I want you to tell me the number of cups of coffee consumed by person 6. So go ahead and pause the video now and see if you can find it. OK, I’m assuming you paused the video. Let’s see how you did. To find the number of cups of coffee for person 6, we go to the column for cups of coffee and the row for person 6. And at the intersection of those two, we get 1 cup of coffee. That’s really all there is to a tidy data set. Columns represent variables. Rows represent people.
As long as you’ve got data in this format, we can do all the basic statistical analyses.
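
The row-and-column lookup described in the video can also be sketched in a few lines of Python. This is only an illustration, not part of the course materials: the column names (`cups_per_day`, `preference`, `drip_black`, `temperature`) and the helper functions are assumptions, and only the values actually mentioned in the video are filled in. Anything the video does not state is left as `None`.

```python
# Each dict is one row (a person); each key is one column (a variable).
# Only values mentioned in the video are filled in; None marks values
# the video does not state. Column names are assumed for illustration.
tidy_rows = [
    {"person": 4, "cups_per_day": 2, "preference": "latte",
     "drip_black": True, "temperature": 176},
    {"person": 6, "cups_per_day": 1, "preference": None,
     "drip_black": None, "temperature": None},
]


def lookup(rows, person, column):
    """Return the value where the row for `person` meets `column`."""
    for row in rows:
        if row["person"] == person:
            return row[column]
    raise KeyError(f"no row for person {person}")


def same_columns(rows):
    """Check that every row has exactly the same set of columns."""
    columns = set(rows[0])
    return all(set(row) == columns for row in rows)


print(lookup(tidy_rows, 6, "cups_per_day"))  # 1
print(lookup(tidy_rows, 4, "preference"))    # latte
print(same_columns(tidy_rows))               # True
```

The `lookup` function mirrors the quiz: pick the row for the person, pick the column for the variable, and read the value where they meet. The `same_columns` check is a rough stand-in for the idea that every row in a tidy data set records the same variables.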

Lab: Data Set Anatomy

In this lab, you’ll get familiar with the layout of a data set. Interpreting data is an important first step before manipulating it, so there’s no math involved just yet.

The lab instructions can be downloaded as a PDF file here.

The data set for this lab can be viewed here. From the link, copy and paste all the data into a new worksheet in Excel Online.

Lab Check

Once you have completed the Data Set Anatomy Lab, answer the following question.

This article is from the free online course Essential Mathematics for Data Analysis in Microsoft Excel.

Created by
FutureLearn - Learning For Life