Tidy Data

Welcome. In this video, I want to walk through the anatomy of a data set. Data come in all shapes and sizes and formats. But I want to walk through what’s called a “tidy” data set, which is essentially a spreadsheet of data. And almost all data that we might analyse are going to need to be in this format before we can process them. So let’s go ahead and take a look at a tidy data set and see how it works. Here’s an example of a tidy data set. I have columns, and I have rows. So what do these represent? Well, in a tidy data set, columns represent things that you have information on.
So for example, I have a column representing how many cups of coffee someone might drink in a day. I have a column for the preferred coffee type, a column for whether someone likes their coffee black or not, a column for temperature, et cetera: the coffee facts. In addition, rows represent people. So in this tidy data set, I have a box around person 4, and I can see specific information about person 4. For instance, they tend to consume about 2 cups of coffee per day. Their preference would be a latte. If they’re drinking drip, they don’t like anything in it. And their default temperature is going to be about 176 degrees. So this is how we read a tidy data set.
So just as a quiz, go ahead and pause the video now and see if you can find the number of cups of coffee consumed by person 6. OK, I’m assuming you paused the video. Let’s see how you did. To find the number of cups of coffee for person 6, we’re going to go to the column for cups of coffee and the row for person 6. And at the intersection of those two things, we get 1 cup of coffee. That’s really all there is to a tidy data set. Columns represent variables. Rows represent people. As long as you’ve got data in this format, we can do all the basic statistical analyses.
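The reading rule from the video (pick a row, pick a column, read the value where they meet) can also be sketched in code. This is a minimal Python illustration, not part of the course materials, which use Excel. The column names and the `lookup` helper are assumptions made for the example. Only the values for person 4 and the cup count for person 6 come from the transcript; everything else about person 6 is left as `None` rather than guessed.

```python
# A tidy table as a list of rows: one dict per person, one key per variable.
# Column names are hypothetical; the transcript does not show the headers.
rows = [
    {"person": 4, "cups_per_day": 2, "preference": "latte",
     "drinks_drip_black": True, "temperature": 176},
    # Only the cup count for person 6 is stated in the transcript.
    {"person": 6, "cups_per_day": 1, "preference": None,
     "drinks_drip_black": None, "temperature": None},
]


def lookup(table, person, column):
    """Return the value at the intersection of a person's row and a column."""
    for row in table:
        if row["person"] == person:
            return row[column]
    raise KeyError(f"no row for person {person}")


# Reading one cell: the row for person 6, the column for cups of coffee.
cups_person_6 = lookup(rows, 6, "cups_per_day")

# Reading a whole row: everything recorded about person 4.
row_person_4 = next(r for r in rows if r["person"] == 4)

print(cups_person_6)
print(row_person_4["preference"])
```

Because every row has the same keys, any single value can be found from just two pieces of information: which person and which variable. That is the same lookup as finding a cell by row and column in a spreadsheet.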

Lab: Data Set Anatomy

In this lab, you’ll get familiar with the layout of a data set. Interpreting data is an important first step before manipulating it, so there’s no math involved just yet.

The lab instructions can be downloaded as a PDF file here.

The data set for this lab can be viewed here. From the link, copy and paste all the data into a new worksheet in Excel Online.

Lab Check

Once you have completed the Data Set Anatomy Lab, answer the following question.

This article is from the free online course Essential Mathematics for Data Analysis in Microsoft Excel.

Created by
FutureLearn - Learning For Life
