
Planning your data investigation

At this stage, you are about to perform your own data investigation. In this video, Lovisa Sundin asks Peter Donaldson for advice on data projects.
LOVISA: So Peter, we’re now in week 3 and the time has come to do our own independent investigations. And this can be pretty intimidating. So what would you suggest is our step 1?
PETER: So over the last couple of weeks, we’ve been working with data. So the first thing we need is, we need data, and actually, a reasonably large data set. Before the internet, this was maybe quite difficult to get hold of. But actually, there are a number of options that we’ve got available. So the first thing that we can do is we can go to some of the open data portals that are available. There are some national governments and international organisations that have these. And the second option that we’ve got is, we may actually have some data in our organisation that we can actually analyse and explore to find out a bit more.
And finally, the last option is, with the advent of online survey tools, it’s really easy to actually gather a whole lot of survey data from many people by issuing something like an online survey.
LOVISA: Is there a risk of that leading to, maybe, an overload of information? You might be sitting on 20 different variables in your data set. And what do you do then to get some kind of order in the chaos?
PETER: So I guess one of the things is that you’ll have maybe some questions or initial questions that led you to a particular data set. After that, what you really want to do is you want to carry out some exploratory data analysis. So we’ve looked at a variety of different techniques to visualise the data and to aggregate it in various ways. And the reason that we’ve done that is so that we can identify broader patterns and trends within individual measurements.
And you can think of exploratory data analysis a little bit like a treasure hunt, where you are looking at various aspects, looking for patterns, for anomalies, that might suggest that there is an interesting relationship there between the variables that you have.
LOVISA: And once you have all of these different suggestions of possible patterns, potential trends, is that enough in order to have a clear course of action on how to improve?
PETER: Well, the reason that we’re doing an exploratory data analysis is because, really, as citizen data scientists, we want to advocate for a particular change. And so the last part is to take your exploratory data analysis and to think about which parts of it you’re going to include in your main story and, ultimately, what course of action you’re going to advocate or what conclusions you’re going to reach at the end. Because ideally, you’ve explored questions that you care about and that you’re advocating for some kind of positive change in your local area. That’s really what citizen data science is all about.
LOVISA: Agreed.

Now it’s time to plan your own data science investigation. There are several stages you need to go through:

  1. identify and obtain an interesting data set
  2. define a sensible research question
  3. explore the data to make sense of the question
  4. perform an analysis
  5. generate a report
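
The five stages above can be sketched in a few lines of Python. This is only a minimal illustration, not the course's own method: the survey data, the column names, the research question, and the two-standard-deviation anomaly rule are all assumptions made up for the example, and only the standard library is used.

```python
import csv
import io
import statistics
from collections import defaultdict

# Stage 1: obtain a data set.
# ASSUMPTION: this tiny survey is invented purely for illustration.
RAW_CSV = """area,respondent,weekly_bus_trips
North,1,4
North,2,6
North,3,5
South,4,1
South,5,2
South,6,0
East,7,3
East,8,30
East,9,4
"""

# Stage 2: define a research question (hypothetical).
QUESTION = "Does bus use differ between areas?"


def load_rows(text):
    """Parse the CSV text into a list of dicts with numeric trip counts."""
    reader = csv.DictReader(io.StringIO(text))
    return [{"area": r["area"], "trips": int(r["weekly_bus_trips"])} for r in reader]


def mean_by_group(rows, key, value):
    """Stage 3: aggregate one variable by another to look for broad patterns."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return {name: statistics.mean(vals) for name, vals in groups.items()}


def flag_anomalies(rows, value, threshold=2.0):
    """Stage 3: flag rows more than `threshold` standard deviations from the mean."""
    vals = [row[value] for row in rows]
    mu = statistics.mean(vals)
    sigma = statistics.pstdev(vals)
    if sigma == 0:
        return []
    return [row for row in rows if abs(row[value] - mu) / sigma > threshold]


def make_report(question, means, anomalies):
    """Stage 5: turn the findings into a short plain-text report."""
    lines = [f"Question: {question}"]
    for area, avg in sorted(means.items()):
        lines.append(f"  {area}: {avg:.1f} trips per week on average")
    lines.append(f"  Unusual responses set aside: {len(anomalies)}")
    return "\n".join(lines)


if __name__ == "__main__":
    rows = load_rows(RAW_CSV)
    anomalies = flag_anomalies(rows, "trips")
    # Stage 4: analyse the data with the flagged rows set aside.
    cleaned = [row for row in rows if row not in anomalies]
    print(make_report(QUESTION, mean_by_group(cleaned, "area", "trips"), anomalies))
```

Whether a flagged response should really be set aside is a judgement call. An unusual value may be a data-entry error, or it may be the most interesting part of the story, so it is worth looking at each one before excluding it.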

This article is from the free online course Getting Started with Teaching Data Science in Schools.

Created by
FutureLearn - Learning For Life
