This content is taken from the University of Glasgow's online course, Getting Started with Teaching Data Science in Schools. Join the course to learn more.

Skip to 0 minutes and 5 seconds LOVISA: So Peter, we’re now in week 3 and the time has come to do our own independent investigations. And this can be pretty intimidating. So what would you suggest is our step 1?

Skip to 0 minutes and 17 seconds PETER: So over the last couple of weeks, we’ve been working with data. So the first thing we need is, we need data, and actually, a reasonably large data set. Before the internet, this was maybe quite difficult to get hold of. But actually, there are a number of options that we’ve got available. So the first thing that we can do is we can go to some of the open data portals that are available. There are some national governments and international organisations that have these. And the second option that we’ve got is, we may actually have some data in our organisation that we can actually analyse and explore to find out a bit more.

Skip to 0 minutes and 54 seconds And finally, the last option is, with the advent of online survey tools, it’s really easy to actually gather a whole lot of survey data from many people by issuing something like an online survey.

Skip to 1 minute and 7 seconds LOVISA: Is there a risk of that leading to, maybe, an overload of information? You might be sitting on 20 different variables in your data set. And what do you do then to get some kind of order in the chaos?

Skip to 1 minute and 20 seconds PETER: So I guess one of the things is that you’ll have maybe some questions or initial questions that led you to a particular data set. After that, what you really want to do is you want to carry out some exploratory data analysis. So we’ve looked at a variety of different techniques to visualise the data and to aggregate it in various ways. And the reason that we’ve done that is so that we can identify broader patterns and trends within individual measurements.

Skip to 1 minute and 52 seconds And you can think of exploratory data analysis a little bit like a treasure hunt, where you are looking at various aspects, looking for patterns, for anomalies, that might suggest that there is an interesting relationship there between the variables that you have.

Skip to 2 minutes and 10 seconds LOVISA: And once you have all of these different suggestions of possible patterns, potential trends, is that enough in order to have a clear course of action on how to improve?

Skip to 2 minutes and 22 seconds PETER: Well, the reason that we’re doing an exploratory data analysis is because, really, as citizen data scientists, we want to advocate for a particular change. And so the last part is to take your exploratory data analysis and to think about which parts of it you’re going to include in your main story and, ultimately, what course of action you’re going to advocate or what conclusions you’re going to reach at the end. Because ideally, you’ve explored questions that you care about and that you’re advocating for some kind of positive change in your local area. That’s really what citizen data science is all about.

Skip to 3 minutes and 4 seconds LOVISA: Agreed.

Planning your data investigation

Now it’s time to plan your own data science investigation. There are several stages you need to go through:

  1. identify and obtain an interesting data set
  2. define a sensible research question
  3. explore the data to make sense of the question
  4. perform an analysis
  5. generate a report
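
The five stages above can be sketched in a few lines of Python. This is only a minimal illustration, not the course's own method: the survey data, the column names (`year_group`, `travel_mode`, `minutes_to_school`), the research question and the helper functions `load_rows`, `mean_by` and `report` are all invented for the example. A real investigation would start from an open data portal, your organisation's data, or an online survey export.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical survey data, invented purely for illustration.
RAW_CSV = """year_group,travel_mode,minutes_to_school
S1,walk,12
S1,bus,25
S1,car,10
S2,walk,15
S2,bus,30
S2,car,8
S3,walk,20
S3,bus,35
"""


def load_rows(text):
    """Stage 1: obtain the data set (here, parsed from an in-memory CSV)."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        row["minutes_to_school"] = float(row["minutes_to_school"])
    return rows


def mean_by(rows, key, value):
    """Stage 3: explore by averaging a numeric column within each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return {name: mean(values) for name, values in groups.items()}


def report(question, summary):
    """Stage 5: generate a short plain-text report, shortest journey first."""
    lines = [f"Question: {question}"]
    for name, avg in sorted(summary.items(), key=lambda item: item[1]):
        lines.append(f"  {name}: {avg:.1f} minutes on average")
    return "\n".join(lines)


# Stage 2: define a research question (hypothetical).
QUESTION = "Which travel mode has the longest average journey to school?"

rows = load_rows(RAW_CSV)
summary = mean_by(rows, "travel_mode", "minutes_to_school")

# Stage 4: perform a simple analysis that answers the question.
slowest = max(summary, key=summary.get)

print(report(QUESTION, summary))
print(f"Longest average journey: {slowest}")
```

With a real data set, the exploration stage would usually involve several such groupings and some charts before you settle on which findings belong in the final report.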
