Data Science goes to school
Data science is covered in parts of the school curriculum in the UK, but it is generally scattered across several subjects. Basic statistics is covered as part of the Maths curriculum.
There are potential issues with statistics coverage in schools at the moment. For instance, let’s look at the National 5 Maths curriculum for Scotland (broadly equivalent to GCSE level in England).
Much of the content involves mechanical calculation of values from tiny data sets. These include measures of central tendency like:
- arithmetic mean (traditional average i.e. sum divided by count)
- mode (most frequently occurring value)
- median (middle value when the data is sorted in order)
There are also measures of spread like:
- range (maximum value minus minimum value)
- interquartile range (range of central 50% of values when data is sorted in order)
- variance (a more complex measure of spread, the average squared distance from the mean value)
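As a rough illustration of how little code these measures need, here is a minimal Python sketch using the standard library's statistics module (Python 3.8 or later). The data set is hypothetical, invented only for this example, and the sketch assumes the sample form of the variance (dividing by count minus one).

```python
import statistics

# Hypothetical data set, invented purely for illustration.
values = [2, 4, 4, 5, 7, 9, 11]

# Measures of central tendency
mean = statistics.mean(values)      # sum divided by count
mode = statistics.mode(values)      # most frequently occurring value
median = statistics.median(values)  # middle value of the sorted data

# Measures of spread
data_range = max(values) - min(values)
# With n=4, quantiles() returns the three cut points Q1, Q2 and Q3.
q1, _, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1
# Sample variance: sum of squared distances from the mean,
# divided by (count - 1). This choice is an assumption of the sketch.
variance = statistics.variance(values)

print(f"mean={mean}, mode={mode}, median={median}")
print(f"range={data_range}, IQR={iqr}, variance={variance}")
```

Quartile conventions differ between textbooks and software, so the interquartile range reported by a library may not match a hand calculation on every data set. Similarly, statistics.pvariance divides by the count rather than count minus one.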
For instance, here is a typical statistics question from a written exam paper:
The midday temperatures in Grantford were recorded over a nine day period. The temperatures, in degrees C, were:
4 7 4 3 6 10 9 5 3
Calculate the median and interquartile range for these temperatures.
If you like, you can post your answers in the discussion section!
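If you would like to check your answer with code, the sketch below mirrors one common classroom method: sort the data, find the median, then take the median of the lower half and of the upper half as the quartiles. The function name quartiles_by_halves is hypothetical, and the convention of leaving the middle value out of both halves when the count is odd is an assumption; your exam board may define quartiles differently.

```python
from statistics import median


def quartiles_by_halves(values):
    """Return (Q1, median, Q3) using the median-of-halves convention.

    Assumption: when the count is odd, the middle value is left out
    of both halves.
    """
    if len(values) < 2:
        raise ValueError("need at least two values")
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]   # values below the middle
    upper = ordered[-half:]  # values above the middle
    return median(lower), median(ordered), median(upper)


# Midday temperatures from the exam question, in degrees C.
temperatures = [4, 7, 4, 3, 6, 10, 9, 5, 3]

q1, q2, q3 = quartiles_by_halves(temperatures)
iqr = q3 - q1
print(f"median = {q2}, interquartile range = {iqr}")
```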
Students are also expected to be able to draw and interpret simple graphics including a histogram and a cumulative frequency diagram.
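The numbers behind both kinds of graphic can be produced in a few lines. The following sketch builds a frequency table with running totals, which is what a cumulative frequency diagram plots, and prints a rough text histogram alongside it. The data (number of siblings reported by twelve pupils) is hypothetical and invented for illustration.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical survey data: number of siblings for twelve pupils.
siblings = [0, 1, 1, 2, 2, 2, 3, 1, 0, 2, 4, 1]

counts = Counter(siblings)                  # frequency of each value
values = sorted(counts)                     # distinct values in order
frequencies = [counts[v] for v in values]
cumulative = list(accumulate(frequencies))  # running total of frequencies

print(f"{'value':>5} {'frequency':>9} {'cumulative':>10}")
for v, f, c in zip(values, frequencies, cumulative):
    # The row of '#' characters acts as a simple text histogram bar.
    print(f"{v:>5} {f:>9} {c:>10}  {'#' * f}")
```

The final cumulative total always equals the number of data points, which is a quick check that learners can apply to a hand-drawn table as well.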
School learners often do not have the opportunity to engage in higher-level thinking, such as drawing meaningful conclusions from large-scale data sets. This is partly because they are not trained to use the computational tools that make analysis of large, real-world data sets possible.
To carry out good data science, learners still need an understanding of the basic concepts, but we can use code to compute the summary values and draw relevant graphs. Then the learner can employ higher-level skills to perform analysis and interpretation.
However, the UK Data Service argues that the single most important factor in engaging learners with data science activities is the choice of data sets that we investigate. The data needs to be relevant to students and the world around them, providing valuable insight into contemporary issues in society.
© University of Glasgow