This content is taken from Coventry University's online course, Get ready for a Masters in Data Science and AI.

Sense checking sensational statistics

Thinking about the ‘scientist’ part of the data scientist’s skill set, we might say that a good scientist is objective, independent and curious. Good scientists also limit their confidence about the world to what the evidence supports, and are prepared to be proven wrong or to remain undecided.

Statistics is a great tool in the data scientist’s toolbox, but even before learning and applying its methods, we can develop some important habits of mind: common sense, clear thinking and the willingness to dig deeper.

It’s really useful to start developing these habits, or rather skills, by looking at data in the world around us. Every day, the news and media carry messages that claim scientific backing and promote changes in behaviour and habits, from which shampoo to buy to which political and social policies to support.

We can develop our critical statistical thinking by taking a closer look at the ‘science’ behind such messages and researching the data they are based on. We don’t need to be expert statisticians to get a feeling for how well the evidence supports the claims being made, as there are some key things to look out for:

  • Interpretation. Is the ‘high-level’ interpretation of the evidence (such as a news headline or summary) consistent with what the evidence itself is showing? For instance, does the research tentatively suggest a small effect, which is blown up to be a conclusive fact in the interpretation?

  • Representativeness. Has the data collection method (survey or sample) ensured that the result of the research is representative of the overall population that it applies to? Is the sample size large enough? A sample of around 100 is often quoted as a rough minimum, but the size actually needed depends on how precise the result has to be, how variable the data are, and the quality and reliability of the data source.

  • Alternative explanations. Could there be another way to account for the data that was observed/collected by the researchers but which they don’t seem to have factored in?

  • Replication. Has the effect been reproduced in different experiments and studies? Is the claim based on just a single study with a particular setup and location?

  • Independence. Has the research been carried out or funded by people with a vested interest in a positive or negative result?
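
The sample-size question raised under Representativeness can be made concrete with a short calculation. The following is a minimal sketch in Python, assuming a simple random sample, a yes/no survey question, a large population and the usual normal approximation for a proportion. The function names `margin_of_error` and `required_sample_size` are illustrative and are not taken from the course.

```python
import math

Z_95 = 1.96  # z-score for a 95% confidence level (normal approximation)


def margin_of_error(n, p=0.5, z=Z_95):
    """Approximate margin of error for a proportion p estimated from n responses.

    p=0.5 is the most cautious assumption because it gives the widest margin.
    """
    return z * math.sqrt(p * (1 - p) / n)


def required_sample_size(margin, p=0.5, z=Z_95):
    """Smallest n whose margin of error is at most `margin` (large population assumed)."""
    return math.ceil(z**2 * p * (1 - p) / margin**2)


print(f"n=100 gives a margin of about +/-{margin_of_error(100):.1%}")
print(f"A margin of +/-5% needs n={required_sample_size(0.05)}")
```

Under these assumptions, 100 responses pin down a percentage only to within roughly ten points either way, and a margin of five points needs close to 400 responses. A headline built on a small survey therefore deserves a second look, and none of this arithmetic helps if the sample was not representative in the first place.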

Further reading

Bock, T. (n.d.). How to calculate minimum sample size for a survey or experiment. Displayr. https://www.displayr.com/calculate-minimum-sample-size-survey-experiment/

Statistical Literacy. (n.d.). http://www.statlit.org/

Bisits Bullen, P. (2013, October 18). How to choose a sample size (for the statistically challenged). Tools4Devs. http://www.tools4dev.org/resources/how-to-choose-a-sample-size/
