How do we design good experiments?
Having access to lots of existing data does not necessarily mean that the data will answer the questions we want to ask.
Sometimes we will need to go out and collect more data, which can be expensive and time-consuming. This is why it’s important to have the right experimental design in place before you get started. A good design helps you collect the right data, and enough of it, to answer your questions and to avoid some of the possible problems outlined in previous steps.
There are some key stages in the experiment design process:
Have one or more clear and explicit hypotheses generated from a good understanding of the problem. This will help enormously in scoping your design and in turning stakeholders’ questions into testable experiments.
Think about what you are comparing. Often the aim is to assess the effect of a proposed decision or improvement, so the baseline or current situation is compared with the situation after the change. Also consider what else might affect the comparison. For example, could the time or place of data collection differ between the two groups, or could the people in the groups differ in important ways? You will need to factor this in.
Think about the best way to collect the data you need. It may be available from enterprise analytics systems, or you may need to conduct a survey or design a customised interface to collect it.
Ensure that your sampling – the data you do collect – is representative of the groups you want to say something about.
Think about how much data you will need to be in a position to differentiate any effect from randomness/chance. For example, if a coin flip gives us heads five times in a row, does it mean the coin is biased or is this just normal randomness?
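The coin question can be made concrete with a little arithmetic. The sketch below is illustrative only: the function name `prob_at_least` is hypothetical, and it assumes independent flips of a coin with a fixed chance of heads. It works out how likely a fair coin is to produce a given number of heads.

```python
from math import comb


def prob_at_least(k, n, p=0.5):
    """Chance of k or more heads in n independent flips, where p = P(heads).

    Hypothetical helper for illustration; it sums the binomial probabilities
    for k, k + 1, ..., n heads.
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Five heads out of five flips with a fair coin: 0.5 ** 5
five_in_a_row = prob_at_least(5, 5)

# Ten heads out of ten flips with a fair coin: 0.5 ** 10
ten_in_a_row = prob_at_least(10, 10)

print(f"5 heads in 5 flips:   {five_in_a_row:.5f}")
print(f"10 heads in 10 flips: {ten_in_a_row:.5f}")
```

A fair coin gives five heads in a row about 3 times in every 100 attempts, so five flips alone are weak evidence of bias. Ten heads in a row happens by chance roughly once in a thousand attempts, which is why collecting more data makes it easier to tell a real effect from randomness.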
A note on randomness. Randomness is really interesting. In practice, it often behaves in unexpected ways and probably explains a lot of findings that we mistakenly attribute to our own actions or skill. As humans, we have evolved senses that are tuned to recognise patterns quickly (eg spotting a poisonous snake in the grass), but this means that we can also be fooled into seeing patterns that are not there (ie it’s just a random patch of grass that looks a bit like a snake).
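One way to see how easily randomness produces apparent patterns is to simulate it. The following is a minimal sketch, not a definitive analysis: the function names, the choice of 100 flips, the streak length of 5 and the fixed seed are all assumptions made for illustration. It estimates how often a sequence of fair coin flips contains a streak of identical results.

```python
import random


def longest_run(flips):
    """Length of the longest streak of identical consecutive values."""
    longest = current = 0
    previous = None
    for flip in flips:
        # Extend the streak if this flip matches the last one, else restart it.
        current = current + 1 if flip == previous else 1
        previous = flip
        longest = max(longest, current)
    return longest


def share_with_streak(n_flips=100, streak=5, trials=10_000, seed=0):
    """Estimate the share of fair-coin sequences containing a long streak."""
    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    hits = 0
    for _ in range(trials):
        flips = [rng.random() < 0.5 for _ in range(n_flips)]
        if longest_run(flips) >= streak:
            hits += 1
    return hits / trials


print(share_with_streak())
```

Under these assumptions, the large majority of simulated sequences contain at least one streak of five or more identical results, even though every flip is fair. A streak that looks like a pattern is often just what chance produces.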
Spiegelhalter, D. (2019). The art of statistics. Pelican.
© Coventry University. CC BY-NC 4.0