Margin of error

What can we say about the error we make by using the relative frequency as an estimate of the probability?

We have seen that the observed frequencies are subject to change from sample to sample, due to variability and randomness of data (real or simulated). Nevertheless, for large samples, the relative frequency appears to get close to a deterministic number, which we interpret as the probability of the event of interest.

But there is a question: how close?

In other words, what can we say about the error we make by using the relative frequency as an estimate of the probability? How confident are we when making a prediction based on a sample? Is there any guarantee that the error is sufficiently small?

Thankfully, it turns out that the amount of variability in the results from one random sample to another random sample is quite predictable. This enables statisticians to quantify the likely error that may occur in estimation based on a random sample.

Example (Gallup’s poll on offshore drilling)

Source: Agresti, A., Franklin, C., Klingenberg, B. 2023. Statistics: The Art and Science of Learning from Data, Pearson. p. 34.

In 2011, Gallup’s annual environmental survey reported that 60% of Americans favoured offshore drilling as a means to reduce the US dependence on foreign oil (with a further 37% opposed to offshore drilling and the remaining 3% having no firm opinion). The poll was based on interviews with a random sample of 1,021 adults aged 18 and older living in the continental United States. The reported figure of 60% was the percentage observed in the sample, but Gallup claimed that, with high confidence, this estimate also applied to the entire US population.

To account for the sampling variability that could make such a prediction inaccurate, the Gallup report said that the margin of error was “plus or minus 3%.” This statement means that the percentage of Americans supporting offshore drilling was very likely to be somewhere between 60 − 3 = 57% and 60 + 3 = 63%. The phrase “very likely” refers to a high level of confidence in the prediction, usually about 95%, which means that such statements are correct about 95 times out of 100. This range of plausible values is referred to as a 95% confidence interval for the estimated proportion. To summarise, the margin of error is an important concept:

Definition (margin of error)

The margin of error (MOE) is a measure of the variability of an estimate from sample to sample. It is used to express the accuracy of an estimate, i.e. to assess how close we expect it to fall to the true value.

But how can we quantitatively evaluate the margin of error?

To be specific, suppose we are trying to estimate a centrality parameter \(\mu\) using the sample mean \(\bar{x}\) from a sample of size \(n\). Then the MOE is a bound such that the inequality \(|\bar{x}-\mu| \le \mathsf{MOE}\) holds with high confidence (e.g. 95%).

It is natural to expect that the margin of error is closely related to the standard deviation \(s_x\) of the sample: if \(s_x\) is large then the margin of error is likely to be large, and vice versa.

It is also clear that the margin of error would be larger if we wanted a higher level of confidence, e.g. 99% instead of 95%.

What is remarkable, and less intuitive, is that the margin of error is inversely proportional to the square root of the sample size.
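
A small simulation can make the square-root effect visible. The sketch below is an illustration only, not part of the course materials: it draws repeated samples from a uniform distribution on [0, 1] (an arbitrary choice) and measures how much the sample mean varies from sample to sample for three sample sizes. The helper name `spread_of_means`, the number of repetitions and the seed are assumptions made for this example.

```r
set.seed(42)  # arbitrary seed, chosen only for reproducibility

# Hypothetical helper: standard deviation of the sample mean
# across `reps` independent samples of size n.
spread_of_means <- function(n, reps = 2000) {
  means <- replicate(reps, mean(runif(n)))
  sd(means)
}

sizes <- c(100, 400, 1600)
spreads <- sapply(sizes, spread_of_means)

# Each time n is multiplied by 4, the spread should roughly halve.
print(data.frame(n = sizes, spread = spreads))
print(spreads[1] / spreads[2])
print(spreads[2] / spreads[3])
```

Both printed ratios should come out close to 2, in line with the spread shrinking like one over the square root of the sample size.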

The practical rule to evaluate the margin of error of the estimator is analogous to the Empirical Rule discussed in Shape of data distribution:

The margin of error to guarantee a \(P\%\) confidence in prediction is given by:

\(\displaystyle \mathsf{MOE} = \frac{s_x}{\sqrt{n}} \quad (P\% = 68\%)\)

\(\displaystyle \mathsf{MOE} = \frac{2\,s_x}{\sqrt{n}} \quad (P\% = 95\%)\)

\(\displaystyle \mathsf{MOE} = \frac{3\,s_x}{\sqrt{n}} \quad (P\% \approx 100\%)\)
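
As a minimal sketch of this rule in R, the hypothetical helper `moe_mean()` below (not part of the course materials) computes the margin of error from a numeric sample. The multiplier `k` is 1, 2 or 3 for roughly 68%, 95% and nearly 100% confidence. The data vector is made up purely to show the call.

```r
# Hypothetical helper (an illustration, not from the course materials):
# margin of error for a sample mean, MOE = k * s_x / sqrt(n).
moe_mean <- function(x, k = 2) {
  n <- length(x)   # sample size
  s_x <- sd(x)     # sample standard deviation
  k * s_x / sqrt(n)
}

# Made-up sample, used only for illustration
x <- c(4.1, 5.3, 4.8, 5.9, 5.0, 4.6, 5.4, 5.1)

moe68  <- moe_mean(x, k = 1)  # roughly 68% confidence
moe95  <- moe_mean(x, k = 2)  # roughly 95% confidence
moe100 <- moe_mean(x, k = 3)  # nearly 100% confidence

print(c(moe68, moe95, moe100))

# Interval of plausible values for the mean at about 95% confidence
print(c(lower = mean(x) - moe95, upper = mean(x) + moe95))
```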

Specialised to the case of estimating a proportion (probability) \(p\) using the relative frequency \(\hat{p}\), it can be shown that the sample variance is estimated as

\(\fbox{$\,\displaystyle s_x^2 = \hat{p}\left(1-\hat{p}\right)\,$}\)
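
To see where this comes from, code each observation as \(x_i = 1\) if the event occurs and \(x_i = 0\) otherwise, so that the sample mean \(\bar{x}\) equals the relative frequency \(\hat{p}\). A short derivation follows, assuming the variance is computed with divisor \(n\); with divisor \(n-1\) the result is multiplied by \(n/(n-1)\), which is close to 1 for large samples.

```latex
\begin{aligned}
s_x^2 &= \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2
       = \frac{1}{n}\sum_{i=1}^{n} x_i^2 - \bar{x}^2 \\
      &= \hat{p} - \hat{p}^2
         \qquad (\text{since } x_i^2 = x_i \text{ when } x_i \in \{0, 1\}) \\
      &= \hat{p}\,(1 - \hat{p}).
\end{aligned}
```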

Thus, in this situation, the MOE rule reads as follows:

If the proportion (probability) \(p\) is estimated by the relative frequency \(\hat{p}\) observed in a sample of size \(n\), then the margin of error to guarantee a \(P\%\) confidence is given by:

\(\displaystyle \mathsf{MOE} = \sqrt{\frac{\hat{p}\,(1-\hat{p})}{n}} \quad (P\% = 68\%)\)

\(\displaystyle \mathsf{MOE} = 2\,\sqrt{\frac{\hat{p}\,(1-\hat{p})}{n}} \quad (P\% = 95\%)\)

\(\displaystyle \mathsf{MOE} = 3\,\sqrt{\frac{\hat{p}\,(1-\hat{p})}{n}} \quad (P\% \approx 100\%)\)

Applying the latter rule to the Gallup report at a 95% confidence, we obtain

phat <- 0.6
n <- 1021 
MOE <- 2*sqrt(phat*(1-phat)/n) 
print(MOE) 
## 0.03066357 

We see that our rule reproduces the 3% MOE reported by Gallup.

For comparison, the MOE at nearly 100% confidence is about 4.6%:

MOE <- 3*sqrt(phat*(1-phat)/n)
print(MOE)
## 0.04599536

Thus, we can be almost certain that the true proportion lies between 55.4% and 64.6%.
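
The claim that 95% statements are correct about 95 times out of 100 can be checked by simulation. The sketch below is a rough illustration under stated assumptions: it pretends the true proportion is exactly 0.6 (which we cannot know for the real poll), repeats a poll of 1,021 respondents many times, and counts how often the true value falls within the 95% margin of error. The number of repetitions and the seed are arbitrary.

```r
set.seed(123)    # arbitrary seed, chosen only for reproducibility

p_true <- 0.6    # assumed "true" proportion (hypothetical)
n <- 1021        # sample size, as in the Gallup poll
reps <- 10000    # number of simulated polls

# Relative frequency observed in each simulated poll
phat_sim <- rbinom(reps, size = n, prob = p_true) / n

# 95% margin of error computed from each simulated poll
moe_sim <- 2 * sqrt(phat_sim * (1 - phat_sim) / n)

# Does the interval phat +/- MOE contain the true proportion?
covered <- abs(phat_sim - p_true) <= moe_sim

coverage <- mean(covered)
print(coverage)
```

The printed share of intervals containing the true proportion should be close to 0.95.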

As an exercise, you can now calculate the MOE for our simulations in the previous sections.
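
One possible way to set up such a calculation is sketched below for simulated dice rolls. The number of rolls, the seed and the variable names are assumptions made for this example and may differ from the simulations in the previous sections.

```r
set.seed(1)   # arbitrary seed, chosen only for reproducibility

n_rolls <- 1000                                     # number of simulated rolls
rolls <- sample(1:6, size = n_rolls, replace = TRUE)

# Relative frequency of a 6, used as the estimate of the probability
phat_six <- mean(rolls == 6)

# Margin of error at about 95% confidence
moe_six <- 2 * sqrt(phat_six * (1 - phat_six) / n_rolls)

print(c(estimate = phat_six,
        lower = phat_six - moe_six,
        upper = phat_six + moe_six))

# For a fair die the true probability is 1/6; the interval is expected
# to contain it in roughly 95 out of 100 repetitions of the experiment.
```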

Next steps

This activity prepared you for the exercises you will complete this week in RStudio. In the next activity, you will work in RStudio to carry out simulated random experiments to observe the stability of frequencies of various events of interest, such as the frequency of 6 in dice rolls, or series of heads in flipping a fair coin.

This article is from the free online

Statistical Methods

Created by
FutureLearn - Learning For Life
