Skip main navigation

How to Use a Spreadsheet to Model a Probability Experiment

This article provides a step-by-step guide on building a spreadsheet to model a probability experiment.
A coin being flipped.

The spreadsheet is built up in seven stages, corresponding to the seven individual sheets in the file. Each one of these builds on the stage before. Each of the stages in building the spreadsheet is described below. You can follow the steps to see how the spreadsheet is built up, or simply make use of the final version (Stage 7).

Stage 1

We start with perhaps the most important feature of the model – the ability to produce convincingly random data. Our spreadsheet contains a single formula in cell A1 at this stage:

=RANDBETWEEN(0,1)

 

In this very simple model, each trial needs to produce either Heads of Tails, each with probability 0.5. The formula that we have entered will randomly choose either 0 or 1 and place the result in A1. This is not quite the ‘Heads or Tails’ that we are looking for, but it is a good start.

 

Check that you can force the spreadsheet to recalculate, and display another random value; Function-F9 will (normally) recalculate an Excel spreadsheet, but if this does not work then entering or changing the value in any unused cell will force recalculation.

 

Stage 2

 

At this stage, we will fix the issue with the spreadsheet generating 0s and 1s instead of Heads and Tails. Note that this is essentially a cosmetic change as we could quite easily work with digits; but it is quite easy and very useful to be able to substitute textual values for numbers, and we will go ahead and do it now.

 

The action here takes place in cell B1, where we have entered the formula:

 

= IF(A1=1, “H”, “T”)

 

The general form of the IF() function is:

 

=IF(logical test, result if test is true, result if test is false)

 

In this case, the formula means:

 

 

    • Look at the value in cell A1. Is it equal to 1? 

 

    • If it is equal to 1, the result is “H”. 

 

    • If the value in A1 is not equal to 1, the result is “T” 

 

 

Try recalculating the spreadsheet a few times now. You should find that as the value in A1 changes between 0 and 1, the value in B1 changes between T and H.

 

Stage 3

 

So far, we have been simulating an experiment with one trial – flipping a single coin. In the question that inspired this simulation, 200 coin flips took place; this stage of the simulation takes this into account.

 

Scroll down the spreadsheet – you will see that the random digits and H/T results go all the way down to row 200. Of course, we did not enter each of the formulae an additional 199 times! Instead we simply selected cells A1 and B1, and filled them down to row 200.

 

 

One way to achieve this in Excel is to drag from cell A1 all the way to cell B200, so that the range A1:B200 is selected; then choose Fill > Down from the Edit menu.
A quicker option is to select cells A1 and B1, and simply drag the square ‘Fill handle’ that appears in the bottom-right of B1 down to B200.

 

 

We now have the 200 random results that we wanted; the next stage is to find a way to analyse the data.

 

Stage 4

 

Our analysis starts by counting the number of Hs in the range B1:B200.

 

To keep things simple, we enter the value that we are looking for (‘H’) in cell C1.

 

Now we enter a formula in cell C2:

 

= COUNTIF(B1:B200, C1)

 

 

    • This formula can be thought of as meaning ‘Count how many cells there are in the range B1 to B200, where the value stored in the cell is the same as that stored in cell C1’

 

 

Again, recalculate the spreadsheet a few times to check that the formula is correctly counting the number of Hs in column B.

 

Stage 5

 

We now need to find the number of Tails. There are a number of ways we could do this, but we have chosen a method that illustrates a powerful feature of spreadsheets – the use of absolute and relative referencing.

 

To see what this means, click on any of the cells in column B. For example cell B9 contains the formula:

 

= IF(A9=1, “H”, “T”)

 

Notice that we never typed this formula anywhere; the formula we entered in cell B1, back in Stage 2, was:

 

= IF(A1=1, “H”, “T”)

 

Our spreadsheet was smart enough to realise that when we copied the formula down the column, what we really meant was something like ‘if the cell one place to the left contains a 1, put a H in this cell; otherwise, put a 0’. This is relative addressing, and it is very often what we want to happen when we copy a formula from one place to another.

 

In this stage (stage 5), we are trying to find the number of Tails in column B. We could enter ’T’ in cell D1, and copy the formula in cell C2 over to D2; but this would not work. The problem is that the range that would be searched for Ts would automatically be changed, thanks to the spreadsheet’s default relative addressing, to C1:C200 when it is copied across one column.

 

Fortunately, we can prevent this by telling the spreadsheet that we always want to inspect values in column B, no matter how far across we copy the formula. To do this, we change the formula in C2 to:

 

= COUNTIF($B1:$B200, C1)

 

The Dollar signs in front of the Bs effectively ‘pin’ the range to column B. We can now copy the formula across to D2. Notice that the copied formula becomes:

 

= COUNTIF($B1:$B200, D1)

 

The pinned Bs have not changed, but C1 has been updated to D1. The formula will look in the same place, but it will be looking for Ts instead of Hs.

 

Stage 6

 

We would like to be able to change the number of trials.

 

We enter the number of trials that we would like to use in cell F2.

 

Next we need to find a way of telling the spreadsheet which range to look for Hs and Ts in, based on the number of trials in the experiment. Cell G2 contains this formula:

 

= “B1:B” & F2

 

Try changing the value in F2. You will see that the contents of cell G2 now indicate the range that the spreadsheet should be looking for Hs and Ts in.

 

Look at the formula in cell C2. This is now:

 

=COUNTIF(INDIRECT($G2), C1)

The INDIRECT() function takes a piece of text and interprets it (if possible) as a reference to a range of cells.In this example, the formula uses the range specified in cell G2 as the range to search for Hs and Ts in.

Try changing the number of trials by varying the value in cell F2.

Stage 7

To complete our spreadsheet model, we make a number of changes:

  • We copy the formulas in columns A and B down to row 1000 so that we can explore larger numbers of trials.
  • We calculate the relative frequencies of Heads and Tails and display them as a pie chart.
  • We do some tidying up and formatting.
This article is from the free online

Teaching Probability

Created by
FutureLearn - Learning For Life

Reach your personal and professional goals

Unlock access to hundreds of expert online courses and degrees from top universities and educators to gain accredited qualifications and professional CV-building certificates.

Join over 18 million learners to launch, switch or build upon your career, all at your own pace, across a wide range of topic areas.

Start Learning now