On a computer, a variable corresponds to some piece of information that we need to store in memory. In the traffic model, for example, we need to store all the cells in the old road (the state from the previous step) and the new road (the current state), along with other quantities that we calculate, such as the number of cars that move or the density of cars. All of these are variables: they take different values throughout the calculation.
Remember that the shared memory architecture (many CPU-cores connected to the same piece of memory) is like several office mates sharing a whiteboard. In this model, we have two choices as to where we store any variables:
- shared variables: accessible by everyone in the office
- private variables: can only be accessed by the person who owns them
A shared variable corresponds to writing the value on the whiteboard so that everyone in the office can read or modify it. You can think of private variables as being stored on a personal notepad that can only be seen by its owner.
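As a rough illustration of the difference, here is a minimal sketch using Python's standard `threading` module. The choice of Python and the names `road`, `cars_per_thread` and `count_cars` are assumptions made for this example, not part of the traffic model itself. Variables defined outside the function are shared by all threads, while variables created inside the function are private to the thread that runs it.

```python
import threading

# Shared variable: one list that every thread can read (the "whiteboard").
road = [1, 0, 1, 1, 0, 0, 1, 0]
# Shared results: one slot per thread, so no two threads write to the same place.
cars_per_thread = [0, 0]

def count_cars(thread_id, start, end):
    # Private variable: each thread gets its own "count" (its "notepad").
    count = 0
    for i in range(start, end):
        count += road[i]
    # Copy the private result into this thread's own slot on the whiteboard.
    cars_per_thread[thread_id] = count

threads = [
    threading.Thread(target=count_cars, args=(0, 0, 4)),
    threading.Thread(target=count_cars, args=(1, 4, 8)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

total_cars = sum(cars_per_thread)
print(cars_per_thread, total_cars)  # [3, 1] 4
```

Because each thread only reads the shared road and writes to its own slot, the two threads never interfere with each other.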
Although writing everything on the whiteboard for all to see might seem like a good idea, it is important to ensure that the office mates do not interfere with each other’s calculations. If you are working on the cells for a section of road, you do not want someone else changing the values without you knowing about it. It is crucial to divide up the work so that the individual tasks are independent of each other (if possible) and to make sure that workers coordinate whenever there is a chance that they might interfere with each other.
In the shared-variables model, the workers are often referred to as threads.
Things to consider
When parallelising a calculation in the shared-variables model, the most important questions are:
- which variables are shared (stored on the whiteboard) and which are private (written in your own notepad);
- how to divide up the calculation between workers;
- how to ensure that, when workers need to coordinate with each other, they do so correctly;
- how to minimise the number of times workers must coordinate with each other.
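One simple way to divide up the calculation is to give each worker a contiguous block of cells. The sketch below shows one possible scheme; the helper name `split_range` is hypothetical, and other ways of dividing the work are equally valid.

```python
def split_range(n_cells, n_workers):
    """Return (start, end) index pairs giving each worker a contiguous block."""
    base, extra = divmod(n_cells, n_workers)
    bounds = []
    start = 0
    for worker in range(n_workers):
        # The first "extra" workers take one additional cell each.
        size = base + (1 if worker < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds

chunks = split_range(10, 3)
print(chunks)  # [(0, 4), (4, 7), (7, 10)]
```

Each cell belongs to exactly one worker, so workers can update their own blocks without touching anyone else's.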
The most basic methods of coordination are:
- master region: certain calculations are carried out by only one of the workers, a nominated boss worker;
- barrier: everybody waits until all workers have reached a certain point in the calculation; when everyone has reached that point, workers can then proceed;
- locking: if you are working with a variable and don’t want anyone else to touch it, you can lock it. This means that only one worker can access the variable at a time: if the variable is locked by someone else, you have to wait until they unlock it. On a shared whiteboard you could imagine circling a variable to show everyone else that you have it locked, then erasing the circle when you are finished.
Clearly, all of these have the potential to slow things down, as they can lead to workers waiting around for others to finish, so you should try to do as little coordination as possible (while still ensuring that you get the correct result!).
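The three methods can be sketched together in a few lines. This example assumes Python's `threading` module, with `threading.Lock` standing in for locking, `threading.Barrier` for the barrier, and a simple test on the worker number standing in for a master region. The numbers in `moved` are made up for illustration.

```python
import threading

N_WORKERS = 4
barrier = threading.Barrier(N_WORKERS)  # everybody waits here
lock = threading.Lock()                 # only one worker at a time
total_moved = 0                         # shared variable on the "whiteboard"
report = []                             # written only by the boss worker

def worker(worker_id, moved_in_my_section):
    global total_moved
    # Locking: only one worker at a time may update the shared total.
    with lock:
        total_moved += moved_in_my_section
    # Barrier: nobody continues until every worker has added its contribution.
    barrier.wait()
    # Master region: only worker 0 records the final result.
    if worker_id == 0:
        report.append(total_moved)

moved = [3, 1, 4, 2]  # illustrative number of cars moved in each section
threads = [
    threading.Thread(target=worker, args=(i, moved[i]))
    for i in range(N_WORKERS)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(report)  # [10]
```

Without the barrier, the boss worker might read the total before the other workers had finished adding to it.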
Adding to a Variable
One of our basic operations is to increment a variable, for example to add up the total number of cars that move in each iteration. It may not be obvious but, on a computer, adding one to a variable is not a single operation. Using the whiteboard analogy, it has the following stages:
- take a copy of the value on the whiteboard and write it in your notepad (load a value from memory into a register);
- add one to the value on your notepad (issue an increment instruction on the register);
- copy the new value back to the whiteboard (store the new value from the register to memory).
In the shared-variables model, the problem occurs if two or more workers try to do this at the same time: if one worker takes a copy of the variable while another worker is still modifying it on their notepad, both will write back a value based on the same old number, so one of the additions is lost and you will not get the correct answer. Sometimes you might be lucky and no one else modifies the variable while you are working on your notepad, but there is no guarantee.
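To make the unlucky ordering concrete, the sketch below writes out the load, add and store stages by hand for two workers. It is a single-threaded simulation of one possible interleaving, not real parallel code, and the names `whiteboard`, `notepad_a` and `notepad_b` are purely illustrative.

```python
whiteboard = {"moved": 0}  # the shared variable

# Both workers load the value before either has stored a result.
notepad_a = whiteboard["moved"]  # worker A loads 0
notepad_b = whiteboard["moved"]  # worker B loads 0

# Each worker adds one on their own notepad.
notepad_a += 1  # A now has 1
notepad_b += 1  # B now has 1

# Each worker stores their result back to the whiteboard.
whiteboard["moved"] = notepad_a  # A stores 1
whiteboard["moved"] = notepad_b  # B stores 1, overwriting A's update

print(whiteboard["moved"])  # prints 1, although two increments were made
```

Two additions were carried out, but the value on the whiteboard only increased by one.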
This situation is called a race condition and is a disaster for parallel programming: sometimes you get the right answer, but sometimes the wrong answer. To fix this you need to coordinate the actions of the workers, for example using locking as described above.
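As a minimal sketch of the locking fix, again assuming Python's `threading` module, each worker below takes a lock before updating the shared counter, so the load, add and store stages of one worker can never overlap with those of another. The function name `add_many` and the loop counts are arbitrary choices for this example.

```python
import threading

counter = 0              # shared variable
lock = threading.Lock()  # the "circle" drawn around the variable

def add_many(n):
    global counter
    for _ in range(n):
        # Only one thread at a time can hold the lock, so each increment
        # completes before another thread can start its own.
        with lock:
            counter += 1

threads = [threading.Thread(target=add_many, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000
```

The result is now always correct, at the cost of workers sometimes having to wait for the lock.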
Can you think of an example of a race condition in your everyday life? What strategies do we use to prevent them?
Share and discuss your ideas with your fellow learners!
© EPCC at The University of Edinburgh