
# Scatter and Gather

In this subsection we will learn how to distribute data among processes using the scatter and gather functions.

### Scatter

As we saw with the broadcast function, the root process sends the same data to every other process. In many applications, however, we have some data that we would like, as the word says, to scatter among the processes. This means we divide the data into equal parts, so that each process in our communicator receives exactly one of them, i.e., only a fraction of the whole. This is the main difference between scatter and broadcast. We will see in the exercises further on where this is useful.

This is the function prototype:

```c
MPI_Scatter(void *sendbuf, int sendcount, MPI_Datatype sendtype,
            void *recvbuf, int recvcount, MPI_Datatype recvtype,
            int root, MPI_Comm comm)
```

The function prototype is similar to broadcast, but we will go through the arguments because there are some parts we need to be careful with. As usual, we first have to specify the data, and this is the sendbuf buffer. In the example shown in the figure, the root process is the one with rank 1, which would like to scatter an array of five numbers to all the other processes; to do this, it needs to specify this sendbuf. Next comes sendcount, and a bit later we will see recvcount; usually they are the same. This is the number that tells you how many elements will be sent to each process. It is important to note that this is not the total number of elements sent, but only the fraction that each process receives. The next argument is recvbuf, the buffer in which each process receives its part of the data. Finally, root is the same as in broadcast: it is the process that actually does the scattering, and comm indicates the communicator in which the processes reside. The only thing we really need to be careful with in this function is therefore sendcount and recvcount, because they dictate how many elements go to each process, not how many elements there are in total. Another important thing to note is that when this function finishes, the sender (in our example the process with rank 1) does not keep the whole data. In our example, this means that after the communication rank 1 has only its own part of the data, i.e., B.

Image courtesy: Rolf Rabenseifner (HLRS)

The difference between MPI_Bcast and MPI_Scatter is that MPI_Bcast sends the same piece of data to all processes, whereas MPI_Scatter sends a different chunk of the data to each process.

### Gather

After the data is scattered, quite obviously, the information would at some point need to be, as this function suggests, gathered. Gather is the inverse of scatter: it quite literally gathers all the information back to the root process. As we will see, the basic idea in many MPI applications is that we have some data, we scatter it so that every process computes something, and then we gather the results back together in one process. The function prototype is quite similar to MPI_Scatter:

```c
MPI_Gather(void *sendbuf, int sendcount, MPI_Datatype sendtype,
           void *recvbuf, int recvcount, MPI_Datatype recvtype,
           int root, MPI_Comm comm)
```

The main difference here is that only one process, i.e., the root, gathers all the information, so it is the only one that needs a valid receive buffer. All the other calling processes can pass NULL for recvbuf, since they only send data to the root and receive nothing. Finally, note once again that the recvcount parameter is the number of elements received per process, not the total across all processes!

Image courtesy: Rolf Rabenseifner (HLRS)