One-sided communication

How can we take advantage of the fact that parallelisation in MPI is based on distributed memory? We will learn how to do that in this subsection.

As we have already learnt, parallelisation in MPI is based on distributed memory. This means that when we run a program on several cores, each process has its own private memory. Since the memory is private to each process, we send messages to exchange data from one process to another.

In the two-sided (i.e. point-to-point) and collective communication models the problem is that both the sender and the receiver have to participate in the data exchange explicitly (even with the non-blocking routines), which requires synchronization.

With a blocking exchange, from the moment we call MPI_Send until the message has been received by MPI_Recv, there is a period in which both processes have to wait and cannot do anything else. A significant drawback of this approach is therefore that the sender has to wait for the receiver to be ready to receive the data before it can send it, or vice versa. This causes idle time. To avoid this we use one-sided communication.
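To illustrate the idle time, here is a minimal sketch (not taken from the article) of such a blocking exchange between two processes. The variable names and the value 42 are chosen only for illustration, and the program assumes it is run with at least two processes.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, data = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        data = 42;
        /* blocks until the message can be handed over,
           possibly until the matching receive is posted */
        MPI_Send(&data, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* idles until the message from rank 0 has arrived */
        MPI_Recv(&data, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("Rank 1 received %d\n", data);
    }

    MPI_Finalize();
    return 0;
}

While rank 0 is inside MPI_Send and rank 1 is inside MPI_Recv, neither process can do any useful computation; one-sided communication removes this constraint on the target side.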

Although MPI uses a distributed memory approach, the MPI standard introduced Remote Memory Access (RMA) routines, also called one-sided communication because only one process is required to transfer the data. Simply put, it enables a process to access data in the memory of other processes. The idea is that a process can have direct access to the memory address space of a remote process without the intervention of that remote process.

So we do not have to call the send and receive routines explicitly in both processes involved in the communication. One process can simply put data into, or get data from, the memory of another process. This is helpful because the target process can continue executing its tasks without waiting for anything. The most important benefit of one-sided communication is that while a process puts or gets data from a remote process, the remote process can continue to compute instead of waiting for the data. This reduces communication time and can resolve some scalability problems (e.g. on thousands of MPI processes).

In order to allow other processes to access its memory, a process has to explicitly expose its own memory to them. This means that for the origin process to access memory in the target process, the target process has to allow that memory to be accessed and used. It does this by declaring a shared memory region, also called a window. This window is the region of memory that is made available to all the other processes in the communicator, allowing them to put and get data from its memory. The window is created by calling the function

MPI_Win_create (void *base, MPI_Aint size, int disp_unit, MPI_Info info, MPI_Comm comm, MPI_Win *win);

The arguments of this function are as follows:

  • base is the pointer to the local data to expose, i.e., the data we want to give access to.
  • size denotes the size of the local data in bytes.
  • disp_unit is the local unit size for displacements, in bytes (typically the size of one element).
  • info is the information argument. Most often we use MPI_INFO_NULL.
  • comm is the communicator that we know from all the previous functions.
  • win represents the window object.

And at the end of the MPI application we have to free this window with the function

MPI_Win_free (MPI_Win *win);

With these functions we create a window around the memory that should be accessible to others. That is also why, at the end, we have to call MPI_Win_free to free this window.

To understand this better, let’s go through a classic example.

MPI_Win win;
int shared_buffer[NUM_ELEMENTS];
MPI_Win_create(shared_buffer, NUM_ELEMENTS * sizeof(int), sizeof(int), MPI_INFO_NULL, MPI_COMM_WORLD, &win);
...
MPI_Win_free(&win);

So here we define a window handle win of type MPI_Win. Then we define some storage, statically or through dynamic allocation. Using this buffer we then create the window. In MPI_Win_create you can see that we would like to share the shared_buffer buffer. The size is NUM_ELEMENTS * sizeof(int) bytes, and since each element is an int, the displacement unit is sizeof(int), typically 4 bytes. The information argument is MPI_INFO_NULL and the communicator, as always, is MPI_COMM_WORLD. Once this call has been made, the shared buffer can be accessed by all the processes through the MPI_Put and MPI_Get routines. Of course, at the end of the application we free the win window.

MPI_Put and MPI_Get

To access the data we need the two routines we talked about earlier, MPI_Put and MPI_Get. The MPI_Put operation is equivalent to a send by the origin process and a matching receive by the target process. Let’s look at the prototypes of these functions, which have quite a few arguments.

MPI_Put (void *origin_addr, int origin_count, MPI_Datatype origin_datatype, int target_rank, MPI_Aint target_disp, int target_count, MPI_Datatype target_datatype, MPI_Win win);

MPI_Get is similar to the put operation, except that data is transferred from the target memory to the origin process. The prototype of this function looks like this

MPI_Get (void *origin_addr, int origin_count, MPI_Datatype origin_datatype, int target_rank, MPI_Aint target_disp, int target_count, MPI_Datatype target_datatype, MPI_Win win);

We will look at the arguments of these functions in depth in the following exercise. But before we get to that, another important thing we need to discuss is synchronization. If you remember, we discussed this concept briefly in the second week when we were learning about OpenMP. In one-sided communication in MPI, the target process creates the window in order to give other processes access to its memory. However, if multiple processes try to access this data simultaneously, problems can arise. For example, let’s say two processes access the window at the same time to put data using MPI_Put. This is clearly a race condition that needs to be avoided. This is where synchronization comes into play. So, in order to avoid this, before and after each one-sided communication call, i.e., MPI_Get and MPI_Put, we need to use the function

MPI_Win_fence (int assert, MPI_Win win);

The assert argument can be used to pass optimisation hints; passing 0 is always valid. This function synchronizes access to the window: the fence calls divide the accesses into epochs, so different processes can access the window, but their accesses cannot overlap in an uncontrolled way, and all operations started in one epoch are complete when the next fence returns. It is therefore important that the one-sided function calls are surrounded by these fence calls.
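To see how the pieces fit together, here is a minimal sketch (not part of the original article) that combines MPI_Win_create, MPI_Win_fence, MPI_Put, MPI_Get and MPI_Win_free. The buffer names, the value 42 and the choice of ranks are only for illustration, and the program assumes it is run with at least two processes.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank;
    int window_buffer = 0;        /* the memory exposed through the window */
    int value = 0, check = 0;
    MPI_Win win;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* every process exposes one int through the window */
    MPI_Win_create(&window_buffer, sizeof(int), sizeof(int),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    MPI_Win_fence(0, win);                       /* open the first epoch */
    if (rank == 0) {
        value = 42;
        /* put one int into the window of rank 1, at displacement 0 */
        MPI_Put(&value, 1, MPI_INT, 1, 0, 1, MPI_INT, win);
    }
    MPI_Win_fence(0, win);                       /* the put is now complete */

    if (rank == 1)
        printf("Rank 1 has %d in its window\n", window_buffer);

    if (rank == 0)
        /* read the value back from the window of rank 1 */
        MPI_Get(&check, 1, MPI_INT, 1, 0, 1, MPI_INT, win);
    MPI_Win_fence(0, win);                       /* the get is now complete */

    if (rank == 0)
        printf("Rank 0 read %d back from rank 1\n", check);

    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}

Note that MPI_Win_fence is collective over the communicator used to create the window, so every process calls it, even those that do not put or get anything in that epoch.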
