Message Passing

Message Passing

That something else is message passing. This method of interprocess communication uses two primitives, send and receive, which, like semaphores and unlike monitors, are system calls rather than language constructs. As such, they can easily be put into library procedures, such as

send(destination, &message);
and
receive(source, &message);

The former call sends a message to a given destination and the latter one receives a message from a given source (or from ANY, if the  receiver does not care). If no message is available, the receiver can block until one arrives. On the other hand, it can return instantly with an  error code.

Design Issues for Message-Passing Systems

Message passing systems have many challenging problems and design issues that do not occur with semaphores or with monitors,  particularly if the communicating processes are on different machines connected by a network. For instance, messages can be lost by the network. To guard against lost messages, sender and receiver can agree that as soon as a message has  been received, the receiver will send back a special acknowledgement message. If the sender has not received the acknowledgement  within a certain time interval, it retransmits the message.

A solution to the producer-consumer problem in Java

Now imagine what happens if the message is received properly, but the acknowledgement back to the sender is lost. The sender will  retransmit the message, so the receiver will get it twice. It is necessary that the receiver be able to distinguish a new message from the  retransmission of an old one. Generally, this problem is solved by putting consecutive sequence numbers in each original message. If the  receiver gets a message bearing the same sequence number as the previous message, it knows that the message is a duplicate that can  be ignored. Successfully communicating in the face of undependable message passing is a major part of the study of computer networks. For more information, see (Tanenbaum, 1996).

Message systems also have to deal with the question of how processes are named, so that the process specified in a send or receive call is  clear. Authentication is also an issue in message systems: how can the client tell that it is communicating with the real file server, and not with an imposter?

On the other hand, there are also design issues that are important when the sender and receiver are on the same machine. One of these is performance. Copying messages from one process to another is always slower than doing a semaphore operation or entering a monitor. Much work has gone into making message passing efficient. Cheriton (1984), for instance, proposed limiting message size to what will fit in the machine's registers, and then doing message passing using the registers.

The Producer-Consumer Problem with Message Passing

Let us examine how the producer-consumer problem can be solved with message passing and no shared memory. A solution is given in Figure 2. We consider that all messages are the same size and that messages sent but not yet received are buffered automatically by the  operating system. In this solution, a total of N messages is used, similar to the N slots in a shared-memory buffer. The consumer starts  out by sending N empty messages to the producer. Whenever the producer has an item to give to the consumer, it takes an empty  message and sends back a full one. Thus, the total number of messages in the system remains constant in time, so they can be  stored in a given amount of memory known in advance.

If the producer works faster than the consumer, all the messages will end up full, waiting for the consumer; the producer will be blocked,  waiting for an empty to come back. If the consumer works faster, then the reverse happens: all the messages will be empties waiting for the  producer to fill them up; the consumer will be blocked, waiting for a full message.

The producer-consumer problem with N messages

A lot of variants are possible with message passing. For beginners, let us consider how messages are addressed. One way is to allocate each process a unique address and have messages be addressed to processes. An other way is to invent a new data structure, called a mailbox. A mailbox is a place to buffer a certain number of messages, normally specified when the mailbox is created. When mailboxes are used, the address parameters in  the send and receive calls are mailboxes, not processes. When a process tries to send to a mailbox that is full, it is suspended until a message is deleted from that mailbox, making room for a new one.

For the producer-consumer issue, both the producer and consumer would create mailboxes large enough to hold N messages. The  producer would send messages containing actual data to the consumer's mailbox, and the consumer would send empty messages to the  producer's mailbox. When mailboxes are used, the buffering mechanism is clear: the destination mailbox holds messages that have been  sent to the destination process but have not yet been accepted.

The other extreme from having mailboxes is to eradicate all buffering. When this approach is followed, if the send is done before the receive,  the sending process is blocked until the receive happens, at which time the message can be copied directly from the sender to the receiver,  with no intermediate buffering. Likewise, if the receive is done first, the receiver is blocked until a send happens. This strategy is frequently known  as a rendezvous. It is easier to implement than a buffered message scheme but is less flexible since the sender and receiver are forced to run in lockstep.

Message passing is commonly used in parallel programming systems. One well-known message-passing system, for instance, is MPI  (Message-Passing Interface). It is extensively used for scientific computing. For more information about it, see for instance (Gropp et al.,  1994; and Snir et al., 1996).



Tags

interprocess communication, system calls, shared memory, semaphores, monitors