When the semaphore's ability to count is not needed, a simplified version of the semaphore, called a mutex, is often used. Mutexes are good only for managing mutual exclusion to some shared resource or piece of code. They are easy and efficient to implement, which makes them especially useful in thread packages that are implemented entirely in user space.

A mutex is a variable that can be in one of two states: unlocked or locked. As a result, only 1 bit is needed to represent it, but in practice an integer is frequently used, with 0 meaning unlocked and all other values meaning locked. Two procedures are used with mutexes. When a thread (or process) needs access to a critical region, it calls mutex_lock. If the mutex is currently unlocked (meaning that the critical region is available), the call succeeds and the calling thread is free to enter the critical region.

On the other hand, if the mutex is already locked, the calling thread is blocked until the thread in the critical region is finished and calls mutex_unlock. If numerous threads are blocked on the mutex, one of them is chosen at random and permitted to acquire the lock. Because mutexes are so simple, they can easily be implemented in user space, provided that a TSL or XCHG instruction is available. The code for mutex_lock and mutex_unlock for use with a user-level threads package is shown in Figure 1. The solution with XCHG is basically the same.

Figure 1: Implementation of mutex
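The figure is originally given in assembly. The same idea can be sketched in C, where C11 atomic_exchange stands in for the TSL/XCHG instruction and POSIX sched_yield stands in for the package's thread_yield (both substitutions are assumptions of this sketch, not the original code):

```c
#include <stdatomic.h>
#include <sched.h>

/* 0 = unlocked, nonzero = locked */
static atomic_int mutex = 0;

/* mutex_lock: atomically exchange 1 into the mutex (the XCHG idea).
   If the old value was 0, we now hold the lock. Otherwise yield the
   CPU so the lock holder can run and release it, then try again --
   no busy waiting. */
void mutex_lock(void)
{
    while (atomic_exchange(&mutex, 1) != 0)
        sched_yield();
}

/* Releasing the lock is a plain atomic store of 0. */
void mutex_unlock(void)
{
    atomic_store(&mutex, 0);
}
```

Note that the loop body yields rather than spinning, for exactly the reason discussed next: with user-level threads, a spinning thread would never let the lock holder run.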

The code of mutex_lock is analogous to the code of enter_region in Figure 2 of "Peterson's Solution," but with a crucial difference. When enter_region fails to enter the critical region, it keeps testing the lock repeatedly (busy waiting). Eventually the clock runs out and some other process is scheduled to run. Sooner or later the process holding the lock gets to run and releases it. With (user) threads, the situation is different, because there is no clock that stops threads that have run too long. Consequently, a thread that tries to acquire a lock by busy waiting will loop forever and never obtain the lock, because it never allows any other thread to run and release the lock.

That is where the difference between enter_region and mutex_lock comes in. When the latter fails to obtain a lock, it calls thread_yield to give up the CPU to another thread. As a result, there is no busy waiting. When the thread runs the next time, it tests the lock again. Since thread_yield is just a call to the thread scheduler in user space, it is very fast. As a result, neither mutex_lock nor mutex_unlock requires any kernel calls. Using them, user-level threads can synchronize entirely in user space using procedures that require only a handful of instructions.

The mutex system described above is a bare-bones set of calls. As with all software, there is always a demand for more features, and synchronization primitives are no exception. For instance, sometimes a thread package offers a call mutex_trylock that either acquires the lock or returns a failure code, but does not block. This call gives the thread the flexibility to decide what to do next if there are alternatives to just waiting.
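In the same atomic-exchange style as the lock code above, mutex_trylock reduces to a single exchange with no retry loop. This is an illustrative sketch, not a standard API; the return convention (0 for success, -1 for failure) is an assumption:

```c
#include <stdatomic.h>

/* 0 = unlocked, nonzero = locked */
static atomic_int mutex = 0;

/* One atomic exchange: if the old value was 0 we got the lock and
   return 0; otherwise the lock was already held and we return -1
   immediately, without blocking or yielding. */
int mutex_trylock(void)
{
    return (atomic_exchange(&mutex, 1) == 0) ? 0 : -1;
}

void mutex_unlock(void)
{
    atomic_store(&mutex, 0);
}
```

A thread that gets -1 back can go do other useful work and try again later, instead of waiting.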

Up until now, we have glossed over a subtle issue that is worth making explicit. With a user-space threads package, there is no problem with multiple threads having access to the same mutex, since all the threads operate in a common address space. However, with most of the earlier solutions, such as Peterson's algorithm and semaphores, there is an unspoken assumption that multiple processes have access to at least some shared memory, perhaps only one word, but something. If processes have disjoint address spaces, as we have consistently said, how can they share the turn variable in Peterson's algorithm, or semaphores, or a common buffer?

There are two answers. First, some of the shared data structures, such as the semaphores, can be stored in the kernel and accessed only via system calls. This approach eliminates the problem. Second, most modern operating systems (including UNIX and Windows) offer a way for processes to share some portion of their address space with other processes. In this way, buffers and other data structures can be shared. In the worst case, if nothing else is possible, a shared file can be used.
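On UNIX-like systems, one concrete form of the second answer is to map a shared memory region before forking, so that parent and child see the same page even though their address spaces are otherwise disjoint. A minimal sketch, assuming POSIX mmap and fork (named shared memory via shm_open would work between unrelated processes; this anonymous variant requires a common ancestor):

```c
#define _DEFAULT_SOURCE
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Share one int between two processes. The MAP_SHARED mapping is the
   "shared portion of the address space" from the text: the child's
   write to *word is visible to the parent. Returns the value the
   parent reads after the child writes 42 (an arbitrary test value). */
int shared_word_demo(void)
{
    int *word = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (word == MAP_FAILED)
        return -1;
    *word = 0;

    pid_t pid = fork();
    if (pid == 0) {              /* child: write into the shared page */
        *word = 42;
        _exit(0);
    }
    waitpid(pid, NULL, 0);       /* parent: wait, then read the page */
    int result = *word;
    munmap(word, sizeof(int));
    return result;
}
```

With MAP_PRIVATE instead of MAP_SHARED, the child's write would be invisible to the parent, which is exactly the disjoint-address-space situation the text describes.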

If two or more processes share most or all of their address spaces, the distinction between processes and threads becomes somewhat blurred but is nevertheless present. Two processes that share a common address space still have different open files, alarm timers, and other per-process properties, whereas the threads within a single process share them. And it is always true that multiple processes sharing a common address space never have the efficiency of user-level threads since the kernel is deeply involved in their management.

Mutexes in Pthreads

Pthreads provides a number of functions that can be used to synchronize threads. The basic mechanism uses a mutex variable, which can be locked or unlocked, to guard each critical region. A thread wishing to enter a critical region first tries to lock the associated mutex. If the mutex is unlocked, the thread can enter immediately and the lock is atomically set, preventing other threads from entering. If the mutex is already locked, the calling thread is blocked until it is unlocked. If multiple threads are waiting on the same mutex, when it is unlocked, only one of them is allowed to continue and relock it. These locks are not mandatory. It is up to the programmer to make sure threads use them correctly.

The major calls relating to mutexes are shown in Figure 2. As expected, mutexes can be created and destroyed. The calls for these operations are pthread_mutex_init and pthread_mutex_destroy, respectively. A mutex can also be locked, by pthread_mutex_lock, which tries to acquire the lock and blocks if it is already locked. There is also an option for trying to lock a mutex and failing with an error code instead of blocking if it is already locked. This call is pthread_mutex_trylock. It allows a thread to effectively do busy waiting if that is ever needed. Finally, pthread_mutex_unlock unlocks a mutex and releases exactly one thread if one or more are waiting on it. Mutexes can also have attributes, but these are used only for specialized purposes.

Figure 2: Some of the Pthreads calls relating to mutexes
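These calls can be seen in action in a small sketch: two threads increment a shared counter, and the mutex makes each read-modify-write atomic. The calls are the standard Pthreads API; the worker/demo structure around them is illustrative:

```c
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long counter = 0;

/* Each worker does 100,000 locked increments. Without the mutex,
   concurrent counter++ operations could interleave and lose updates. */
static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);    /* blocks if the other thread holds it */
        counter++;                    /* critical region */
        pthread_mutex_unlock(&lock);  /* releases one waiter, if any */
    }
    return NULL;
}

long run_counter_demo(void)
{
    pthread_t t1, t2;
    counter = 0;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return counter;                   /* 200000 when properly locked */
}
```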

In addition to mutexes, Pthreads offers a second synchronization mechanism: condition variables. Mutexes are good for allowing or blocking access to a critical region. Condition variables allow threads to block due to some condition not being met. Almost always the two methods are used together. Let us now look at the interaction of threads, mutexes, and condition variables in a bit more detail.

As a simple example, consider the producer-consumer scenario again: one thread puts things in a buffer and another one takes them out. If the producer discovers that there are no more free slots available in the buffer, it has to block until one becomes available. Mutexes make it possible to do the check atomically without interference from other threads, but having discovered that the buffer is full, the producer needs a way to block and be awakened later. This is what condition variables allow.

Some of the calls related to condition variables are shown in Figure 3. As you would probably expect, there are calls to create and destroy condition variables. They can have attributes, and there are various calls for managing them (not shown). The primary operations on condition variables are pthread_cond_wait and pthread_cond_signal. The former blocks the calling thread until some other thread signals it (using the latter call). The reasons for blocking and waiting are not part of the waiting and signaling protocol, of course. The blocking thread often is waiting for the signaling thread to do some work, release some resource, or perform some other activity. Only then can the blocking thread continue. Condition variables allow this waiting and blocking to be done atomically. The pthread_cond_broadcast call is used when there are multiple threads potentially all blocked and waiting for the same signal.

Figure 3: Some of the Pthreads calls relating to condition variables

Condition variables and mutexes are always used together. The pattern is for one thread to lock a mutex, then wait on a condition variable when it cannot get what it needs. Eventually another thread will signal it and it can continue. The pthread_cond_wait call atomically blocks the calling thread and unlocks the mutex it is holding. For this reason, the mutex is one of its parameters.

It is also worth noting that condition variables (unlike semaphores) have no memory. If a signal is sent to a condition variable on which no thread is waiting, the signal is lost. Programmers have to be careful not to lose signals.

As an example of how mutexes and condition variables are used, "Monitors" Figure 1 shows a very simple producer-consumer problem with a single buffer. When the producer has filled the buffer, it must wait until the consumer empties it before producing the next item. Similarly, when the consumer has removed an item, it must wait until the producer has produced another one. While very simple, this example illustrates the basic mechanisms. The statement that puts a thread to sleep should always check the condition to make sure it is satisfied before continuing, as the thread might have been awakened due to a UNIX signal or some other reason.
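Since the figure from "Monitors" is not reproduced here, a minimal sketch in the same spirit follows: a single-slot buffer, one producer, one consumer, and two condition variables. The names (the_mutex, condc, condp) and the demo wrapper are illustrative; note that each wait sits inside a while loop that re-checks the condition, as the text requires:

```c
#include <pthread.h>

#define MAX 20    /* how many items to produce; small for the sketch */

static pthread_mutex_t the_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t condc = PTHREAD_COND_INITIALIZER;  /* consumer waits here */
static pthread_cond_t condp = PTHREAD_COND_INITIALIZER;  /* producer waits here */
static int buffer = 0;          /* single slot; 0 means empty */
static long consumed_sum = 0;

static void *producer(void *arg)
{
    (void)arg;
    for (int i = 1; i <= MAX; i++) {
        pthread_mutex_lock(&the_mutex);
        while (buffer != 0)                  /* re-check: wakeups can be spurious */
            pthread_cond_wait(&condp, &the_mutex);
        buffer = i;                          /* put item in buffer */
        pthread_cond_signal(&condc);         /* wake the consumer */
        pthread_mutex_unlock(&the_mutex);
    }
    return NULL;
}

static void *consumer(void *arg)
{
    (void)arg;
    for (int i = 1; i <= MAX; i++) {
        pthread_mutex_lock(&the_mutex);
        while (buffer == 0)                  /* wait until there is an item */
            pthread_cond_wait(&condc, &the_mutex);
        consumed_sum += buffer;              /* take item out of buffer */
        buffer = 0;
        pthread_cond_signal(&condp);         /* wake the producer */
        pthread_mutex_unlock(&the_mutex);
    }
    return NULL;
}

long run_producer_consumer(void)
{
    pthread_t pro, con;
    buffer = 0;
    consumed_sum = 0;
    pthread_create(&con, NULL, consumer, NULL);
    pthread_create(&pro, NULL, producer, NULL);
    pthread_join(pro, NULL);
    pthread_join(con, NULL);
    return consumed_sum;                     /* sum of 1..MAX */
}
```

Because pthread_cond_wait releases the_mutex while the thread sleeps and reacquires it before returning, the other thread can enter the critical region to fill or empty the buffer, which is precisely the cooperation the text describes.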
