When a computer is multiprogrammed, it often has multiple processes or threads competing for the CPU simultaneously. This situation happens whenever two or more of them are in the ready state at the same time. If only one CPU is available, a choice has to be made which process to run next. The part of the operating system that makes the choice is called the scheduler and the algorithm it uses is called the scheduling algorithm. These topics form the subject matter of the following sections.

Many of the same problems that apply to process scheduling also apply to thread scheduling, though some are different. When the kernel manages threads, scheduling is generally done per thread, with little or no regard to which process the thread belongs. At first we will focus on scheduling issues that apply to both processes and threads. Later on we will explicitly look at thread scheduling and some of the unique issues it raises. We will deal with multicore chips in "MULTIPLE PROCESSOR SYSTEMS".

Introduction to Scheduling

Back in the old days of batch systems with input in the form of card images on a magnetic tape, the scheduling algorithm was simple: just run the next job on the tape. With multiprogramming systems, the scheduling algorithm became more complicated because there were usually multiple users waiting for service. Some mainframes still combine batch and timesharing service, requiring the scheduler to decide whether a batch job or an interactive user at a terminal should go next. (As an aside, a batch job may be a request to run multiple programs in succession, but for this section, we will just suppose it is a request to run a single program.) Because CPU time is a scarce resource on these machines, a good scheduler can make a big difference in perceived performance and user satisfaction. Therefore, a great deal of work has gone into devising clever and efficient scheduling algorithms.

With the arrival of personal computers, the situation changed in two ways. First, most of the time there is only one active process. A user entering a document on a word processor is unlikely to be simultaneously compiling a program in the background. When the user types a command to the word processor, the scheduler does not have to do much work to figure out which process to run-the word processor is the only candidate.

Second, computers have gotten so much faster over the years that the CPU is rarely a scarce resource any more. Most programs for personal computers are limited by the rate at which the user can present input (by typing or clicking), not by the rate the CPU can process it. Even compilations, a major sink of CPU cycles in the past, take just a few seconds in most cases nowadays. Even when two programs are actually running at once, such as a word processor and a spreadsheet, it hardly matters which goes first since the user is most likely waiting for both of them to finish. Consequently, scheduling does not matter much on simple PCs. Certainly, there are applications that practically eat the CPU alive, for example rendering one hour of high-resolution video while tweaking the colors in each of the 108,000 frames (in NTSC) or 90,000 frames (in PAL) requires industrial-strength computing power. However, similar applications are the exception rather than the rule.

When we turn to networked servers, the situation changes considerably. Here multiple processes frequently do compete for the CPU, so scheduling matters again. For instance, when the CPU has to choose between running a process that gathers the daily statistics and one that serves user requests, the users will be a lot happier if the latter gets first crack at the CPU.

In addition to picking the right process to run, the scheduler also has to worry about making efficient use of the CPU because process switching is expensive. To start with, a switch from user mode to kernel mode must occur. Then the state of the current process must be saved, including storing its registers in the process table so they can be reloaded later. In many systems, the memory map (e.g., memory reference bits in the page table) must be saved as well. Next a new process must be selected by running the scheduling algorithm. After that, the MMU must be reloaded with the memory map of the new process. At last, the new process must be started. In addition to all that, the process switch generally invalidates the whole memory cache, forcing it to be dynamically reloaded from the main memory twice (upon entering the kernel and upon leaving it). All in all, doing too many process switches per second can chew up a large amount of CPU time, so caution is advised.

Process Behavior

Almost all processes alternate bursts of computing with (disk) I/O requests, as shown in Figure 1. Normally the CPU runs for a while without stopping, then a system call is made to read from a file or write to a file. When the system call completes, the CPU computes again until it requires more data or has to write more data, and so on. Note that some I/O activities count as computing. For instance, when the CPU copies bits to a video RAM to update the screen, it is computing, not doing I/O, because the CPU is in use. I/O in this sense is when a process enters the blocked state waiting for an external device to complete its work.

Bursts of CPU usage alternate with periods of waiting for

The important thing to notice about Figure 1 is that some processes, such as the one in Figure 1(a), spend most of their time computing, while others, such as the one in Figure 1(b), spend most of their time waiting for I/O. The former are called compute-bound; the latter are called I/O-bound. Compute-bound processes usually have long CPU bursts and thus infrequent I/O waits, whereas I/O-bound processes have short CPU bursts and thus frequent I/O waits. Note that the key factor is the length of the CPU burst, not the length of the I/O burst. I/O-bound processes are I/O bound because they do not compute much between I/O requests, not because they have especially long I/O requests. It takes the same time to issue the hardware request to read a disk block no matter how much or how little time it takes to process the data after they arrive.

It is worth noting that as CPUs get faster, processes tend to get more I/O-bound. This effect occurs because CPUs are improving much faster than disks. Consequently, the scheduling of I/O-bound processes is likely to become a more important subject in the future. The main idea here is that if an I/O-bound process wants to run, it should get a chance quickly so that it can issue its disk request and keep the disk busy. As we saw in "Modeling Multiprogramming" Figure, when processes are I/O bound, it takes quite a few of them to keep the CPU fully occupied.


threads, scheduler, main memory, kernel, memory map, system call