When a page fault takes place, the operating system has to choose a page to remove (remove from memory) to make room for the incoming page. If the page to be removed has been customized while in memory, it must be rewritten to the disk to bring the disk copy up to date. If, on the other hand, the page has not been modified (e.g., it includes program text), the disk copy is already up to date, so no rewrite is required. The page to be read in just overwrites the page being removed.

While it would be possible to pick a random page to remove at each page fault, system performance is much better if a page that is not heavily used is chosen. If a heavily used page is removed, it will probably have to be brought back in quickly, resulting in extra overhead. Much work has been done on the subject of page replacement algorithms, both theoretical and experimental. Below we will explain some of the most important algorithms.

It is worth noting that the problem of "page replacement" takes place in other areas of computer design as well. For instance, most computers have one or more memory caches consisting of recently used 32-byte or 64-byte memory blocks. When the cache is full, some block has to be chosen for removal. This problem is exactly the same as page replacement except on a shorter time scale (it has to be done in a few nanoseconds, not milliseconds as with page replacement). The reason for the shorter time scale is that cache block misses are satisfied from main memory, which has no seek time and no rotational latency.

Another example is in a Web server. The server can keep a certain number of heavily used Web pages in its memory cache. Nonetheless, when the memory cache is full and a new page is referenced, a decision has to be made which Web page to remove. The considerations are similar to pages of virtual memory, except for the fact that the Web pages are never customized in the cache, so there is always a fresh copy "on disk". In a virtual memory system, pages in main memory may be either clean or dirty.

In all the page replacement algorithms to be studied below, a certain issue takes place: when a page is to be evicted from memory, does it have to be one of the faulting process own pages, or can it be a page belonging to another process? In the former case, we are effectively limiting each process to a fixed number of pages; in the latter case we are not. Both are possibilities. We will come back to this point in "DESIGN ISSUES FOR PAGING SYSTEMS".

The Optimal Page Replacement Algorithm

The best possible page replacement algorithm is easy to explain but impossible to implement. It goes like this. At the moment that a page fault takes place, some set of pages is in memory. One of these pages will be referenced on the very next instruction (the page containing that instruction). Other pages may not be referenced until 10, 100, or perhaps 1000 instructions later. Each page can be labeled with the number of instructions that will be executed before that page is first referenced.

The optimal page replacement algorithm says that the page with the highest label should be removed. If one page will not be used for 8 million instructions and another page will not be used for 6 million instructions, removing the former pushes the page fault that will fetch it back as far into the future as possible. Computers, like people, try to put off unpleasant events for as long as they can.

The only problem with this algorithm is that it is unrealizable. At the time of the page fault, the operating system has no way of knowing when each of the pages will be referenced next. (We saw a similar situation earlier with the shortest job first scheduling algorithm - how can the system tell which job is shortest?) Still, by running a program on a simulator and keeping track of all page references, it is possible to implement optimal page replacement on the second run by using the page reference information collected during the first run.

Thus it is possible to compare the performance of realizable algorithms with the best possible one. If an operating system achieves a performance of, say, only 1% worse than the optimal algorithm, effort spent in looking for a better algorithm will yield at most a 1% improvement.

To avoid any possible confusion, it should be made clear that this log of page references refers only to the one program just measured and then with only one particular input. The page replacement algorithm derived from it is thus particular to that one program and input data.  Although this method is useful for evaluating page replacement algorithms, it is of no use in practical systems. In next section we will study algorithms that are useful on real systems.


operating system, algorithm, virtual memory