The second major component in any computer is the memory. Ideally, a memory should be extremely fast (faster than executing an instruction, so the CPU is not held up by the memory), abundantly large, and dirt cheap. No current technology satisfies all of these goals, so a different approach is taken. The memory system is constructed as a hierarchy of layers, as shown in the following figure. The top layers have higher speed, smaller capacity, and greater cost per bit than the lower ones, often by factors of a billion or more.


The top layer consists of the registers internal to the CPU. They are made of the same material as the CPU and are thus just as fast as the CPU. Consequently, there is no delay in accessing them. The storage capacity available in them is typically 32 × 32 bits on a 32-bit CPU and 64 × 64 bits on a 64-bit CPU, which is less than 1 KB in both cases. Programs must manage the registers (i.e., decide what to keep in them) themselves, in software.
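The "less than 1 KB" figure follows directly from the register counts and widths given above. A minimal sketch of the arithmetic, where the function name `register_file_bytes` is purely illustrative:

```python
def register_file_bytes(num_registers, bits_per_register):
    """Total capacity of a register file, in bytes."""
    return num_registers * bits_per_register // 8

# 32 registers of 32 bits each on a 32-bit CPU
print(register_file_bytes(32, 32))   # 128 bytes
# 64 registers of 64 bits each on a 64-bit CPU
print(register_file_bytes(64, 64))   # 512 bytes
```

Both totals are well under 1 KB (1024 bytes).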

Next comes the cache memory, which is mostly controlled by the hardware. Main memory is divided up into cache lines, typically 64 bytes, with addresses 0 to 63 in cache line 0, addresses 64 to 127 in cache line 1, and so on. The most heavily used cache lines are kept in a high-speed cache located inside or very close to the CPU. When the program needs to read a memory word, the cache hardware checks to see if the line needed is in the cache. If it is, which is called a cache hit, the request is satisfied from the cache and no memory request is sent over the bus to the main memory. Cache hits normally take about two clock cycles. Cache misses have to go to memory, with a substantial time penalty. Cache memory is limited in size due to its high cost. Some machines have two or even three levels of cache, each one slower and bigger than the one before it.
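The mapping from addresses to cache lines, and the hit-or-miss check, can be sketched in a few lines. This is only a simplified model under stated assumptions: the names `line_number` and `access` are illustrative, the cache is modelled as an unbounded set of resident line numbers, and no eviction takes place.

```python
LINE_SIZE = 64  # bytes per cache line, as in the text

def line_number(address):
    """Cache line that a byte address falls in (0-63 -> 0, 64-127 -> 1, ...)."""
    return address // LINE_SIZE

def access(address, resident_lines):
    """Return 'hit' if the line is already cached; otherwise load it and return 'miss'."""
    line = line_number(address)
    if line in resident_lines:
        return "hit"
    resident_lines.add(line)  # simplification: the cache never fills up
    return "miss"

resident = set()
print(access(100, resident))  # first touch of line 1: miss
print(access(70, resident))   # same line 1 again: hit
```

Note that addresses 100 and 70 differ but share a cache line, so the second access is a hit even though that exact byte was never read before.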

Caching plays a major role in many areas of computer science, not just caching lines of RAM. Whenever there is a large resource that can be divided into pieces, some of which are used much more heavily than others, caching is often invoked to improve performance. Operating systems use it all the time. For instance, most operating systems keep (pieces of) heavily used files in main memory to avoid having to fetch them from the disk repeatedly. Likewise, the results of converting long path names like

/home/ast/projects/minix3/src/kernel/clock.c

into the disk address where the file is located can be cached to avoid repeated lookups. Finally, when the address of a Web page (URL) is converted to a network address (IP address), the result can be cached for future use. Many other uses exist.
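The same idea applies in software: remember the result of an expensive conversion so that later requests for the same key skip the work. The sketch below is a hedged illustration, not how any particular operating system does it. The class `LookupCache` and the stand-in function `fake_path_to_disk_address` are hypothetical, and the "disk address" it returns is a made-up number.

```python
class LookupCache:
    """Remembers results of an expensive lookup, keyed by its argument."""

    def __init__(self, slow_lookup):
        self._slow_lookup = slow_lookup
        self._cache = {}
        self.hits = 0
        self.misses = 0

    def resolve(self, key):
        if key in self._cache:          # already converted once: reuse it
            self.hits += 1
            return self._cache[key]
        self.misses += 1
        value = self._slow_lookup(key)  # the expensive conversion
        self._cache[key] = value
        return value

def fake_path_to_disk_address(path):
    """Hypothetical stand-in for a real directory walk; returns a made-up block number."""
    return sum(path.encode()) % 10_000

cache = LookupCache(fake_path_to_disk_address)
path = "/home/ast/projects/minix3/src/kernel/clock.c"
first = cache.resolve(path)    # miss: performs the lookup
second = cache.resolve(path)   # hit: served from the cache
print(first == second, cache.hits, cache.misses)
```

A real cache of this kind would also need a size limit and a rule for invalidating stale entries, which this sketch leaves out.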

In any caching system, a number of questions come up quite soon, including:

1. When to put a new item into the cache.
2. Which cache line to put the new item in.
3. Which item to remove from the cache when a slot is needed.
4. Where to put a newly evicted item in the larger memory.

Not every question is relevant to every caching situation. For caching lines of main memory in the CPU cache, a new item will generally be entered on every cache miss. The cache line to use is generally computed by using some of the high-order bits of the memory address referenced. For instance, with 4096 cache lines of 64 bytes and 32-bit addresses, bits 6 through 17 might be used to specify the cache line, with bits 0 to 5 the byte within the cache line. In this case, the item to remove is the same one as the new data goes into, but in other systems it might not be. Finally, when a cache line is written back to main memory (if it has been modified since it was cached), the place in memory to write it to is uniquely determined by the address in question.
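The example above (4096 lines of 64 bytes, 32-bit addresses) can be turned into a small simulation. This is a sketch under assumptions rather than a description of any real CPU: it assumes the remaining upper bits (18 through 31) are stored with each line as a "tag" to tell apart addresses that share a line index, and the names `split_address` and `DirectMappedCache` are illustrative.

```python
OFFSET_BITS = 6    # 64-byte lines: bits 0..5 select the byte within the line
INDEX_BITS = 12    # 4096 lines: bits 6..17 select the cache line

def split_address(address):
    """Split a 32-bit address into (tag, line index, byte offset)."""
    offset = address & ((1 << OFFSET_BITS) - 1)
    index = (address >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = address >> (OFFSET_BITS + INDEX_BITS)   # assumed: bits 18..31
    return tag, index, offset

class DirectMappedCache:
    def __init__(self):
        self.tags = [None] * (1 << INDEX_BITS)    # which memory line each slot holds
        self.dirty = [False] * (1 << INDEX_BITS)  # modified since it was cached?
        self.write_backs = []                     # base addresses written back to memory

    def access(self, address, write=False):
        """Return True on a hit, False on a miss (the line is then loaded)."""
        tag, index, _ = split_address(address)
        hit = self.tags[index] == tag
        if not hit:
            # The slot to evict is the same one the new line goes into.
            if self.tags[index] is not None and self.dirty[index]:
                # The write-back address is fully determined by tag and index.
                old_base = (self.tags[index] << (OFFSET_BITS + INDEX_BITS)) | (index << OFFSET_BITS)
                self.write_backs.append(old_base)
            self.tags[index] = tag
            self.dirty[index] = False
        if write:
            self.dirty[index] = True
        return hit

cache = DirectMappedCache()
cache.access(0, write=True)   # miss: line 0 loaded, then modified
cache.access(8)               # hit: same 64-byte line
cache.access(1 << 18)         # miss: maps to slot 0 too, so the dirty line is written back
print(cache.write_backs)
```

The last access shows the conflict case: address 2^18 has the same bits 6 through 17 as address 0, so it displaces the line already in slot 0 even if the rest of the cache is empty.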

Caches are such a good idea that modern CPUs have two of them. The first level or L1 cache is always inside the CPU and generally feeds decoded instructions into the CPU's execution engine. Most chips have a second L1 cache for very heavily used data words. The L1 caches are typically 16 KB each. In addition, there is often a second cache, called the L2 cache, that holds several megabytes of recently used memory words. The difference between the L1 and L2 caches lies in the timing. Access to the L1 cache is done without any delay, whereas access to the L2 cache involves a delay of one or two clock cycles.

On multicore chips, the designers have to decide where to place the caches. In "Multithreaded and Multicore Chips" figure (a), there is a single L2 cache shared by all the cores. This approach is used in Intel multicore chips. In contrast, in "Multithreaded and Multicore Chips" figure (b), each core has its own L2 cache. This approach is used by AMD. Each approach has its pros and cons. For instance, the Intel shared L2 cache requires a more complex cache controller, but the AMD way makes keeping the L2 caches consistent more difficult.

Main memory comes next in the hierarchy shown in the figure above. This is the workhorse of the memory system. Main memory is generally called RAM (Random Access Memory). Old-timers sometimes call it core memory, because computers in the 1950s and 1960s used tiny magnetizable ferrite cores for main memory. Currently, memories are hundreds of megabytes to several gigabytes and growing rapidly. All CPU requests that cannot be satisfied out of the cache go to main memory.

In addition to the main memory, many computers have a small amount of nonvolatile random access memory. Unlike RAM, nonvolatile memory does not lose its contents when the power is switched off. ROM (Read Only Memory) is programmed at the factory and cannot be changed afterward. It is fast and inexpensive. On some computers, the bootstrap loader used to start the computer is contained in ROM. Also, some I/O cards come with ROM for handling low-level device control.

EEPROM (Electrically Erasable PROM) and flash memory are also nonvolatile, but in contrast to ROM can be erased and rewritten. However, writing them takes orders of magnitude more time than writing RAM, so they are used in the same way ROM is, only with the additional feature that it is now possible to correct bugs in the programs they hold by rewriting them in the field.

Flash memory is also commonly used as the storage medium in portable electronic devices. It serves as film in digital cameras and as the disk in portable music players, to name just two uses. Flash memory is intermediate in speed between RAM and disk. Also, unlike disk memory, if it is erased too many times, it wears out.

Yet another kind of memory is CMOS, which is volatile. Many computers use CMOS memory to hold the current time and date. The CMOS memory and the clock circuit that increments the time in it are powered by a small battery, so the time is correctly updated even when the computer is unplugged. The CMOS memory can also hold the configuration parameters, such as which disk to boot from. CMOS is used because it draws so little power that the original factory-installed battery often lasts for several years. However, when it begins to fail, the computer can appear to have Alzheimer's disease, forgetting things that it has known for years, like which hard disk to boot from.

