The UNIX V7 File System

The early versions of UNIX had a fairly sophisticated multiuser file system since it was derived from MULTICS. Below we will talk about the V7 file system, the one for the PDP-11 that made UNIX famous. The file system is in the form of a tree starting at the root directory, with the addition of links, forming a directed acyclic graph. File names are up to 14 characters and can contain any ASCII characters except / (because that is the separator between components in a path) and NUL (because that is used to pad out names shorter than 14 characters). NUL has the numerical value of 0.

A UNIX directory contains one entry for each file in that directory. Each entry is very simple because UNIX uses the i-node scheme shown in "FILE SYSTEM IMPLEMENTATION" Figure 5. A directory entry contains only two fields: the file name (14 bytes) and the number of the i-node for that file (2 bytes), as illustrated in Figure 1. These parameters limit the number of files per file system to 64K, because a 2-byte i-node number can take only 65,536 distinct values.
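The 16-byte entry described above can be sketched in C as follows. This is a minimal illustration, not the historical source: the names v7_dirent, v7_dirent_set, and v7_dirent_matches are hypothetical, the field order (i-node number first) is an assumption, and treating i-node number 0 as "unused slot" is also an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define V7_NAME_LEN 14   /* maximum file name length, from the text */

/* One directory entry: a 2-byte i-node number plus a 14-byte name. */
struct v7_dirent {
    uint16_t ino;                /* i-node number (0 assumed to mean "unused") */
    char     name[V7_NAME_LEN];  /* NUL-padded; no terminator if 14 chars long */
};

/* Fill in an entry, padding short names with NUL bytes.
 * Returns 0 on success, -1 if the name is empty, too long, or contains '/'. */
int v7_dirent_set(struct v7_dirent *de, uint16_t ino, const char *name)
{
    assert(de != NULL && name != NULL);
    size_t len = strlen(name);
    if (len == 0 || len > V7_NAME_LEN || strchr(name, '/') != NULL)
        return -1;
    memset(de->name, 0, V7_NAME_LEN);   /* NUL padding */
    memcpy(de->name, name, len);
    de->ino = ino;
    return 0;
}

/* Return 1 if the stored name equals the NUL-terminated string `name`. */
int v7_dirent_matches(const struct v7_dirent *de, const char *name)
{
    assert(de != NULL && name != NULL);
    if (strlen(name) > V7_NAME_LEN)
        return 0;
    /* strncmp stops at the first NUL or after 14 bytes, so it handles both
     * padded names and names that fill the whole field. */
    return strncmp(de->name, name, V7_NAME_LEN) == 0;
}
```

On common platforms sizeof(struct v7_dirent) is 16, so a directory is simply an array of these entries. Note that a name of exactly 14 characters has no terminating NUL, which is why the comparison is bounded by the field width.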

Like the i-node of "FILE SYSTEM IMPLEMENTATION" Figure 5, the UNIX i-node contains some attributes. The attributes include the file size, three times (creation, last access, and last modification), owner, group, protection information, and a count of the number of directory entries that point to the i-node. This last field is needed because of links. Whenever a new link is made to an i-node, the count in the i-node is incremented. When a link is removed, the count is decremented. When it reaches 0, the i-node is reclaimed and its disk blocks are put back in the free list.
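The link-count rule can be sketched as follows. This is an illustrative in-memory model only: the struct name inode_attrs, the field names and widths, and the in_use flag standing in for "reclaimed" are all assumptions, and returning blocks to the free list is reduced to a comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Attributes named in the text; field names and widths are assumptions. */
struct inode_attrs {
    uint32_t size;                 /* file size in bytes */
    uint32_t ctime, atime, mtime;  /* creation, last access, last modification */
    uint16_t uid, gid;             /* owner and group */
    uint16_t mode;                 /* protection information */
    uint16_t nlink;                /* directory entries pointing at this i-node */
    bool     in_use;               /* hypothetical flag: false once reclaimed */
};

/* A new directory entry now points at this i-node. */
void inode_link(struct inode_attrs *ip)
{
    assert(ip->in_use);
    ip->nlink++;
}

/* A directory entry was removed. Returns true if the i-node was reclaimed. */
bool inode_unlink(struct inode_attrs *ip)
{
    assert(ip->in_use && ip->nlink > 0);
    ip->nlink--;
    if (ip->nlink == 0) {
        /* A real file system would put the file's disk blocks back
         * on the free list here. */
        ip->in_use = false;
        return true;
    }
    return false;
}
```

With two links to the same i-node, the first inode_unlink call only decrements the count; the file's storage is released on the second.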

A UNIX V7 directory entry

Keeping track of disk blocks is done using a generalization of "FILE SYSTEM IMPLEMENTATION" Figure 5 in order to handle very large files. The first 10 disk addresses are stored in the i-node itself, so for small files, all the necessary information is right in the i-node, which is fetched from disk to main memory when the file is opened. For somewhat larger files, one of the addresses in the i-node is the address of a disk block called a single indirect block. This block contains additional disk addresses. If this still is not enough, another address in the i-node points to a double indirect block, which contains the addresses of a list of single indirect blocks. Each of these single indirect blocks points to a few hundred data blocks. If even this is not enough, a triple indirect block can also be used. The complete picture is given in Figure 2.
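To make the addressing scheme concrete, the sketch below maps a logical block number within a file to the level of indirection needed and the index to follow at each level. The 10 direct addresses come from the text; the figure of 256 addresses per indirect block (NINDIR) is purely an assumption chosen to match "a few hundred", and the names map_block, blk_path, and max_blocks are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define NDIRECT 10u    /* direct addresses in the i-node, from the text */
#define NINDIR  256u   /* addresses per indirect block: an assumption */

enum level { DIRECT, SINGLE, DOUBLE, TRIPLE, TOO_BIG };

struct blk_path {
    enum level level;
    uint32_t   idx[3];   /* index at each step, outermost block first */
};

/* Work out how logical block `lbn` of a file is reached. */
struct blk_path map_block(uint64_t lbn)
{
    struct blk_path p = { TOO_BIG, {0, 0, 0} };
    const uint64_t n = NINDIR;

    if (lbn < NDIRECT) {                 /* address is in the i-node itself */
        p.level = DIRECT;
        p.idx[0] = (uint32_t)lbn;
        return p;
    }
    lbn -= NDIRECT;
    if (lbn < n) {                       /* one extra disk read */
        p.level = SINGLE;
        p.idx[0] = (uint32_t)lbn;
        return p;
    }
    lbn -= n;
    if (lbn < n * n) {                   /* two extra disk reads */
        p.level = DOUBLE;
        p.idx[0] = (uint32_t)(lbn / n);
        p.idx[1] = (uint32_t)(lbn % n);
        return p;
    }
    lbn -= n * n;
    if (lbn < n * n * n) {               /* three extra disk reads */
        p.level = TRIPLE;
        p.idx[0] = (uint32_t)(lbn / (n * n));
        p.idx[1] = (uint32_t)((lbn / n) % n);
        p.idx[2] = (uint32_t)(lbn % n);
        return p;
    }
    return p;                            /* beyond the triple indirect range */
}

/* Largest file, in blocks, that the scheme can address. */
uint64_t max_blocks(void)
{
    const uint64_t n = NINDIR;
    return NDIRECT + n + n * n + n * n * n;
}
```

Under the assumed 256 addresses per block, the limit is 10 + 256 + 65,536 + 16,777,216 = 16,843,018 blocks. The design favors small files: the first 10 blocks cost no extra disk reads, and each further level adds one.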

A UNIX i-node

When a file is opened, the file system must take the file name supplied and locate its disk blocks. Let us consider how the path name /usr/ast/mbox is looked up. We will use UNIX as an example, but the algorithm is basically the same for all hierarchical directory systems. First the file system locates the root directory. In UNIX its i-node is located at a fixed place on the disk. From this i-node, it locates the root directory, which can be anywhere on the disk, but say block 1.

Then it reads the root directory and looks up the first component of the path, usr, in the root directory to find the i-node number of the file /usr. Locating an i-node from its number is straightforward, since each one has a fixed location on the disk. From this i-node, the system locates the directory for /usr and looks up the next component, ast, in it. When it has found the entry for ast, it has the i-node for the directory /usr/ast. From this i-node it can find the directory itself and look up mbox. The i-node for this file is then read into memory and kept there until the file is closed. The lookup process is illustrated in Figure 3.
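The component-by-component walk described above can be sketched with a toy in-memory i-node table. Everything here is illustrative: the i-node numbers (root = 1, usr = 2, ast = 3, mbox = 4) are made up, directories hold their entries inline rather than in disk blocks, and the names itable, add_entry, dir_search, and lookup_abs are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_LEN    14   /* name field width, from the text */
#define MAX_ENTRIES 8    /* toy limit (assumption) */
#define NINODES     8    /* toy limit (assumption) */
#define ROOT_INO    1    /* hypothetical fixed root i-node number */

struct dirent14 { unsigned short ino; char name[NAME_LEN]; };

/* Toy i-node: a directory keeps its entries inline, not in disk blocks. */
struct toy_inode {
    int is_dir;
    int nentries;
    struct dirent14 entries[MAX_ENTRIES];
};

static struct toy_inode itable[NINODES];  /* indexed by i-node number; 0 = none */

/* Add a (name, i-node number) pair to directory `dir`. */
void add_entry(unsigned short dir, const char *name, unsigned short ino)
{
    struct toy_inode *ip = &itable[dir];
    size_t len = strlen(name);
    assert(ip->is_dir && ip->nentries < MAX_ENTRIES && len > 0 && len <= NAME_LEN);
    struct dirent14 *de = &ip->entries[ip->nentries++];
    memset(de->name, 0, NAME_LEN);        /* NUL padding */
    memcpy(de->name, name, len);
    de->ino = ino;
}

/* Build the tree /usr/ast/mbox with made-up i-node numbers. */
void build_example_tree(void)
{
    memset(itable, 0, sizeof itable);
    itable[1].is_dir = itable[2].is_dir = itable[3].is_dir = 1;
    add_entry(1, "usr", 2);
    add_entry(2, "ast", 3);
    add_entry(3, "mbox", 4);
}

/* Search one directory for a component of length `len`; 0 if not found. */
unsigned short dir_search(unsigned short dir, const char *name, size_t len)
{
    if (dir == 0 || dir >= NINODES || len == 0 || len > NAME_LEN)
        return 0;
    const struct toy_inode *ip = &itable[dir];
    if (!ip->is_dir)
        return 0;
    for (int i = 0; i < ip->nentries; i++) {
        const struct dirent14 *de = &ip->entries[i];
        /* Match if the first len bytes agree and the stored name ends there. */
        if (memcmp(de->name, name, len) == 0 &&
            (len == NAME_LEN || de->name[len] == '\0'))
            return de->ino;
    }
    return 0;
}

/* Resolve an absolute path to an i-node number; 0 on failure. */
unsigned short lookup_abs(const char *path)
{
    if (path[0] != '/')
        return 0;
    unsigned short cur = ROOT_INO;        /* start at the root's i-node */
    while (*path != '\0') {
        while (*path == '/')              /* skip separators */
            path++;
        if (*path == '\0')
            break;
        size_t len = strcspn(path, "/");  /* length of the next component */
        cur = dir_search(cur, path, len);
        if (cur == 0)
            return 0;
        path += len;
    }
    return cur;
}
```

After build_example_tree(), lookup_abs("/usr/ast/mbox") visits i-nodes 1, 2, and 3 in turn and returns 4. Each step needs only the current directory's i-node number and the next component, which is why the same loop works for a path of any depth.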

The steps in looking up

Relative path names are looked up the same way as absolute ones, only starting from the working directory instead of starting from the root directory. Every directory has entries for . and .. which are put there when the directory is created. The entry . has the i-node number for the current directory, and the entry for .. has the i-node number for the parent directory. Thus, a procedure looking up ../dick/prog.c simply looks up .. in the working directory, finds the i-node number for the parent directory, and searches that directory for dick. No special mechanism is needed to handle these names. As far as the directory system is concerned, they are just ordinary ASCII strings, just the same as any other names. The only bit of trickery here is that .. in the root directory points to itself.
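A short sketch shows why . and .. need no special handling: they are written as ordinary entries when a directory is created and found by the same name search as any other entry. The names make_dot_entries and find_name, the root i-node number 1, and the child i-node number used in the usage note are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define NAME_LEN 14   /* name field width, from the text */
#define ROOT_INO 1    /* hypothetical root i-node number */

struct dirent14 { unsigned short ino; char name[NAME_LEN]; };

/* Write the two entries every new directory starts with.
 * For the root directory, pass parent == self. */
void make_dot_entries(struct dirent14 slots[2],
                      unsigned short self, unsigned short parent)
{
    memset(slots, 0, 2 * sizeof slots[0]);
    slots[0].ino = self;
    memcpy(slots[0].name, ".", 1);
    slots[1].ino = parent;
    memcpy(slots[1].name, "..", 2);
}

/* Ordinary name search: "." and ".." are compared like any other string.
 * Returns the i-node number, or 0 if the name is not present. */
unsigned short find_name(const struct dirent14 *entries, int n, const char *name)
{
    assert(entries != NULL && name != NULL);
    if (strlen(name) > NAME_LEN)
        return 0;
    for (int i = 0; i < n; i++)
        if (strncmp(entries[i].name, name, NAME_LEN) == 0)
            return entries[i].ino;
    return 0;
}
```

For a directory with i-node 5 created under the root, find_name(entries, 2, "..") returns the root's i-node number; for the root itself it returns the root's own number, so a walk that follows .. repeatedly simply stays at the root.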

Tags

v7 file system, single indirect block, double indirect block, triple indirect block