.tr |
.na
.ls 3
.fo ''- % -''
.ul
File System Data Structure
.br
	Files may be stored on any of several kinds of
secondary storage device.
The only requirement such a device must satisfy before
being supported
by the UNIX file system is that it be directly addressable
in blocks of 256 words (512 bytes).
That is, the storage space on the device must be
divisible into a number of 256-word blocks,
any of which may be read and rewritten as often
as required.
All of the disk storage systems supplied by Digital Equipment Corporation
for use with the PDP-11 have this characteristic,
for example the RF11/RS11 and RF11/RC11 fixed-head disk systems and the RK11/RK03
and RP11/RP02
moving-head disk systems;
so also does the TC11 DECtape system.
An industry-standard magnetic tape system could not be used
for file system storage, however,
because it is impossible for the usual tape controller to rewrite
individual records.
	Several devices used for file system storage may be attached to UNIX
at one time; they need not all be of the same device type.
Moreover, it is possible to interchange
the disk packs on certain disk drives and
the tape reels on DECtape drives.
Each fixed-head disk, each disk pack, and each tape reel
is referred to as a
file
system
.ul
volume.
All file system volumes have a common format, which is illustrated
in FIG. X.
	The blocks on each volume are divided into
three portions:
the
.ul
super-block,
the
.ul
i-list,
and
.ul
allocatable space.
	The super-block consists of two physical 512-byte blocks,
or 1024 bytes in all.
The contents of the super-block summarize the
allocation of space on the file system volume.
A copy of the super-block for each mounted
device resides in core memory at all times.
It is this core copy that is consulted
by the file system routines.
Periodically, and in any event
when the volume
is to be dismounted, the disk copy of the
super-block is updated from the core
copy.
	The super-block is divided into two portions.
The first records the allocation of blocks
used for file storage.
As seen from FIG. X, there is one word,
the
.ul
free map size,
located at
the beginning of this region
which indicates the total number
of blocks which are potentially allocatable
to files.
The contents of this word are never changed by
the file system; it is established by
a stand-alone disk initializing
program.
The free map size word is followed immediately
by the
.ul
free map
itself.
For each block
in the volume, the free map
contains one bit indicating whether
that block is currently part of a file
or free to be allocated to a new or growing
file.
Since there are 8 bits in each byte,
the number of bytes in the free map
is  n_/8 , where n_
is the free map size.
In the free map, a "1" bit indicates that the corresponding block
is allocated.
Since the blocks in the super-block and the i-list (see below)
are never allocated to files,
the free map bits corresponding to these
blocks are permanently set to "1" to
prevent
allocation of these blocks.
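	The free map operations just described can be sketched in C.
This is an illustration only, not the system's own code:
the function names are hypothetical,
the order of bits within a byte (low-order bit first) is an assumption
since the text does not specify it,
and the byte count is rounded up for sizes that are not a multiple of 8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit order: block b is bit (b % 8) of byte (b / 8). */
int fm_is_allocated(const uint8_t *map, unsigned b)
{
    return (map[b / 8] >> (b % 8)) & 1;
}

/* Mark block b allocated ("1") or free ("0"). */
void fm_set(uint8_t *map, unsigned b, int allocated)
{
    uint8_t mask = (uint8_t)(1u << (b % 8));

    if (allocated)
        map[b / 8] |= mask;
    else
        map[b / 8] &= (uint8_t)~mask;
}

/* Bytes occupied by a map of n bits: n/8, rounded up (an assumption). */
size_t fm_bytes(unsigned n)
{
    return (n + 7u) / 8u;
}

/* Permanently reserve the first `reserved` blocks (super-block, i-list). */
void fm_reserve(uint8_t *map, unsigned n, unsigned reserved)
{
    assert(reserved <= n);
    for (unsigned b = 0; b < reserved; b++)
        fm_set(map, b, 1);
}

/* Find a free block, mark it allocated, and return its number; -1 if none. */
long fm_alloc(uint8_t *map, unsigned n)
{
    for (unsigned b = 0; b < n; b++) {
        if (!fm_is_allocated(map, b)) {
            fm_set(map, b, 1);
            return (long)b;
        }
    }
    return -1;
}
```

Because the reserved bits are set once and never cleared,
the allocator needs no special case for the super-block or the i-list.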
	The second portion of the super-block
contains a word giving the size of the
.ul
i-list
and a free map for the i-list.
The usage and structure of the i-list are
discussed in detail below.
Fundamentally, however,
each file on the volume has an entry in the i-list,
called its
.ul
i-node.
The i-list size word indicates the maximum
number of files which can exist on the volume.
This size word is followed
by an i-list free map.
Each bit in the i-list free map
indicates whether the associated i-list entry (i-node)
actually corresponds to a file or is available
for allocation to a file being created.
	Since there is one bit in the i-list free map
for each slot in the i-node table,
the i-list free map consumes  n_/8  bytes,
where  n_  is the i-list map size.
	It is not required that the free maps for
block allocation and i-node allocation and
their associated size words
exhaust the super-block.
Any extra space is simply ignored.
In general, the free map size and the i-list
size are chosen to
maximize the useful space on the volume.
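	As a rough check on the choice of sizes,
the space consumed in the super-block can be computed as sketched below.
The sketch assumes the four items are packed in the order described
(size word, free map, size word, i-list free map) with no padding,
and that each map is rounded up to a whole byte;
the function names are hypothetical.

```c
#include <assert.h>

enum { SUPER_BYTES = 1024, WORD_BYTES = 2 };

/* Bytes used by both size words and both free maps. */
unsigned sb_bytes_used(unsigned free_map_size, unsigned ilist_size)
{
    return WORD_BYTES + (free_map_size + 7u) / 8u
         + WORD_BYTES + (ilist_size + 7u) / 8u;
}

/* Nonzero if the chosen sizes fit within the 1024-byte super-block. */
int sb_fits(unsigned free_map_size, unsigned ilist_size)
{
    return sb_bytes_used(free_map_size, ilist_size) <= SUPER_BYTES;
}

/* Bytes of the super-block left over, which are simply ignored. */
unsigned sb_spare_bytes(unsigned free_map_size, unsigned ilist_size)
{
    unsigned used = sb_bytes_used(free_map_size, ilist_size);

    assert(used <= SUPER_BYTES);
    return SUPER_BYTES - used;
}
```

For example, a volume of 4000 blocks and 400 i-nodes
uses 554 bytes under these assumptions, leaving 470 bytes spare.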
	Referring once again to FIG. X, the second
major portion of
a file system volume consists of the i-list itself.
As indicated above, to each file on the volume
there corresponds an i-list entry, or i-node.
In fact, it is more accurate to say that an i-node
defines a file; for example the i-node contains
the file's addressing parameters, protection information,
and other vital statistics.
	The i-list resides entirely on the
volume, although individual i-nodes are
copied into core memory
during the accessing of a file.
The i-list begins at block 2 of the volume,
just after the super-block.
Each i-node is 32 bytes long; thus
the total number of bytes in the i-list is
|32\u.\dn_ ,
where  n_  is the i-list size word,
and the number of blocks in the i-list is  (32\u.\dn_)/512|
since there are 512 bytes per block.

Following the i-list on the volume are the blocks
which may be allocated to files.
The number of blocks in this region can be calculated
by subtracting the 2 blocks
in the super-block and the blocks in the i-list from the free map
size word.
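	The arithmetic for the three regions of a volume can be sketched as follows.
The structure and function names are hypothetical,
and the i-list block count is rounded up to a whole block,
which the text does not state explicitly.

```c
#include <assert.h>

enum { BLOCK_BYTES = 512, INODE_BYTES = 32, SUPER_BLOCKS = 2 };

struct layout {
    unsigned ilist_start;   /* first block of the i-list (block 2) */
    unsigned ilist_blocks;  /* blocks occupied by the i-list */
    unsigned data_start;    /* first block allocatable to files */
    unsigned data_blocks;   /* number of allocatable blocks */
};

/* Derive the volume layout from the two size words in the super-block. */
struct layout volume_layout(unsigned free_map_size, unsigned ilist_size)
{
    struct layout v;

    v.ilist_start = SUPER_BLOCKS;
    v.ilist_blocks =
        (ilist_size * INODE_BYTES + BLOCK_BYTES - 1) / BLOCK_BYTES;
    v.data_start = v.ilist_start + v.ilist_blocks;

    /* The free map must at least cover the super-block and the i-list. */
    assert(free_map_size >= v.data_start);
    v.data_blocks = free_map_size - v.data_start;
    return v;
}
```

With a free map size of 4000 and an i-list size of 400,
the i-list occupies 25 blocks and 3973 blocks remain for files.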
	It is not required that the
blocks described in the volume super-block
exhaust the blocks actually available on the volume;
any blocks beyond the limit implied by
the size words in the super-block are simply ignored
by the file system routines.
.ul
Structure of I-nodes
	It was indicated above that to each file on a volume
there corresponds an i-node, that is, an i-list entry.
FIG. X illustrates the structure and contents of an i-node.

The first byte contains three flag bits,
illustrated in FIG. X:
	An
.ul
allocated
bit, if "1", indicates that the i-node corresponds
to an allocated file.
This bit should duplicate
the corresponding bit in the i-node free map
table of the super-block.
It is included in the i-node so
as to guard against duplicate allocation of an i-node;
in particular, when a file is being created,
if the i-node free-map bit indicates that
a given i-node may be used for the file,
but the i-node allocated-bit
is "1", that i-node is rejected
from consideration and another is tried.
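	The double check made when an i-node is allocated
can be sketched as follows.
Several details here are assumptions made for illustration:
the i-list is shown as an in-core array rather than as blocks on the volume,
a "1" bit in the i-list free map is taken to mean "in use"
(as in the block free map),
and the position of the allocated bit within the flag byte is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum {
    INODE_BYTES = 32,
    IFLAG_ALLOCATED = 0x80   /* hypothetical bit position */
};

/*
 * Return the index of the first i-node that is free according to both
 * the i-list free map and its own allocated bit, marking it in use in
 * both places.  Return -1 if no such i-node exists.
 */
long inode_alloc(uint8_t *imap, uint8_t *ilist, unsigned ninodes)
{
    assert(imap != 0 && ilist != 0);

    for (unsigned i = 0; i < ninodes; i++) {
        uint8_t mask = (uint8_t)(1u << (i % 8));
        uint8_t *flags = &ilist[i * INODE_BYTES];

        if (imap[i / 8] & mask)
            continue;           /* free map says in use */
        if (*flags & IFLAG_ALLOCATED)
            continue;           /* map says free, i-node disagrees: reject */

        imap[i / 8] |= mask;
        *flags |= IFLAG_ALLOCATED;
        return (long)i;
    }
    return -1;
}
```

An i-node rejected in this way is skipped rather than repaired;
the text does not say what, if anything, is done to reconcile the two bits.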
	A
.ul
directory
bit indicates whether the file associated
with the i-node is a directory file or an ordinary file.
Directories are discussed fully below.
	A
.ul
large
bit indicates whether the file is "large" or "small";
"small" means
smaller than 4096 bytes.
Accessing of small files differs
from that of large files in a manner discussed fully
below.
	Returning to FIG. X, the next byte of an i-node
contains several bits giving the protection mode
of the associated file.
FIG. X illustrates the layout of the
bits in the protection byte.
	One bit grants permission to a non-owner
of the file to write (change) the file.
(The owner of a file is defined below.)
	One bit grants permission to a non-owner of
the file to read the file.
When this bit is "0", the file is "private"
since only its owner may examine its contents.
	One bit grants permission to the owner of the file
to write (change) the file.
It is useful to protect against accidental destruction
of the file.
	One bit grants permission to the
owner of the file to read the file.
It is rare for this bit to be "0", and in fact it is provided
principally for symmetry with the non-owner-read bit.
	One bit grants permission to execute the file as a program.
Read permission is also required to execute the file.
	A last bit indicates that if the file is executed
as a program, then during the execution of the
program the effective user identification of the
invoker of the program will become that of the owner of
the file.
This facility is discussed at greater length below.
	Returning to FIG. X, a byte
in the i-node of a file contains the user identification
of the owner of the file.
Each user of the system is assigned a user number.
Whenever a user creates a file,
his user ID number is placed in this field of the
i-node of the file.
If, say, a user wishes
to read a file, and his user ID is the same as the user-ID field
of the file's i-node, then he is deemed the
owner of the file; otherwise he is a non-owner
for purposes of granting permission.
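	The permission rules given above can be summarized in a short C sketch.
The bit positions within the protection byte are hypothetical
(the actual layout is that of the figure),
as are the function and constant names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit assignments for the protection byte. */
enum {
    P_OTHER_WRITE = 0x01,
    P_OTHER_READ  = 0x02,
    P_OWNER_WRITE = 0x04,
    P_OWNER_READ  = 0x08,
    P_EXECUTE     = 0x10,
    P_SET_UID     = 0x20
};

enum access { WANT_READ, WANT_WRITE, WANT_EXECUTE };

/* Nonzero if the user may perform the requested access on the file. */
int may_access(uint8_t mode, uint8_t file_uid, uint8_t user_uid,
               enum access want)
{
    int owner = (file_uid == user_uid);
    int can_read  = (mode & (owner ? P_OWNER_READ  : P_OTHER_READ))  != 0;
    int can_write = (mode & (owner ? P_OWNER_WRITE : P_OTHER_WRITE)) != 0;

    switch (want) {
    case WANT_READ:
        return can_read;
    case WANT_WRITE:
        return can_write;
    case WANT_EXECUTE:
        /* Execution requires read permission as well. */
        return can_read && (mode & P_EXECUTE) != 0;
    }
    assert(0 && "unknown access kind");
    return 0;
}

/* User ID in effect while the file is being executed as a program. */
uint8_t effective_uid(uint8_t mode, uint8_t file_uid, uint8_t user_uid)
{
    return (mode & P_SET_UID) ? file_uid : user_uid;
}
```

Note that ownership is decided solely by comparing the two user IDs;
the same protection byte thus yields different answers for different users.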
	Another byte of an i-node contains
the number of
.ul
links
to the file.
The number of links may be considered equivalent to the number of names
a file possesses, or to the number of times its i-node number
appears in a directory.
Directories are discussed at length below.
	Two bytes (one word) in each i-node contain
the size of the associated file, measured in bytes.
The size of a file is determined by the offset of
the last byte written into it.
Conversely, the size of a file determines the point at which end-of-file
occurs when reading the file.
	Eight words (16 bytes) of each i-node are devoted
to containing addressing information for the associated file.
The interpretation of these words depends on whether
the file is "large" or "small", as indicated by the "large"
bit in the flag byte;
this bit in turn is determined by whether the size of the file is
smaller than 4096 bytes.
	If the file is small,
the 8 address words contain the block addresses (i.e. numbers)
of up to 8 blocks which store the information in the file.
Refer to FIG. X;
in this example the file occupies
three blocks, whose numbers are stored in i-node block
addresses 0, 1, and 2.
The associated blocks contain the file's contents.
In this example, the file can be at most
512\u.\d3 (1536) bytes long; in general, since there are 8 slots
for addresses, a small file can contain at most
8\u.\d512, or 4096, bytes.
	Refer to FIG. X,
which illustrates a large file.
For a large file, the block address words
of an i-node entry point to
.ul
indirect blocks
rather than to the file contents themselves.
Thus the first i-node block
address contains the block address of an indirect block,
which in turn contains several block addresses
for the file contents proper.
Since there are 256 words (512 bytes)
in a block, and since a block address occupies
one word (two bytes),
each indirect block contains 256 file
contents-block addresses.
In accordance with this scheme, therefore,
files may be as large as 8\u.\d256\u.\d512 (1,048,576) bytes long,
since there are 8 indirect block address words per i-node,
256 contents block addresses
per indirect block, and 512 bytes per block.
In the illustrative embodiment, however,
it has been convenient to restrict files
to fewer than 65,536 bytes,
so that the size can be represented within the 16-bit (2-byte, or 1-word)
size field of the i-node.
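	The two addressing schemes can be compared in a short C sketch
that maps a byte offset within a file to the address words involved.
The structure and function names are hypothetical,
and the sketch follows the general scheme (up to 1,048,576 bytes)
rather than the restricted size of the illustrative embodiment.

```c
#include <assert.h>

enum { BLOCK_BYTES = 512, ADDRS_PER_INODE = 8, ADDRS_PER_BLOCK = 256 };

struct block_path {
    unsigned slot;      /* which of the 8 i-node address words to use */
    int      indirect;  /* index within the indirect block; -1 if small */
    unsigned offset;    /* byte offset within the contents block */
};

/* Locate the byte at `offset` in a small (large == 0) or large file. */
struct block_path locate_byte(unsigned long offset, int large)
{
    struct block_path p;
    unsigned long blk = offset / BLOCK_BYTES;  /* block number within file */

    p.offset = (unsigned)(offset % BLOCK_BYTES);
    if (!large) {
        /* Small file: the i-node address word names the contents block. */
        assert(blk < ADDRS_PER_INODE);
        p.slot = (unsigned)blk;
        p.indirect = -1;
    } else {
        /* Large file: the address word names an indirect block. */
        assert(blk < (unsigned long)ADDRS_PER_INODE * ADDRS_PER_BLOCK);
        p.slot = (unsigned)(blk / ADDRS_PER_BLOCK);
        p.indirect = (int)(blk % ADDRS_PER_BLOCK);
    }
    return p;
}
```

Under the 16-bit size restriction a large file has at most 128 contents blocks,
so only the first indirect block (slot 0) would ever be consulted.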
	Return to FIG. X,
which illustrates the structure of an i-node.
Four bytes are set aside to record the
creation time of the file, and four more bytes to
record the time of last modification for the file
associated with the i-node.
These times may be inspected by the
owner of the file, but are not otherwise used by
the file system.
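	The fields enumerated above can be collected into a C structure
as a summary.
This is a sketch under stated assumptions:
only the first two bytes have a position fixed by the text,
so the order of the remaining fields is a guess;
the flag bit positions are hypothetical;
and since the fields described total 30 bytes,
the last 2 bytes of the 32 are shown as unused.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical positions of the three flag bits. */
enum {
    IFLAG_ALLOCATED = 0x80,
    IFLAG_DIRECTORY = 0x40,
    IFLAG_LARGE     = 0x20
};

/* Describes the layout only; it is not a portable decoder of disk data. */
struct inode {
    uint8_t  flags;       /* allocated, directory, large */
    uint8_t  protection;  /* protection mode bits */
    uint8_t  uid;         /* user ID of the owner (assumed position) */
    uint8_t  nlinks;      /* number of links (assumed position) */
    uint16_t size;        /* file size in bytes */
    uint16_t addr[8];     /* contents or indirect block addresses */
    uint8_t  ctime[4];    /* creation time */
    uint8_t  mtime[4];    /* time of last modification */
    uint8_t  unused[2];   /* assumed: remainder of the 32 bytes */
};

/* Confirm at run time that the structure matches the 32-byte entry. */
void inode_check_layout(void)
{
    assert(sizeof(struct inode) == 32);
    assert(offsetof(struct inode, addr) == 6);
}

int inode_is_large(const struct inode *ip)
{
    return (ip->flags & IFLAG_LARGE) != 0;
}

int inode_is_directory(const struct inode *ip)
{
    return (ip->flags & IFLAG_DIRECTORY) != 0;
}
```

The times are declared as byte arrays so that the compiler
inserts no alignment padding between fields.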

.ul
Directories
.br
It should be clear at this point that
given the i-node number
(or
.ul
i-number)
of a file,
one may access any byte of the file for reading or writing.
Since the i-list is located at
known block addresses of the file system volume, and
i-nodes are of constant length, the location of the
i-node for the file may be easily calculated.
Since the i-node contains the block addresses
of the contents of the file, either directly
or indirectly, it is straightforward
to calculate the block within which a given byte falls,
and thus to read or replace
the specified byte.
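	The calculation of an i-node's location can be sketched as follows.
The sketch assumes that i-numbers index the i-list from zero;
if numbering instead began at 1, one would be subtracted first.
The names are hypothetical.

```c
#include <assert.h>

enum { BLOCK_BYTES = 512, INODE_BYTES = 32, ILIST_START = 2 };

struct inode_pos {
    unsigned block;   /* volume block holding the i-node */
    unsigned offset;  /* byte offset of the i-node within that block */
};

/* Locate i-node `inum` on a volume whose i-list size word is ilist_size. */
struct inode_pos inode_locate(unsigned inum, unsigned ilist_size)
{
    struct inode_pos p;

    assert(inum < ilist_size);
    p.block  = ILIST_START + (inum * INODE_BYTES) / BLOCK_BYTES;
    p.offset = (inum * INODE_BYTES) % BLOCK_BYTES;
    return p;
}
```

Since 512 is a multiple of 32, each block holds exactly 16 i-nodes
and no i-node straddles a block boundary.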
	Nevertheless, the file system scheme
as described so far is incomplete,
simply because it is extremely inconvenient to
refer to a file by an artificial number instead
of by name.
This section discusses the construction and interpretation
of names in the file system.
	A
.ul
directory
is a file containing the names of other files.
The files named in a directory may be ordinary
files or may themselves
be directories.
One particular directory is distinguished as the
.ul
root directory.
The root directory is known to the file system.
A file can be named by giving
the name of the file preceded by a sequence
of directory names, such that each name is contained
in the directory which is its predecessor.
By convention, the names are separated by the "/" character,
and the full name begins with a "/" to indicate the root.
Refer to FIG. X.
In this drawing, the root directory is shown on the
left; it contains the entries "DIR1", "DIR2", "FILE1",
and "FILE2",
as well as "." and "..", which will be discussed below.
DIR1 and DIR2 are the names of directories; FILE1 and FILE2
are ordinary files.
DIR1 contains the two file entries A and B;
DIR2 contains entries DIRA, FILEA, and FILEB.
DIRA is a directory and contains A.
The full name of the FILE1 entry in the root directory
is "/FILE1"; that of the A entry in the DIR1 directory
is "/DIR1/A".
On the other hand, the full name of the A entry in the DIRA
directory is "/DIR2/DIRA/A".
The latter example indicates that
entry names need not be distinct from
all other entry names,
although it is required that the entry
names in any single directory be distinct.
The name of the root directory itself can be given
as "/".
	Each user of the UNIX time-sharing system of which
the file system is a part is assigned a
.ul
current working directory,
which may be any directory accessible by the
user.
A name which does not begin with a "/" is interpreted
with respect to the working directory instead
of the root directory.
Thus, if the working directory is /DIR2,
entry A in its subdirectory DIRA may be referred to as
DIRA/A;
FILEA may simply be called FILEA.
	It is apparent that the directories on a file system
volume have the form of a tree, in which
the root directory is the root of the tree.
Each directory (except for the root)
has a unique parent directory,
namely the one in which
its entry name is recorded.
To aid in traversing this tree,
two special entries in each directory are always present,
namely "." and "..".
"." refers to the directory in which it appears.
Thus, the name "." means the current working directory.
The entry ".." refers to the directory 
which is the parent of the directory in which it appears.
Thus if the current working directory is /DIR2/DIRA,
the name .. means the same as /DIR2, and ../FILEA
means the same as /DIR2/FILEA.
The root directory has no parent, and therefore
its .. entry is made to point to itself.
	In most ways a directory is treated
the same as any other file;
directories are however distinguished
for protection purposes, in that users' programs
are never allowed to write directories;
therefore their format
is controlled by the system.
	FIG. X illustrates the format
of a directory.
A directory consists of a number of independent
entries, each 10 bytes long.
The first two bytes (one word) are interpreted
as the i-number of a file.
The final eight bytes contain the name of the
file.
Entries which have been deleted contain a "0"
in the i-number field.
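	A search of a directory for a given entry name
can be sketched in C as follows.
Two details are assumptions:
that names shorter than eight bytes are padded with zero bytes,
and that the i-number word is stored low-order byte first,
as words are on the PDP-11.
The function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { DIRENT_BYTES = 10, NAME_BYTES = 8 };

/*
 * Search the directory contents `dir` (nbytes long) for `name`.
 * Return the entry's i-number, or 0 if no live entry has that name.
 */
unsigned dir_lookup(const uint8_t *dir, size_t nbytes, const char *name)
{
    char padded[NAME_BYTES] = {0};
    size_t len = strlen(name);

    assert(len > 0 && len <= NAME_BYTES);
    memcpy(padded, name, len);      /* assumed: zero padding to 8 bytes */

    for (size_t off = 0; off + DIRENT_BYTES <= nbytes; off += DIRENT_BYTES) {
        /* Assumed byte order: low-order byte first. */
        unsigned inum = dir[off] | ((unsigned)dir[off + 1] << 8);

        if (inum == 0)
            continue;               /* deleted entry */
        if (memcmp(dir + off + 2, padded, NAME_BYTES) == 0)
            return inum;
    }
    return 0;
}
```

Returning 0 for "not found" is safe here
because 0 in the i-number field already marks a deleted entry.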
