.na
.ds
One of the most interesting notions in the file
system is the special file.

Certain files do not refer to disk files at
all, but to I/O devices.  By convention, such special
files reside in a particular
directory, although this is not necessary.
When a special file is read or written, the device it refers
to is activated.
For example, all the communications interfaces attached
to typewriters have special files associated with them.
Thus anyone who has permission can
send a message to another user simply by
writing information onto that user's typewriter's special
file.
There are special files, for example, to refer to the paper tape
reader and punch, to the 201 dataphone, the console 
typewriter, and whatever other devices may be on the
system.
An effort is made to make these special files behave
exactly the same way that ordinary disk
files behave.  This means that programs generally
do not need to know whether
they are reading or writing on some device
or on a disk file.
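
As a minimal sketch of this uniformity, the same routine below writes to an ordinary disk file and to a device.
It assumes a present-day UNIX-like system and uses Python's os module as a thin wrapper over the system calls; the name write_message is hypothetical, and the null device stands in for a typewriter's special file, which would need a real terminal to demonstrate.

```python
import os
import tempfile


def write_message(path, message):
    """Open path for writing and write message; return the byte count written."""
    fd = os.open(path, os.O_WRONLY)
    try:
        return os.write(fd, message)
    finally:
        os.close(fd)


# An ordinary disk file (a temporary file created just for this example).
tmp_fd, tmp_path = tempfile.mkstemp()
os.close(tmp_fd)
n_file = write_message(tmp_path, b"hello\n")
with open(tmp_path, "rb") as f:
    file_contents = f.read()
os.remove(tmp_path)

# A special file: os.devnull names the null device, which discards its input.
n_device = write_message(os.devnull, b"hello\n")
```

Nothing in write_message inspects what kind of file it was given; only the path differs between the two calls.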

The system calls to do I/O are designed to be very simple
to use as well as efficient.
There is no notion corresponding to GEFRC on the Honeywell machines
or to "access methods" in OS, because the direct use of
system entries is so straightforward.

Files are uniformly regarded
as consisting of a stream of bytes;
the system makes no assumptions as to their contents.
Thus the structure of files is controlled solely by the programs
which read and write them.
A file of ASCII text, for example, consists
simply of a stream of characters, with lines delimited by
new-line characters.
The notion of physical record is fairly well submerged.

For example, the system entry to read a file has only
three arguments:
the file which is being read; the location where
the information is to be placed; and the number of bytes
desired.
Likewise the write call need only specify the
file under consideration, the location of the information,
and the number of characters to write.
The system takes care of splitting the read or written
information into physical blocks
as required.
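
The shape of these two calls can be sketched as a simple copy loop.
This is an illustration under assumptions, not the system's own code: it uses Python's os.read and os.write, where the "location" argument is implicit (os.read returns the bytes rather than filling a caller's buffer), and copy_file is a hypothetical name.
The request size of 100 bytes is chosen on purpose to bear no relation to any physical block size.

```python
import os
import tempfile


def copy_file(src_fd, dst_fd, count=100):
    """Copy src_fd to dst_fd, asking for `count` bytes per read call."""
    total = 0
    while True:
        data = os.read(src_fd, count)       # the file, and the number of bytes desired
        if not data:                        # an empty result means end of file
            return total
        offset = 0
        while offset < len(data):           # write may accept fewer bytes than offered
            offset += os.write(dst_fd, data[offset:])
        total += len(data)


src_fd, src_path = tempfile.mkstemp()
dst_fd, dst_path = tempfile.mkstemp()
payload = b"x" * 1300                       # not a multiple of 100 or of 512
os.write(src_fd, payload)
os.close(src_fd)

src_fd = os.open(src_path, os.O_RDONLY)     # reopen so reading starts at the beginning
copied = copy_file(src_fd, dst_fd)
os.close(src_fd)
os.close(dst_fd)

with open(dst_path, "rb") as f:
    result = f.read()
os.remove(src_path)
os.remove(dst_path)
```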

The I/O calls are also apparently synchronous;
that is, for example, when something is written,
so far as the user is concerned, the writing has already
been done.  In fact the system itself contains buffers
which hold the information, so that the physical
writing may be delayed.

There is no distinction between "random" and sequential
I/O.
The read and write calls are sequential in that, for example,
if you read 100 bytes from a file, the next read call
will return bytes starting just after
the last one read.
It is however possible to move the read pointer around
(by means of a "seek" call) so as to read the file
in any order.
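
A short sketch of this behavior, assuming Python's os.lseek as the present-day form of the "seek" call; the file contents (the byte values 0 through 255) are made up for the example so that each byte records its own position.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.write(fd, bytes(range(256)))     # byte i of the file has the value i
os.lseek(fd, 0, os.SEEK_SET)        # move the pointer back to the start

first = os.read(fd, 100)            # bytes 0 through 99
second = os.read(fd, 100)           # continues just after the last byte read

os.lseek(fd, 10, os.SEEK_SET)       # reposition to byte 10
again = os.read(fd, 5)              # bytes 10 through 14, read a second time

os.close(fd)
os.remove(path)
```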

I should say that it is not always
desirable to ignore the fact of physical record sizes.
A program which reads one character at a time from a file is
clearly at a disadvantage compared to one which reads many,
if only because of system overhead.
Thus I/O bound programs are well-advised to read and
write in multiples of the physical
record size (which happens to be uniformly 512 bytes).
But it is efficiency,
not a logical requirement, which dictates this.
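
To make the overhead concrete, the sketch below counts how many read calls are needed to consume the same file one byte at a time and 512 bytes at a time.
It assumes Python's os.read on a present-day system; count_reads is a hypothetical helper, and the four-block file is invented for the example.
It counts calls only and does not measure time.

```python
import os
import tempfile

BLOCK = 512                                  # physical record size given in the text


def count_reads(path, size):
    """Read the whole file `size` bytes at a time; return the number of reads that returned data."""
    fd = os.open(path, os.O_RDONLY)
    calls = 0
    try:
        while os.read(fd, size):             # an empty result marks end of file
            calls += 1
    finally:
        os.close(fd)
    return calls


fd, path = tempfile.mkstemp()
os.write(fd, b"\0" * (4 * BLOCK))            # a file of exactly four blocks
os.close(fd)

byte_at_a_time = count_reads(path, 1)
block_at_a_time = count_reads(path, BLOCK)
os.remove(path)
```

Both loops see identical data; only the number of trips into the system differs.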

PROBLEMS

I mentioned earlier that UNIX was not especially suited
to applications involving vast quantities of data.
The reason is this:  files are limited in size to 64K bytes.
This limit is not particularly defensible,
but it has to do with the fact that the PDP-11 word
size is 16 bits.
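
The arithmetic can be checked in a few lines.
The sketch assumes, as the text only hints, that a file's size is held in a single 16-bit word; the 512-byte block size is the one given earlier.

```python
WORD_BITS = 16                                   # PDP-11 word size
BLOCK_SIZE = 512                                 # physical record size in bytes

# Assumption: the size of a file must fit in one word.
max_file_bytes = 2 ** WORD_BITS                  # distinct values a word can hold
max_file_blocks = max_file_bytes // BLOCK_SIZE   # the same limit counted in blocks
```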

There are a couple of ways around this problem.
One of them is simply to split one large logical
file  into several smaller actual files.  This approach
works for a while.
The limitation here comes from the fact that
directories are searched in a linear fashion.
Thus if there are a vast number of files, it can become quite
time-consuming to search directories to find
the files they contain.  We have not noticed
this to be a problem so far;
it is only a worry.

Another way around the small file size is
to use a disk as a special file.
For various reasons, when an entire disk
drive is accessed as a special file,
the size limitation does not occur.
Thus one can set up a program which manages its own data--
in effect is its own, special-purpose file system--
and expect reasonable results.

This again bears on the question of general-purpose versus
special-purpose systems: it probably is more efficient
anyway to do your own data management, provided
the gain is worth the extra labor.

PROCESS CONTROL

As I said, the second part of UNIX is that part
concerned with process control.
A process in UNIX is simply the execution of a program.
Each user has at least one process working on his
behalf: its task is to read his typewriter
and interpret what he types as commands to the system to
do something.
The program associated with this process is called the
Shell, and it has many valuable features, including
the redirection of I/O, so that you can execute programs
which ordinarily write, for example, on the typewriter,
and arrange that their output go on a file.

I will not go into any details, except to say
that either by use of the Shell, or from within
a program, it is possible to create an asynchronously
running process executing any program designated.
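
For the curious, here is a rough sketch of what creating such a process from within a program can look like, including the redirection of its output onto a file.
It is not a description of the Shell's actual code: it assumes the present-day calls fork, exec, dup2 and waitpid as exposed by Python's os module, assumes an echo program is available on the system, and start_redirected is a hypothetical name.

```python
import os
import tempfile


def start_redirected(argv, out_path):
    """Start argv in a child process whose standard output goes to out_path.

    Returns the child's process id at once, without waiting for it to finish.
    """
    pid = os.fork()
    if pid != 0:
        return pid                           # parent: carry on immediately
    try:                                     # child: redirect, then become argv
        fd = os.open(out_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        if fd != 1:
            os.dup2(fd, 1)                   # descriptor 1 is standard output
            os.close(fd)
        os.execvp(argv[0], argv)             # does not return unless it fails
    finally:
        os._exit(127)                        # reached only if the open or exec failed


out_fd, out_path = tempfile.mkstemp()
os.close(out_fd)

child = start_redirected(["echo", "hello"], out_path)
# The parent is free to do other work here while the child runs.
_, status = os.waitpid(child, 0)             # eventually collect the child

with open(out_path, "rb") as f:
    output = f.read()
os.remove(out_path)
```

The program being run needs no change: it writes to its standard output as usual, and the redirection is arranged before it starts.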

SUMMARY

If you are interested in using UNIX, there
are a number of points of which you should be
aware.

First, having to do with the PDP-11 hardware:

The PDP-11, although probably more powerful
than most people realize, is not a large machine:
a PDP-11 can only accommodate 28K 16-bit words
of core.

Moreover, the 11/20 has no hardware protection
features: any user can at any time crash the system 
by executing a program with any of an
infinite variety of bugs.
This fact is probably most important during
program development.

The PDP-11/45 essentially solves both of these
problems, in a very cost-effective way--
it is hardly more expensive
than an 11/20 when the total system cost is
considered.
It has hardware segmentation, and 256K of core
can be attached.
Since we will be one of the first to get
an 11/45, there will definitely be
a UNIX on it very soon after it arrives.
(However, the date is still uncertain.)

Perhaps more important is the fact that
UNIX is essentially a two-man operation at present.
Anyone who contemplates a UNIX installation should
have available some fairly sophisticated
programming talent if any modifications are
planned, as they almost certainly will be.
The amount of time that we can spend
working on behalf of, or even advising,
new UNIX users
is limited.
Documentation exists, but never seems
to be complete.

There have been rumblings from certain departments
about taking over the maintenance
of UNIX for the public (i.e., other Labs users)
but I cannot promise anything.
