PRIMER
                                  on
                         FILE TRANSFER METHODS

                                  by

                             Terry Smythe
                           Sysop, Z-Node 40

INTRODUCTION
 
From time to time, many of us have had a need to transfer data from our
own micro computer to another micro computer.  The 2 computers might be
close together, or they may be across town or across the nation.  It's
when they are not close together that problems occur because the
telephone system becomes the link between the two computers, rather
than a short piece of shielded wire.

This "voice" telephone network is not "clean", and is a major source of
aggravation and heartburn as it far too often adds extraneous
characters commonly referred to as "line noise", to legitimate data as
it moves through the wire.  The net effect is a document, file,
program, etc., that is often made virtually unusable by the involuntary
insertion of these additional characters.

The problem has its origin with the historical development, around the
turn of the century, of our nationwide telephone network.  Basically,
solid copper connected telephones across town, across the nation,
across the continent, and it had many mechanically actuated switches
scattered along the way.  These switches and many line connections
eventually wear out, provoking a progressive deterioration of network
efficiency.

Many of these switches and much of the wire passes near other
electrical equipment throwing off magnetic fields which adds to
unreliability because of the principle of induced current.  A magnetic
field need only be brought near copper wire to "induce" a foreign
current in it, in addition to legitimate data also passing through it
at the same time.

Until the late '40's, early 50's, there was no real concern because the
network carried only "voice" signals.  The combination of the human
ear, coupled to an discerning brain, had the wonderful natural ability
to filter intelligent conversation.  What remained was simply thought
of as "static", and largely ignored.


CURRENT SITUATION
 
These "voice" telephone circuits are still a major highway over which a
huge amount of computerized data now flows.  Unfortunately, computers
still lack the smarts inherent within a typical human brain, and have
great difficulty sorting out the good from the bad in what is passing
through the wire.  The ability to ignore the junk along the way is
still an elusive goal.

True, networks have been made much more reliable in recent years, with
the development and implementation of microwave networks, fibre optics,
electronic switches, etc.  Each in their own way contributing in some
significant way to reducing the amount of "noise" passing through the
network.

Today, the network is no longer solid copper wire.  It is more likely a
mix of solid copper wire, microwave transmission, and fibre optics.
However, even in a contemporary system there are still elements of the
old copper network with its increasingly unreliable mechanical
switches, deteriorating connections, etc.

The bottom line is that network quality is really a function of the
weakest link between 2 points.  If your computer is connected into a
computerized electronic switch you will enjoy relatively clean data
transmission if you are connected to another computer which itself is
connected into a computer switch, possibly the same one in the
telephone company's local switching centre.

However, if one of the computers is still hooked into an older
mechanically switched centre, significant line noise is inevitable.
Curiously, it can make itself known at one end, and not the other;
where one user can be looking at an incredibly scrambled screen
display, while the other user is looking at a spotless screen display,
wondering what's going on.

The net result here is a high probability that when data is sent
through the wire in an uncontrolled single trip fashion, it will pick
up line noise somewhere along the way.  In the case of conventional
ASCII data (plain, readable text), the presence of a few extraneous
characters is of no great consequence, relatively easy to detect, and
fix with your favorite word processor or full screen editor.

However, in the case of computer programs, the presence of a single
extraneous character can make it totally unuseable.  Detection is
almost impossible, certainly for the neophyte, and patching the fix is
not a trivial task.  A considerable array of computer smarts is needed
here, something that most users neither have nor want.

 
ERROR DETECTION
 
In the eyes (brain?) of the computer, characters introduced by line
noise are every bit as legitimate as live data passing through
simultaneously.  It would take extraordinary programming talent to
develop a communications system capable of sorting out the good from
the bad as it streams by.  The task is mind boggling, such that it
simply is not done.

However, error detection methods have been developed, are in place now,
and are quite effective and reliable.  Fundamentally, they are all
based upon the perception that if errors have been introduced enroute,
then the file as received will be different from the file as
transmitted.  They do not really care what the specific error is, only
that an error of some kind has been detected.

Most of these error detection methods are based upon some simple
arithmetic formulae applied to the same file at both ends, and the
results compared.  If the file as received has the same result as the
file when sent, then it's reasonable to assume that the file has been
transferred correctly.  If different, then the file must be sent
through again, and again, until it does come through the wire clean and
correct.

Doing such a calculation on an entire file is very inefficient. You
really should not have to find out after an hour's transmission that
errors have crept in.  At this rate, it could take days to send a large
file through the wire accurately.

The simple fix is to break up the file into small blocks, typically 128
bytes long.  This way, only those blocks where an error has been
detected, need be re-transmitted again.  So, a file of say 1500 blocks
might take about an hour to transmit cleanly.  Even on a noisy line, a
maximum number of bad blocks likely would not exceed 30-50.  In this
way, that portion of the file needing to be re-transmitted is reduced
to a manageable level.

 
ERROR DETECTION METHODS
 
In August 1977, Ward Christensen, a pioneer in data communications,
developed a method of file transmission with simultaneous error
detection.  He simply called it MODEM2 (Release 2.0), but very quickly
it became affectionately known as the "Christensen" protocol.

In its simplest form, this original, somewhat primitive error detection
scheme added up the values of all characters in the 128 byte data
stream, and sent this value through the wire.  The receiver meanwhile
was adding up the values of the characters as they arrived, and
compared the result with the "CHECKSUM" value sent through by the
sender.

If these 2 numbers did not agree, the receiver sent through a code
telling the sender to repeat the transmission of that bad block.  This
process was repeated, if necessary, up to 10 times for a particular bad
block.  Only when the 2 numbers were identical, did the receiver send
through a code acknowledging correct block received.  The sender would
then move on to the next block of 128 characters, repeating the process
all over again.

This early method of error detection was deliberately made super-
simple, so that it could apply to a whole host of different machines,
under an almost infinite array of data transmission conditions.
However, because of its simplicity, it did let a few technically
obscure errors sneak through.  Consequently, Ward Christensen and Chuck
Forsberg collaborated in the development, and release in 1982, of the
CRC (Cylic Redundancy Checking) error detection scheme which has
remained in widespread use to this day.

Because it guarantees a minimum level of error detection confidence of
not less than 99.9969%, CRC is accepted as a reliable method of
ensuring clean and accurate file transfer.  Most systems of file
transfer now employ CRC, or a derivative of it, as their principal
method of error detection.  Please note this is error detection, not
error correction, a function still best left to human intelligence.

Uncertain how or when, but this protocol became universally known as
XMODEM.  The original CheckSum method was never abandoned, and to
distinguish between them, they are universally known as:

     XMODEM     - CheckSum protocol

     XMODEM CRC - CRC protocol

Where the CheckSum method simply added up the values of the characters
in a 128 character block, the CRC method does sequential division on
each character in the block, resulting in a significant improvement in
error detection.  Looks something like this:

                    Discard Quotient
          _____________________________________
Constant )    Character
 
          Constant x Quotient
          --------------------------------
          Remainder + next character

             Constant x Quotient
             ---------------------------------
             Remainder + next character

                 etc

                 Constant x Quotient
                 ----------------------------------
                  Remainder       <-- CRC Value

     Note: Constant = (X+1)(X15+X+1)

When there are no more characters for sequential division, the final
remainder is the CRC value sent through by the sender.  The receiver
applies the same calculation to the incoming characters, and compares
the results with the incoming CRC value.   If equal, the block is
acknowledged and the next block is allowed to come through.  Inequality
would require re-transmission of the block, to a maximum of 10 times.
If still unequal after 10 tries, the transmission will be automatically
terminated.

 
ENHANCED FILE TRANSFER METHODS
 
With normal equipment upgrades, such as microwave and fibre optics,
telephone companies around the world have progressively improved their
abilities to transfer data more reliably over voice grade lines.  As
line quality improves, line "noise" decreases, and data files may be
successfully transferred with fewer "hits".   In fact, it is
commonplace today to experience file transfers with no "hits" at all.

This improvement in data transmission capability provoked a realization
that the 128 character block size had become inefficient because of its
associated overhead.  Furthermore, new methods of data transmission,
such as DATAPAC, resulted in dramatically inefficient use of the
telephone network. (e.g. a DATAPAC "packet" capable of carrying 1024
characters was carrying only 128 characters!)

To overcome this inefficiency, Chuck Forsberg developed the YMODEM
protocol, where the block size was increased to 1024 characters.  In
it, he inserted a rather nifty feature where the protocol would
automatically step down to 128 character block size if line noise got
so bad as to degrade elapsed file transmission time.  This auto step-
down has been universally adopted at 3 consecutive "hits" (bad blocks).

The YMODEM protocol has only a modest improvement in elapsed file
transmission time over the conventional voice network.  However, it
provided a dramatic improvement on the DATAPAC network by simply using
the packet size more efficiently.

Not satisfied with this improvement, Chuck Forsberg continued with his
development activities and came up with YMODEM BATCH.  This allowed
rapid transmission of a group of files sequentially, to reduce the
overhead associated with keyboard entries to set up the communications
programs at both ends with the transfer of each file.

While YMODEM is referred to as a protocol, it really is a "method" of
file transfer.  The CRC protocol is still in use at its heart, no
matter if in 128 or 1024 character block size.

Ever vigilant to technological developments, Chuck continued to
perceive opportunities for further improvements and has recently
developed and released to public domain a new file transfer protocol
which he calls ZMODEM.  It is a new, sophisticated protocol aimed at
efficient file transfer with time sharing systems, satellite relays,
and wide area packet switching networks.

ZMODEM will work only if both ends support this new protocol, but it
has built into it a fall-back routine whereby it will automatically
fall-back to YMODEM protocol, if ZMODEM is not supported at the other
end.   It uses a "streaming" technique whereby data is flowing
continuously, with simultaneous error detection in a moving window of
up to 256 characters, depending on line quality, using the capabilities
of the full duplex network.

This is an oversimplified description of ZMODEM.  It is quite
sophisticated, complex to learn and use, and not yet in widespread use.
No attempt will be made here to describe this in anything other than
this crude overview.  Those interested otherwise are encouraged to read
Chuck Forsberg's paper on his ZMODEM protocol (ZMODEM.DOC).

There are other protocols, some somewhat obscure, some very complex,
and some proprietory.  For example, KERMIT, MNP, BLAST, BISYNC, SDLC,
HDLC, X.25, X.PC, etc., which, with perhaps the exception of Kermit,
are not in widespread use, tend to be tightly bound to the fortunes of
their suppliers, and which the average users will not likely encounter.
Suffice to note their presence so those interested may do additional
research.

 
BAUD RATES
 
While not normally a function of file transfer methods, it does seem
appropriate to briefly consider the speed at which data flows through
the telephone wire.  BAUD is simply an international unit of
measurement that has become synonymous with BPS (Bits Per Second).  The
latter has come into popular useage, and tends to be a much more
meaningful term.

Most users will encounter modem/computer/communications system
configurations using baud rates of 300 BPS, 1200 BPS, and 2400 BPS.
Lower or higher baud rates are still extremely rare.  By a huge margin,
the most popular is 1200 Bits Per Second, and is the one most
frequently recommended, and at modest cost.  300 baud configurations
should be avoided for they deliver data through the wire at painfully
slow speeds.  In fact, 300 baud becomes cost prohibitive if employed
over long distances.

By way of comparison of how long it may take to transfer a file over
the wire at various baud rates, consider the following example of a
typical file taking 24 minutes to pass through the wire at 300 baud:

     File transferred at 300 baud  - 24 minutes

     Same file at 1200 baud        -  6 minutes

     Same file at 2400 baud        -  3 minutes

Modems capable of transferring files at baud rates higher than 2400 are
available, but they are complex, expensive, and typically require the
identical modem at both ends, because of the absence to date of
consistent universal standards of methods of file transfer at 4800 and
9600 baud.  These standards will ultimately emerge, but for the
present, most users will likely choose to stay with proven techniques
at baud rates of 2400 or 1200.

COMMUNICATIONS TOOLS

There is no shortage of software out there to achieve reliable data
communications, using these protocols.  It ranges from costly dedicated
utilities, such as for AES equipment, to low cost generic systems
placed into the world of "ShareWare" software.  A few of the more
prominent of these ShareWare products are:

PROCOMM v. 2.42 - Excellent,  supports all protocols
                  but ZMODEM

QMODEM v 2.4    - Very good, supports most protocols.

ZCOMM v 2.0     - Excellent,  but complex.  Supports
                  all protocols discussed here.

These are good, and they are cheap.  As with most ShareWare software
products, prices in the $40 - $60 (U.S.) range are commonplace.  There
are others, too many to discuss here.  See them out, do your homework,
choose that which suits you best.

 
SUMMARY
 
Most users will be presented with the following optional methods of
transfering files from one micro-computer to another:

1. ASCII

   Straight one way trip of data without any form of error detection in
   place.  Highly vulnerable to data corruction by normal line noise
   adding extraneous characters to the file.

2. XMODEM

   Very early method of file transfer, using primitive CheckSum
   protocol, at a fixed 128 character block size.  Risk of a few
   obscure errors slipping through.

3. XMODEM CRC

   Reliable method of file transfer using the CRC protocol at a fixed
   128 character block size.  Not very efficient, but highly compatible
   with most communications systems.

4. YMODEM

   Reliable method of file transfer using the CRC protocol at both 128
   and 1024 character size blocks.  Reasonably efficient, and
   reasonably compatible with many communications systems.

5. YMODEM BATCH

   Reliable method of file transfer using the CRC protocol at both 128
   and 1024 character size blocks, with an added option of sending a
   number of files in Batch mode.  Quite efficient, and marginally
   compatible with a few communications systems.

6. ZMODEM

   Sophisticated, reliable, and efficient method of file transfer,
   using a modified CRC protocol of up to 256 character block size with
   auto step-down in accordance with line quality.  Marginally
   compatibility with very few communications systems.  Currently
   rarely found.

 
TYPICAL APPLICATIONS
 
1. Generic public domain software may be reliably transferred between 2
   computers having incompatible disk formats.  If the 2 computers are
   together in the same room, they may simply be connected together
   through their serial ports, then 2 compatible communications systems
   may then facilitate file transfer between the 2 at very high speeds
   (baud rates).  Over distance, the same can be achieved with modems
   at both ends, and "talking" over the voice telephone network at much
   lower baud rates.  The CRC protocol virtually guarantees accurate
   transfer.

2. Text files, saved in ASCII form, may be transferred over the
   telephone wires to most any location.  Typical application  might be
   the content of a magazine article, or a book, where the author
   finalizes content as to language and spelling, etc., and transmits
   it to a printer.  While faithfully preserving content, the printer
   sets it up as to publication format, type style, etc., and can go
   directly to press.  A "proof" copy is not sent back to the author
   for proofreading.  The CRC  protocol virtually guarantees error free
   transfer, where ASCII would be a disaster.

3. Business data, such as accounting, inventory, sales, etc.,  may be
   reliably transferred, using the CRC protocol from a remote site, to
   a central computer for consolidated processing.  Also possible to
   set up this kind of file transfer as an automatic interrogation
   during the middle of the night when rates are lowest.

 
CONCLUSIONS
 
There was a time (not so long ago) where it was considered quite
inapproriate to use a computer to send files through the voice
telephone network.  Between an absence of standards on file transfer
protocols and line noise, received files were rendered almost totally
useless.

However, this is no longer true.  Reliable file transfer protocols are
now in place, and files may now be transferred between micro-computers
with a high degree of reliability.

While this may mean reduced revenues for some industries, in particular
the publishing industry where transcriptions (and related revenues) can
now be a thing of the past, business and industry can look forward to
substantial improvements in staff productivity and significant
reductions in publishing costs, by the application of the "type-it-
once" principle.

In these tight money times, now is the time for business and industry
to be creative in their use of micro-computers and data communications
capability.

Were it not for the vision and foresight of Ward Christensen and Chuck
Forsberg, and others like them, these wonderful tools would be denied
to us, and companion benefits unattainable.   What they have done is
indeed very much appreciated today.  They are to be commended for their
achievements.


Terry Smythe
Sysop, Z-Node 40
55 Rowand Ave
Winnipeg, Manitoba
Canada   R3J 2N6
(204) 832-3982 (Voice-res)
(204) 832-4593 (Z-Node)
(204) 945-6713
15 Apr 87