The Source of All Tape Knowledge

by VaX#n8

"Knowing what I know, I'd say they are smarter than a sea turtle
but dumber than Danish physicist Niels Bohr."
-- MST3k

Definitions

Filemark
A chunk of "out-of-band" info which is usually written to a tape to indicate an end-of-file condition. Two in sequence tend to indicate end-of-tape. Unix creates a filemark on the tape when you close the tape device after writing to it.
Inter-Record Gaps
A "blank space" on the tape which indicates the division between two records (not files).
Block or Record
Unfortunately for Unix, tapes work in a "block" format, with framing around each block, sort of like the start and stop bits on an RS-232 serial line. On some media types, the block size is adjustable; on others, it is not. The unfortunate bit is that Unix likes to view things as octet (8-bit byte) streams, not streams of some particular block size. This "impedance mismatch" is what causes most of the complications one encounters when using tapes in Unix.
The terms "block" and "record" tend to be used interchangeably.
Blocking Mode
Some tape drives can be in one of two modes: fixed block size and variable block size. In fixed block size mode, the block size is n, for some n > 0. Setting the block size to n = 0 puts the drive in variable blocking mode. Some tape drives, however, have a permanent fixed block size. When in fixed block mode, it is usually an error to try to write(2) partial blocks.

Other Info Sources

How the Tools Work

Generic Tape Handling

Under Unix, tape handling semantics are tricky. After an initial open, a read will return one tape block (as determined by the device driver) or the requested number of bytes, whichever is smaller. When the block is smaller than the request, the program sees a short (partial) read. Because many programs interpret a short read as an error condition, they may function incorrectly.

A write command has different effects depending on the status of the drive. If the drive is in variable blocking mode, it will write the entire buffer out to the tape as one block. If it is in fixed blocking mode, it will pad the block with NULs if the requested size is smaller than the block size. In both cases it will only write a single block of data. If a program tries to write more data than will fit in a block, only a partial write is performed; many programs (such as dd) will then fail, since they interpret a write(2) call returning a smaller value as an error.
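
To make the fixed-mode padding concrete, here is a minimal Python sketch of what the paragraph above describes. The name pad_to_block and the 1024-byte block size are made up for illustration; this only models the behaviour and does not talk to a real drive or driver.

```python
def pad_to_block(data: bytes, block_size: int) -> bytes:
    """Model a fixed-block write: NUL-pad a short buffer up to block_size.

    Hypothetical helper; on a real system the padding is done by the
    driver or the drive, not by user code.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive in fixed-block mode")
    if len(data) > block_size:
        raise ValueError("buffer does not fit in a single block")
    return data + b"\0" * (block_size - len(data))


record = pad_to_block(b"hello", 1024)
print(len(record))  # 1024
```

Note that the reader gets all 1024 bytes back, NULs included, and has no way to tell where the real data stopped.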

If you try to read(2) a filemark, Unix will skip over the filemark and not put any data into the input buffer, and your read(2) will return a count of zero. This is interpreted as end-of-file by most programs, hence the normal connotation that a filemark equals an "end-of-file".
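
The usual read loop that follows from this can be sketched in Python as below. The function name read_tape_file and the 64k buffer size are assumptions for illustration, and a pipe stands in for the tape device so the example runs anywhere; on a real tape each read would return one record.

```python
import os


def read_tape_file(fd, bufsize=65536):
    """Collect chunks from fd until read() returns a count of zero.

    On a tape device a zero count means a filemark was hit; on a pipe
    or plain file it is ordinary end-of-file. Hypothetical helper.
    """
    records = []
    while True:
        chunk = os.read(fd, bufsize)
        if not chunk:  # zero-length read: treat as end of this tape file
            break
        records.append(chunk)
    return records


# A pipe stands in for the tape device here.
r, w = os.pipe()
os.write(w, b"one record")
os.close(w)
print(read_tape_file(r))  # [b'one record']
os.close(r)
```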

You cannot write(2) a filemark. It is done by the device driver hardware, usually when you close(2) a file descriptor opened on the tape device.

tar

The name tar stands for "tape archive".

The tar program is designed to write data out in blocks; it usually writes data out in an integer multiple of 512 bytes, as defined by the "b" option. For example, using "-b 20" under GNU tar makes it write 20x512=10240 bytes per write(2) call.
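
The blocking arithmetic is simple enough to write down. This is a small Python sketch; the helper names tar_record_size and records_needed are invented for the example.

```python
TAR_BLOCK = 512  # tar's basic unit, in bytes


def tar_record_size(blocking_factor):
    """Bytes handed to each write(2) call for a given -b value."""
    return blocking_factor * TAR_BLOCK


def records_needed(archive_bytes, blocking_factor=20):
    """How many full-size records an archive of this size occupies."""
    record = tar_record_size(blocking_factor)
    return -(-archive_bytes // record)  # ceiling division


print(tar_record_size(20))    # 10240
print(records_needed(25000))  # 3
```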

An interesting tidbit is that certain tar programs appear to recognize the last block of a tar archive, and stop reading. This means that they don't consume the filemark that comes after the tar file itself.

From: Danny Willis

Different versions of tar seem to handle the end of a tar file differently.

Sometimes they write a short block at the end of the tar file, sometimes they fill with zeros, sometimes they fill with garbage.

Then when different versions of tar read those different concepts they bomb, get confused, etc. For instance, we recently had a case where the customer's tar prompted for another tape after the end of the tar file.

File Formats

There are, to my knowledge, several slightly different tar file formats:

See also the NetBSD pax(1) manpage and the star manpage.

Note that tar probably will not record all the information your filesystem supports (like ACLs or hard-links). You may prefer to use dump to back up important filesystems.

Solaris 2.4 tar

The SunOS 5.4 tar is "overoptimized". The last block written may be a "short" or "partial" block; I have observed final blocks of size 2048 (4x512b) and 2560 (5x512b) instead of 10240 (20x512b). Thus, if you write one of these "short tar files" and try to read it with some other tar programs, they sometimes fail in strange ways (like prompting for another tape).

AIX tar

AIX 4.* tar has a small but strange bug. If your tar file does not include leading directories (for example, if you specified a file list instead of a directory hierarchy to tar up), it creates them with the wrong permissions (0440), and then cannot write the files or subdirectories into them. This is documented in APAR IX57029, and fixed by PTF U442839.

AIX actually documents the tar file format in their man page (hooray for them). Take a look at the version converted to HTML.

GNU tar

All around, the coolest of the bunch. Go visit its home page.

My recommendation is to use this tar when generating tar files, even when they are going to be read by the native tar on a system. I have had no problems using it. I can't recommend it enough. It is conservative in what it generates, and liberal in what it accepts, like all good programs.

Unfortunately, the code is a mess, and I don't think it exits with an unsuccessful (nonzero) value if it hits an end-of-tape. I have also heard it's not POSIX compliant.

Der Mouse's tar

This is something der Mouse wrote to correctly catch all filesystem information (at least on his system).
I find I have two versions of it which I need to sit down and reconcile against one another. In case you want to play with it in the meantime, check out the tar.* files here and tell me what you think. (tar.c is one of the two source files I find handy, tar.c+ is the other. tar.1 is a manpage, tar.README should probably be fetched and read before doing anything with the rest. Hm, I hope I don't find yet more versions lying around....)

I wrote it de novo quite some time ago and have been improving it incrementally ever since. One large site I know of is using it now for their live-filesystem backups; it does as well as is possible against the Zwicky backup program torture tests.

star

The star program is a featureful tar. Visit the star homepage for more info.

dd

The name dd stands for "copy and convert". Don't see it? Well, "cc" was already taken for the C compiler, so the author chose the next letter in the alphabet. The syntax has sort of an evil, JCL-like quality to it. According to The Jargon File, the interface was a prank.

Using dd

Most people use dd incorrectly. This is because dd is a piece of junk that should be replaced.

For example, one common misuse of dd is to try to get 64k blocks written to the tape with this command:

tar -cf - args... | dd of=/dev/rmt8 bs=64k

This won't work because, as you will see below, the bs argument gives you only one buffer. The dd process will attempt to read 64k chunks from the pipe into this buffer, but will only receive a maximum of PIPE_BUF bytes (usually 4 or 8k). It will then write this buffer out to the tape as a single record (it will not pad this block to 64k, fortunately).
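
The short read is easy to observe without a tape drive. Below is a Python sketch in which one process plays both the writer (tar, in the command above) and the reader (dd); the function name short_read_demo and the 4096-byte chunk are chosen for illustration only.

```python
import os


def short_read_demo(written, requested):
    """Write `written` bytes into a pipe, then ask for `requested` bytes.

    Returns how many bytes the single read() call actually delivered.
    Keep `written` small so the write cannot block on a full pipe.
    """
    r, w = os.pipe()
    try:
        os.write(w, b"x" * written)    # the producer emits a small chunk
        chunk = os.read(r, requested)  # the consumer asks for a big one
        return len(chunk)
    finally:
        os.close(r)
        os.close(w)


print(short_read_demo(4096, 64 * 1024))  # 4096
```

The read returns as soon as anything is available, so the consumer sees 4096 bytes, not 64k. A single-buffer dd would write exactly that to the tape as one record.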

GNU dd

This data taken from GNU fileutils 3.12

When dd starts up, it parses all the arguments on the command line in order. Note that the bs= argument will override any previous ibs= or obs= arguments. If neither the obs nor the ibs argument is presented, and bs is given, and no character-translation conversions are performed, then only one buffer will be used (more on this later). In all other cases, two buffers (input and output) are used. If you don't specify any *bs args, ibs and obs default to 512.

Next, based on the translations that you have specified, dd builds a translation table. This table is a 256 entry array, specifying a character-by-character mapping that is the composite of all specified translations. The actual order of application of translations is not the same as what is on the command line. It is:

  1. ebcdic_to_ascii
  2. lower_to_upper
  3. upper_to_lower
  4. ascii_to_ebcdic
  5. ascii_to_ibm
Note that not all conversions can be specified at once. You have your choice of only one conv in {ascii,ebcdic,ibm}, {lcase,ucase}, {block,unblock}, {unblock,sync}.
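
The idea of one composite 256-entry table can be sketched in Python as follows. This is not GNU dd's code: the helper names are invented, and only an ASCII case mapping is built here (the EBCDIC tables are omitted), but composing further steps in the listed order works the same way.

```python
def identity_table():
    """A 256-entry table that maps every byte to itself."""
    return list(range(256))


def compose(table, step):
    """Apply `step` after `table`: result[b] == step[table[b]]."""
    return [step[b] for b in table]


# One translation step: ASCII lowercase to uppercase, everything else untouched.
lower_to_upper = [b - 32 if ord("a") <= b <= ord("z") else b for b in range(256)]

# Build the composite table once, then translate each buffer in one pass.
table = compose(identity_table(), lower_to_upper)
print(b"tape 42".translate(bytes(table)))  # b'TAPE 42'
```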

Finally, dd enters the copy stage. It allocates enough room for the input buffer, and if using a two-buffer scenario, allocates an output buffer as well. It performs any skips on the input, then performs any seeks on the output.

The main loop of the copy stage occurs now. It attempts to read input_blocksize characters into the input buffer. Errors here may be trapped, depending on command line options. If a full input block is not read (for example, when reading from a communication line, the end of a file, a pipe or special file, especially tapes), the partial block count is incremented. If the sync option is in effect, partial input blocks are NUL padded and treated as full input blocks.

At this point, if we are single-buffering, we write the block out. TODO: finish up here (I got bored)

AIX dd

AIX dd is broken. From their manpage:
3.    Use the backup, tar, or cpio command instead of the dd command
  whenever possible to copy files to tape.  These commands are
  designed for use with tape devices.  For more information on using
  tape devices see the rmt special file.

6.    To ensure that only whole blocks are written to the output
  device (such as an 8mm tape in fixed-block mode), specify the ibs
  flag, the obs flag, and the conv=sync flag.  The ibs flag must
  be a multiple of the obs flag.

This was hard-won knowledge for me. When dding a tar file directly out to tape, I ended up using:

dd if=foo.tar of=/dev/rmt1.5 ibs=1 obs=10240 conv=sync
Alternatively, you can use catblock, which is more efficient. Note that their comment about ibs being a multiple of obs is simply wrong, as my example demonstrates.

Solaris dd

Solaris dd is not necessarily broken, but it is somewhat unintuitive. From the manpage:
When dd reads from a pipe, using the ibs=X and obs=Y
operands, the output will always be blocked in chunks of
size Y.  When bs=Z is used, the output blocks will be
whatever was available to be read from the pipe at the time.

In other words, don't expect "bs=Z" to be the same as "obs=Z ibs=Z". That's because, like GNU dd, it probably uses one buffer if you put "bs=Z", whereas "obs=Z ibs=Z" forces it to use two buffers. GNU dd may be subject to the same deficiency here. (TODO: check) I suppose it depends on how it treats a short read.
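
The difference between the two cases can be modelled on sizes alone. The Python sketch below is a simplified model of the behaviour the manpage describes, not the code of any real dd; the function names and the 4096-byte read size are assumptions.

```python
def single_buffer_writes(read_sizes):
    """bs=Z model: whatever each read returned is written out as-is."""
    return list(read_sizes)


def two_buffer_writes(read_sizes, obs):
    """ibs=X obs=Y model: input accumulates, output leaves in obs-sized blocks."""
    writes, pending = [], 0
    for n in read_sizes:
        pending += n
        while pending >= obs:
            writes.append(obs)
            pending -= obs
    if pending:
        writes.append(pending)  # leftover partial block at end of input
    return writes


reads = [4096] * 20  # twenty short reads from a pipe, 80k in total
print(single_buffer_writes(reads)[:3])  # [4096, 4096, 4096]
print(two_buffer_writes(reads, 65536))  # [65536, 16384]
```

In the single-buffer case the tape ends up with twenty 4k records; in the two-buffer case it gets one 64k record and one final short one.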

sdd

The sdd program is a fast dd. Go look at the sdd distribution.

catblock

This one was sent to me by der Mouse. It is designed to perform as many reads as are necessary to fill up a block, then to write that block out to tape in one write command.

Take a look at the C source. I've got a version under GNU autoconf so it compiles real easily under any OS. I'll put it up soon (faster if you bug me).
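
For the impatient, the idea can be sketched in a few lines of Python. This is not der Mouse's C source; it is a rough model written from the description above, and its handling of the final short block (written unpadded) is an assumption.

```python
import os


def catblock(in_fd, out_fd, block_size):
    """Copy in_fd to out_fd, issuing exactly one write() per block.

    Reads are repeated until a full block has been gathered or input
    ends. Returns the number of blocks written. Sketch only: the last
    block is written short rather than padded.
    """
    blocks = 0
    while True:
        buf = bytearray()
        while len(buf) < block_size:
            chunk = os.read(in_fd, block_size - len(buf))
            if not chunk:  # end of input
                break
            buf += chunk
        if not buf:
            return blocks
        os.write(out_fd, bytes(buf))  # one write call, hence one tape record
        blocks += 1
        if len(buf) < block_size:  # short block means input is exhausted
            return blocks
```

With standard output redirected to the tape device, a call like catblock(0, 1, 10240) would sit at the end of a pipeline in place of dd.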

splitmerge

Der Mouse also has a program called splitmerge that he uses to break files into regularly-sized pieces and feed them to a pipe (not just a single program), so you can do things like pipe it through a cipher and through catblock out to tape.

cattape

Note that the null-byte padding in fixed block mode means that a file written to tape may have extra null bytes when you read it back in. That's why it is not recommended to pipe the tar output to compress before feeding it to the tape, since uncompressing will fail when it hits the null padding. Programs like tar avoid this problem by doing internal padding and recordkeeping.

When hitting EOT you might get a partial block written out; obviously, you have to have a method for either starting where you left off, or starting the last block over when you start to write to a new tape. Tools like splitmerge require you to guess a size which will fit on all your tapes, precluding efficient use of mixed-media or compression.

I have what I believe to be a general-purpose solution to both these problems. It will allow you to pipe to or from a tape (or set of tapes) with impunity, and you will get exactly the same data in and out. I plan on calling the pair of programs tapecat and cattape, or something like that.

Amanda

From: The Amanda Home Page

AMANDA is the Advanced Maryland Automatic Network Disk Archiver. It allows the administrator of a LAN to set up a single master backup server to back up multiple hosts to a single large capacity tape drive. AMANDA uses native dump facilities and can back up a large number of workstations running multiple versions of Unix efficiently. For more information, see the postscript file in cs.umd.edu:/pub/amanda/.... You can also find this in the proceedings of the Summer 1991 USENIX conference held in San Antonio, TX.

From: der Mouse

At work, we use amanda, from cs.umd.edu (check around on ftp.cs.umd.edu for the distribution, if you're interested - 2.2.6 is the latest stable version; 2.3.0 is recently out but it's still rather alpha). It works fairly well for us.

dump & restore

BSD comes with dump and restore for archiving filesystems to tape. Uses a child-pool model. Nice. Older than dirt and may have problems using large (realistic) numbers for tape density. May have hard-coded limits on number of files. NOTE: it treats "-f -" differently than "-f /dev/stdout", so use the former if you're writing to a pipe.

pax

POSIX portable archive exchange. See the NetBSD pax(1) manpage.

cpio

Some Linux folks appear to use this.

Breakdown by Tape Type

7 and 9 Track Reel-to-Reel

From: John McGrath

What is generally called a "block" when talking about tape writing software is called a "record" when dealing with the tape controller. This is the smallest unit that can be read or written. The controller writes "records" to the tape, separated by Inter-Record Gaps. It can also write "tape marks", which are often used to delimit files. Two successive tape marks without intervening records indicate the end-of-tape.

It is interesting to note that the Inter-Record Gaps were physically rather large, approximately 3/4 of an inch. This meant that the record (block) size had a rather profound effect on tape capacity. On a 6250 bpi drive, 80 byte records would occupy 80/6250 = 0.0128 inches. With a 3/4 inch gap between each record, this meant that the tape held 1.7% data and 98.3% Inter-Record Gap.
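
The arithmetic in the paragraph above can be reproduced, and extended to other record sizes, with a few lines of Python. The function name data_fraction is invented; like the text, it treats 6250 bpi as bytes per inch and assumes one 3/4 inch gap per record.

```python
def data_fraction(record_bytes, density_bpi=6250, gap_inches=0.75):
    """Fraction of tape length that holds data rather than gap."""
    record_inches = record_bytes / density_bpi
    return record_inches / (record_inches + gap_inches)


print(f"{data_fraction(80):.1%}")  # 1.7%
print(f"{data_fraction(10240):.1%}")
```

The second line shows why large records mattered: with 10240-byte records, most of the tape holds data instead of gap.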

Half-Inch Tapes

From: der Mouse

Back in the old half-inch days, tapes were streams of blocks, with "magic" blocks called "filemarks" interspersed. A block was from one byte upwards, with the upper limit depending on hardware and in some cases software. On the VAX it was 64K (maybe 64K-1) for hardware reasons (16-bit byte count registers), lowered to 63K by software, apparently to catch small negative byte counts.

When reading, if you supplied a buffer smaller than the size of the record, you got an overrun error, which the driver pushed back to user-land only if the driver author felt like it.

TODO: do they support variable and fixed-block sizes
TODO: how do they signify end-of-tape

Quarter Inch Cassettes

From: Christoph Badura

QIC-525 drives use 1024 byte blocks on the medium (all previous QIC formats used 512 bytes). There is special compatibility code in the Tandberg firmware to allow the drive to accept requests for 512 byte blocks.

From: der Mouse

Quarter-inch cartridge tapes broke this notion (the half-inch tape's notion of blocks) rather badly. They appear to be streams of 512-byte blocks, with interspersed filemarks. Provided your transfer counts are all multiples of 512, it doesn't matter whether the read and write sizes match - it's rather like reading and writing a plain file, except that the grain size is 512 bytes rather than one byte.

8mm Tapes

From IBM

With the 8MM TAPE DRIVE, the use of a fixed block size which is not a multiple of 1K is inefficient. The 8mm tape drive always writes internally in 1K blocks. It simulates the effect of variable block sizes, but, for example, using a fixed block size of 512 bytes (or using variable block size and write()ing 512 bytes at a time) wastes one half of the tape capacity and decreases the maximum transfer rate.
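
The waste is easy to quantify. This Python sketch assumes, following IBM's description, that every write is rounded up to a whole number of 1K internal blocks; the function name usable_fraction is invented for the example.

```python
def usable_fraction(write_size, internal_block=1024):
    """Share of the consumed tape that holds real data for one write."""
    # Round the write up to whole internal blocks (ceiling division).
    consumed = -(-write_size // internal_block) * internal_block
    return write_size / consumed


print(usable_fraction(512))   # 0.5
print(usable_fraction(1024))  # 1.0
```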

4mm Tapes

4mm tape (also known as DDS, or Digital Data Storage) is a helical-scan format on a manufactured plastic cartridge that has the same form factor as DAT (Digital Audio Tape). DDS-2 has narrower tracks than DDS, and can fit 50% more on a given length of tape. DDS-3 tapes will give DDS-2 drives fits.

From Bill Hassell

The DDS format is much more complicated than that, but the basic recording block is approximately 129 Kbytes. As long as you keep data streaming in, the data blocks will be filled by the drive electronics. Inter-record gaps found in more traditional 9-track tapes do not exist in the DDS format. If you stop the data stream for more than 6-7 seconds, the drive will write what it has received so far into a 129K buffer. Thus, writing an 80 byte record once a minute will reduce the DDS capacity to a few megs! DDS cannot be used as an efficient data logger unless a frontend program collects the data into about 129K batches.

Apparently this large recording block is called a superblock.

Experimental evidence shows strange incompatibilities between operating systems using 4mm drives. A 4mm tape written on a Solaris 2.4 box causes our HP-UX A9.01 box to reboot. A 4mm tape written on a Solaris 2.4 box cannot be read (or written to!) by our AIX machine.

4mm Capacity Chart

Format       Length (m)  Capacity (GB)  Density (bpi)
-----------------------------------------------------
DDS (DDS-1)  60          1.3            61000
DDS (DDS-1)  90          2              61000
DDS-2        120         4              61000
DDS-3        125         12             122000

4mm Compatibility Chart

From: Simon Muir
                ------ Drive Support --------   Uncomp.
Length  Format  FH-DDS  HH-DDS  DDS-2   DDS-3   Capacity
--------------------------------------------------------
60m     DDS-1   Yes     Yes     Yes     Yes     1.3GB
90m     DDS-1   No      Yes     Yes     Yes     2.0GB
120m    DDS-2   No      No      Yes     Yes     4.0GB
125m    DDS-3   No      No      No      Yes     12.0GB

Notes:

1. "DDS" and "DDS-1" are interchangeable as descriptions
of the original format. "DDS" was originally used to
differentiate the on-tape format from that of audio (DAT).
The cartridge is physically identical, but the DDS format
has much more powerful error correction, and a logical
structure which emulates 1/2" tape. Audio DAT is
essentially a continuous stream of small frames, with
limited error correction capability.

2. "Yes" under drive support means *full* support - ie.
unlike some QIC formats, drives will read & write the earlier
formats to be fully readable on earlier-vintage drives.

3. The original full-height DDS drives (FH-DDS, above) only
supported 60m tapes because they had no ability to alter
tension according to tape thickness. FH DDS drives have no
on-board compression either.

4. DDS-3 is a radically different format from the earlier ones,
so the relationship between tape length and capacity doesn't
hold. DDS-4 will be much higher native capacity still...

5. Data Compression: Most typical office data is compressible
by 1.8:1 or 2:1, but YOUR MILEAGE MAY VARY. Eg. GIF files use
a different variant of the same compression method (Lempel-Ziv),
and thus won't be compressed further by a DDS drive.

6. Virtually academic point: In the early days, one manufacturer
brought out the "Data DAT" format, also based on DAT cartridge
technology. IIRC, capacity was roughly the same - 1.3GB, but
the format is not interchangeable with DDS. I doubt if there
are any left alive, as few were made.

7. Totally academic point: The tape size is derived from Imperial
inch formats. Thus in reality it is 1/8" or 3.861mm, not 4mm,
(same as audio Compact Cassettes). I doubt anyone cares :)

DLT

DLT is a high-capacity (10-35GB), high-speed tape drive type.

Breakdown of tar Return Codes and Messages, by OS

SunOS 4.1.3 native tar
EOF = unexpected EOF = 3
EOT = I/O error = 3 

SunOS 5.4 native tar
ok = (no message) = 0
EOF = "tar: blocksize = 0" = 0
EOT = "tar: tape read error" = 3

AIX native tar
no EOF condition testable
EOT = "tar: tape read error: I/O error" = 254

HP-UX A.09.01 native tar
Note that subsequent tar commands read the SAME file, but by clever
use of the mt command you can access filemarks, etc., directly.
EOT/EOF = "Tar: blocksize = 0; broken pipe?" = 3

Multiple Files, One Tape

NOTE: this section may be incorrect; the issues involved here probably have to do with consumption of filemarks by tar and not the device drivers themselves. However, I was using the same tar program on each OS, I think...

Let's say you use "tar" on the no-rewind tape device to send three tar files out to the tape.

In actuality, after each close of the tape device, you have written a tape mark. Thus, what it looks like on tape is:

111102203333300

Where 1, 2, 3 are blocks corresponding to tar files 1, 2, and 3 respectively, and 0 is a tape mark. Each digit represents a block.
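
If it helps, the layout string can be taken apart mechanically. The Python sketch below reads each 0 as a tape mark and two consecutive marks as end of data, which is my reading of the diagram; the function name split_tape is invented, and the model ignores corner cases such as an empty tape file.

```python
def split_tape(layout):
    """Split a layout string like '111102203333300' into tape files.

    Assumes '0' is a tape mark and the first pair of consecutive
    marks ends the recorded data.
    """
    data, _, _ = layout.partition("00")  # drop the end-of-tape marks
    return data.split("0")               # single marks separate the files


print(split_tape("111102203333300"))  # ['1111', '22', '33333']
```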

What varies between the Unix variants is how this is presented back to you. Let's assume you repeatedly invoke "GNUtar" with the "t" option to list the stuff on the tape.

AIX (tested: 3.x)

1 2 3

AIX does not present the EOF or EOT/EOM marks to you. The first tar command reads the first "file", second command reads the second file, etc.

Sun Solaris 2.x (aka SunOS 5.x)

1230

Solaris presents the null block at the end of the tape. That means that your fourth "tar t" command will show what appears to be an empty tar file.

SunOS 4.x (aka Solaris 1.0)

102030

One null block after every tar file.

HP-UX (tested: 9.x)

1020300

One null block after every file, a second null block at end of tape.

After reading all the files (and null blocks) on the tape, a subsequent tar command will give some kind of error.

Recovering From Disaster

If you happen to write two consecutive filemarks on a tape, the tape firmware will no longer allow you to read past them. However, all is not necessarily lost. You can start writing a tar file over the filemark, and power down the drive before it writes new filemarks at the end. Note well that Unix takes a dim view of devices which stop responding, so you may panic your system or otherwise need to reboot.

This will not work very well on QIC tapes because they have a serpentine recording pattern; when you write at the beginning of the tape the write (or is it erase) head may be destroying large swaths of data on each following track of the tape, so you may already be screwed. Nonetheless, you may be able to recover a fairly significant amount of data.

TODO:
Rewinding tapes; when does it happen
(depends on drive, firmware; DDS does it on eject, QIC and DLT don't)
End-of-media; how different types of drives can deal with it
Caching tape drives and other quasi-cool stuff (e.g. HP DAT drives)
Show a diagram of tape drive types, and how they indicate EOF, EOM, etc.

High Weirdness

I have been told of drives (particularly caching drives) whose embedded code is buggy; this typically leads to problems which are very difficult to trace.

I have seen tapes which make tape heads dirty the first time they are used, repeatedly. They will typically generate too many errors to be useful.

I have seen many, many drives which will hang the SCSI bus if particular commands are issued in particular orders. For example, I have seen one DAT changer which will hang the SCSI bus if it is told to go "offline" without being rewound. I have also seen QIC drives which take so long to seek to EOM that the OS gives up and dequeues the CCB.

Postamble

Please direct any comments or (especially) additions to me.

All trademarks in this document are property of their respective manufacturers.

Whenever I specify capacity, it is uncompressed.

Do not stare directly into sun.


VaX#n8 vax@linkdead.paranoia.com
Original date: 1997
Updated: Thu Mar 19 09:56:06 GMT 1998