The most convenient way to archive Unix file systems is to use the tar utility, as it is designed to handle special file system entries such as symbolic links, sockets and pipes.
However, there tends to be uncertainty about how tar should be used to either preserve or resolve links in the archive. This is partly because the concept of hard and soft links is not always well understood, and partly because the command line options differ between versions of Unix.
This short summary illustrates the use of du and tar on Solaris and Linux. The du command is included because it is often used first to find out how large the media must be to hold the tar archive.
Imagine the following example directory for the subsequent discussion.
$ ls -liFR directory
directory:
total 88
 586399 drwxr-xr-x 2 rhinz sanusers    80 Dec 14 18:26 dira/
 586589 drwxr-xr-x 2 rhinz sanusers    96 Dec 14 18:16 dirb/
 586591 drwxr-xr-x 2 rhinz sanusers   128 Dec 14 18:18 dirc/
1528385 -rw-r--r-- 1 rhinz sanusers 43438 Nov 17 11:16 file2

directory/dira:
total 0
 586483 lrwxrwxrwx 1 rhinz sanusers     6 Dec 14 18:26 dangling_link -> ../nil

directory/dirb:
total 120
 585197 -rw-r--r-- 1 rhinz sanusers 36028 Oct  5 09:46 file0
3479578 -rw-r--r-- 2 rhinz sanusers 22575 Nov  6 18:09 file1

directory/dirc:
total 48
3479578 -rw-r--r-- 2 rhinz sanusers 22575 Nov  6 18:09 hard_link_to_file1
 586597 lrwxrwxrwx 1 rhinz sanusers    13 Dec 14 18:17 symbolic_link_to_file0 -> ../dirb/file0
It contains three subdirectories, dira, dirb and dirc, and a plain file file2 of 42 kB.
The subdirectory dira contains a dangling link, i.e. a symbolic link to a file which does not exist.
The subdirectory dirb contains two plain files, file0 (35 kB) and file1 (22 kB).
The subdirectory dirc contains a hard link to file1 (hard_link_to_file1) and a symbolic link to file0 (symbolic_link_to_file0). Note that both file1 in dirb and hard_link_to_file1 in dirc are shown in the listing with i-node number 3479578 in the first column and with 2 as the number of links.
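To experiment with the commands discussed here, a directory of the same shape can be rebuilt in a scratch location. The following is a minimal sketch, assuming a POSIX shell and the mktemp command; the scratch directory and the zero-filled file contents are assumptions made for illustration, and only the names and sizes are taken from the listing.

```shell
#!/bin/sh
# Sketch: rebuild a tree shaped like the example directory.
workdir=$(mktemp -d)            # hypothetical scratch location
trap 'rm -rf "$workdir"' EXIT   # remove it again when the script exits
cd "$workdir" || exit 1

mkdir -p directory/dira directory/dirb directory/dirc

# Plain files filled with zero bytes; only the sizes matter here.
dd if=/dev/zero of=directory/file2      bs=43438 count=1 2>/dev/null
dd if=/dev/zero of=directory/dirb/file0 bs=36028 count=1 2>/dev/null
dd if=/dev/zero of=directory/dirb/file1 bs=22575 count=1 2>/dev/null

# Hard link: a second name for the same i-node as file1.
ln directory/dirb/file1 directory/dirc/hard_link_to_file1

# Symbolic links: one that resolves, one dangling (../nil does not exist).
ln -s ../dirb/file0 directory/dirc/symbolic_link_to_file0
ln -s ../nil        directory/dira/dangling_link

ls -liFR directory
```

The i-node numbers, owners and dates in the resulting listing will differ from those shown above, but file1 and hard_link_to_file1 should again share one i-node number and show a link count of 2.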
A simple call of du with the -k option to report the size in units of 1024 bytes (i.e. kB) produces the following listing.
$ du -k directory
0       directory/dira
60      directory/dirb
0       directory/dirc
104     directory
This means that the entire directory contains 104 kB of data. The subdirectory dirc is reported as 0 kB because the file with the i-node number 3479578 has already been counted under the subdirectory dirb, which du visited first; du counts a file with several hard links only once per run.
Calling du only on directory/dirc generates a different result.
$ du -k directory/dirc
24      directory/dirc
Now the file hard_link_to_file1 is counted separately.
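The effect can be reproduced with a reduced tree that contains only the hard-linked file. This is a sketch under assumptions: the scratch directory is hypothetical, the file is filled with random bytes only to give it a non-zero size on disk, and the absolute numbers depend on the file system.

```shell
#!/bin/sh
# Sketch: a hard-linked file is counted once per du invocation.
workdir=$(mktemp -d)            # hypothetical scratch location
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

mkdir dirb dirc
dd if=/dev/urandom of=dirb/file1 bs=22575 count=1 2>/dev/null
ln dirb/file1 dirc/hard_link_to_file1

# One invocation: the shared i-node is counted for one of the two directories only.
du -k dirb dirc

# Two invocations: each run meets the i-node for the first time and counts it.
du -k dirb
du -k dirc

# Sum the first column of each variant for comparison.
joint=$(du -k dirb dirc | awk '{ sum += $1 } END { print sum }')
separate=$( { du -k dirb; du -k dirc; } | awk '{ sum += $1 } END { print sum }')
echo "one invocation: ${joint} kB, two invocations: ${separate} kB"
```

When estimating the space needed for an archive, it is therefore safer to run du once over everything that will be archived rather than adding up the results of separate runs.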
The -L option tells du to process symbolic links by using the file or directory which the symbolic link references, rather than the link itself.
$ du -kL directory
0       directory/dira
60      directory/dirb
0       directory/dirc
104     directory
On Solaris, this is exactly the same output as with the simple du -k call above, because the symbolic link symbolic_link_to_file0 in dirc points to file0 in dirb, which has already been counted. In this case du -L treats symbolic links in the same way as du treats hard links.
Calling du -L only on directory/dirc produces:
$ du -kL directory/dirc
60      directory/dirc
Now both file0 and file1 are counted.
To complicate matters, the behaviour of du -L on Linux differs from the above.
$ du -kL directory
0       directory/dira
60      directory/dirb
36      directory/dirc
140     directory
On Linux, the symbolic link symbolic_link_to_file0 is resolved and counted even though it points to a file which is already reported in the directory/dirb listing. Consequently, directory is now reported as 140 kB in total instead of the 104 kB reported on Solaris above.
Note that the dangling link in dira has not caused any error message, although it obviously could not be resolved as it points to a non-existent file.
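Because a dangling link can pass unnoticed, it may help to look for such links before running du -L or archiving with tar. The following is a minimal sketch using find and test; the scratch directory and the reduced tree are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: list symbolic links whose target does not exist.
workdir=$(mktemp -d)            # hypothetical scratch location
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

mkdir -p directory/dira directory/dirb directory/dirc
echo data > directory/dirb/file0
ln -s ../dirb/file0 directory/dirc/symbolic_link_to_file0   # resolves
ln -s ../nil        directory/dira/dangling_link            # does not resolve

# "test -e" follows the link, so it fails when the target is missing;
# the negation makes find print exactly those links.
dangling=$(find directory -type l ! -exec test -e {} \; -print)
echo "$dangling"
```

For this tree the only path printed is directory/dira/dangling_link.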
On Solaris, the option -r causes the du command to generate messages about directories that cannot be read, files that cannot be opened, and so forth, rather than being silent. If an error is encountered, du returns with a non-zero exit code.
$ du -kLr directory
du: directory/dira/dangling_link: No such file or directory
0       directory/dira
60      directory/dirb
0       directory/dirc
104     directory
$ echo "Exit $?"
Exit 1
By default, the tar utility archives symbolic links as links, so that they are restored as links when the archive is unpacked, and saves hard-linked files only once within the archive.
$ tar cvf archive.tar directory
a directory/ 0K
a directory/dira/ 0K
a directory/dira/dangling_link symbolic link to ../nil
a directory/dirb/ 0K
a directory/dirb/file0 36K
a directory/dirb/file1 23K
a directory/dirc/ 0K
a directory/dirc/symbolic_link_to_file0 symbolic link to ../dirb/file0
a directory/dirc/hard_link_to_file1 link to directory/dirb/file1
a directory/file2 43K
$ ls -lh archive.tar
-rw-r--r-- 1 rhinz sanusers 107K Dec 15 12:11 archive.tar
From the size of the file archive.tar (107 kB) it can be seen that all three files are stored only once: file0 (36 kB), file1 (23 kB) and file2 (43 kB), which add up to 102 kB. The remainder is taken up by tar headers and padding.
The annotations to the right of the file or directory name being archived ('xyzK', 'symbolic link to' or 'link to') are provided by tar's verbose flag v on Solaris only.
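Where the verbose flag does not print such annotations while the archive is written, listing the finished archive with the function letters t and v is another way to see how each entry was recorded. The following is a sketch under assumptions: the scratch directory and the reduced tree are hypothetical, and the exact layout of the listing depends on the tar implementation.

```shell
#!/bin/sh
# Sketch: inspect how tar recorded hard and symbolic links.
workdir=$(mktemp -d)            # hypothetical scratch location
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

mkdir -p directory/dirb directory/dirc
dd if=/dev/zero of=directory/dirb/file0 bs=36028 count=1 2>/dev/null
dd if=/dev/zero of=directory/dirb/file1 bs=22575 count=1 2>/dev/null
ln    directory/dirb/file1 directory/dirc/hard_link_to_file1
ln -s ../dirb/file0        directory/dirc/symbolic_link_to_file0

tar cf archive.tar directory

# t lists the archive contents, v adds type, size and link information.
listing=$(tar tvf archive.tar)
echo "$listing"
```

With GNU tar, the second occurrence of the hard-linked file is expected to appear with "link to" followed by the name stored first, and the symbolic link with "->" followed by its target.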
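Using tar to archive only the subdirectory dirc, which contains a hard link and a soft link, gives the following result. As no other link to the same i-node is part of this archive, hard_link_to_file1 is stored as an ordinary file.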
$ tar cvf dirc.tar directory/dirc
a directory/dirc/ 0K
a directory/dirc/symbolic_link_to_file0 symbolic link to ../dirb/file0
a directory/dirc/hard_link_to_file1 23K
$ ls -lh dirc.tar
-rw-r--r-- 1 rhinz sanusers 25K Dec 15 12:50 dirc.tar
On Solaris, the function modifier l makes tar output an error message if it is unable to resolve all links to the files being archived.
$ tar cvlf dirc.tar directory/dirc
a directory/dirc/ 0K
a directory/dirc/symbolic_link_to_file0 symbolic link to ../dirb/file0
a directory/dirc/hard_link_to_file1 23K
tar: missing links to directory/dirc/hard_link_to_file1
$ echo "Exit $?"
Exit 1
If this happens, an error message is printed and a non-zero exit code is returned.
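GNU tar offers a comparable check through its long option --check-links. The following is a minimal sketch, assuming a GNU tar release that provides this option; the scratch directory and the reduced tree are hypothetical, and the wording of the message as well as the exit status may differ from the Solaris behaviour shown above.

```shell
#!/bin/sh
# Sketch, assuming GNU tar: warn when not all links to a file are archived.
workdir=$(mktemp -d)            # hypothetical scratch location
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

mkdir -p directory/dirb directory/dirc
echo data > directory/dirb/file1
ln directory/dirb/file1 directory/dirc/hard_link_to_file1

# Only dirc is archived, so the second link (in dirb) is missing from the archive.
# Without v, tar writes nothing to stdout, so only messages are captured.
warnings=$(tar --check-links -cf dirc.tar directory/dirc 2>&1)
status=$?
echo "$warnings"
echo "Exit $status"
```

The captured message should name directory/dirc/hard_link_to_file1 as the file for which links are missing.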
Specifying the function modifier h requests tar to follow symbolic links as if they were normal files or directories.
$ tar cvhf archive.tar directory
a directory/ 0K
a directory/dira/ 0K
a directory/dirb/ 0K
a directory/dirb/file0 36K
a directory/dirb/file1 23K
a directory/dirc/ 0K
a directory/dirc/symbolic_link_to_file0 36K
a directory/dirc/hard_link_to_file1 link to directory/dirb/file1
a directory/file2 43K
$ ls -lh archive.tar
-rw-r--r-- 1 rhinz sanusers 142K Dec 15 13:05 archive.tar
As a result, the size of the file archive.tar increased from its original 107kB to 142kB because the file file0 (36kB) is stored in it twice.
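The growth caused by the h modifier can also be checked by comparing archive sizes directly. The following is a sketch under assumptions: the scratch directory and the reduced tree are hypothetical, and only dirc is archived so that the link target lies outside the archived tree, because tar implementations may differ in how they store a target that is also archived under its own name.

```shell
#!/bin/sh
# Sketch: compare archive sizes with and without following symbolic links.
workdir=$(mktemp -d)            # hypothetical scratch location
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

mkdir -p directory/dirb directory/dirc
dd if=/dev/zero of=directory/dirb/file0 bs=36028 count=1 2>/dev/null
ln -s ../dirb/file0 directory/dirc/symbolic_link_to_file0

tar cf  links.tar    directory/dirc   # the symbolic link is stored as a link
tar chf followed.tar directory/dirc   # h: the contents of the link target are stored

# wc -c reports the archive size in bytes; $(( )) strips any padding blanks.
size_links=$(( $(wc -c < links.tar) ))
size_followed=$(( $(wc -c < followed.tar) ))
echo "links kept: ${size_links} bytes, links followed: ${size_followed} bytes"
```

The archive written with h should be larger by roughly the size of file0. The exact figures are rounded up to the record size that tar uses when writing the archive.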
Note that on Linux tar issues an error message if it is unable to resolve the symbolic links.
$ tar cvhf archive.tar directory
directory/
directory/dira/
tar: directory/dira/dangling_link: Cannot stat: No such file or directory
directory/dirb/
directory/dirb/file0
directory/dirb/file1
directory/dirc/
directory/dirc/symbolic_link_to_file0
directory/dirc/hard_link_to_file1
directory/file2
tar: Error exit delayed from previous errors
$ echo "Exit $?"
Exit 2
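As seen before with the hard links, on Solaris the function modifier l needs to be specified for tar to output an error message if it is unable to resolve all links to the files being archived. Without it, the dangling link is silently left out of the archive, as the Solaris listing for tar cvhf above shows.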
$ tar cvhlf archive.tar directory
a directory/ 0K
a directory/dira/ 0K
tar: directory/dira/dangling_link: No such file or directory
a directory/dirb/ 0K
a directory/dirb/file0 36K
a directory/dirb/file1 23K
a directory/dirc/ 0K
a directory/dirc/symbolic_link_to_file0 36K
a directory/dirc/hard_link_to_file1 link to directory/dirb/file1
a directory/file2 43K
$ echo "Exit $?"
Exit 1
The examples were created on a machine running Solaris 10 with the /usr/bin/du and /usr/bin/tar commands. Where a reference to Linux was made, du (GNU coreutils) 5.93 and tar (GNU tar) 1.15.1 were used.
The GNU tar manual of the Free Software Foundation.
The information on this page was last updated on 15 December 2009.
© Rainer Hinz.