File Compression and Archiving

Sometimes it is useful to store a group of files in one file so that they can be backed up, easily transferred to another directory, or even transferred to a different computer. It is also sometimes useful to compress files into one file so that they use less disk space and download faster.

It is important to understand the difference and relationship between an archive file and a compressed file. An archive file is a collection of files and directories that are stored in one file. The archive file is not compressed; it uses about the same amount of disk space as all the individual files and directories combined. A compressed file is a collection of files and directories that are stored in one file and stored in a way that uses less disk space than all the individual files and directories combined. If you do not have enough disk space on your computer, you can compress files that you do not use very often or files that you want to save but no longer use. You can even create an archive file and then compress it to save disk space.

Important

An archive file is not compressed, but a compressed file can be an archive file.
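The difference can be seen by comparing sizes. The following is a minimal sketch, assuming tar and gzip are installed (both are covered later in this section); the docs directory, the file names, and the temporary working directory are hypothetical and chosen only for illustration.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two sample files filled with zero bytes, which compress very well.
mkdir docs
head -c 100000 /dev/zero > docs/a.dat
head -c 100000 /dev/zero > docs/b.dat

tar -cf docs.tar docs             # archive only: nothing is compressed
gzip -c docs.tar > docs.tar.gz    # compressed copy of the archive

raw_size=$(cat docs/a.dat docs/b.dat | wc -c)
tar_size=$(wc -c < docs.tar)
gz_size=$(wc -c < docs.tar.gz)
echo "files: $raw_size bytes"
echo "archive: $tar_size bytes"
echo "compressed archive: $gz_size bytes"
```

The archive should come out slightly larger than the files it holds, because tar adds a small header for each entry, while the compressed archive should be far smaller. Real files rarely compress as well as this artificial sample.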

File Compressing

Compressed files use less disk space and download faster than large, uncompressed files. In Red Hat Linux you can compress files with the compression tools gzip, bzip2, or zip.

The bzip2 compression tool is recommended because it provides the most compression and is found on most UNIX-like operating systems. The gzip compression tool can also be found on most UNIX-like operating systems. If you need to transfer files between Linux and other operating systems such as MS Windows, you should use zip because it is more commonly used on these other operating systems.

Table 12-1. Compression Tools

Compression Tool    File Extension    Uncompression Tool
gzip                .gz               gunzip
bzip2               .bz2              bunzip2
zip                 .zip              unzip

By convention, files compressed with gzip are given the extension .gz, files compressed with bzip2 are given the extension .bz2, and files compressed with zip are given the extension .zip.

Files compressed with gzip are uncompressed with gunzip, files compressed with bzip2 are uncompressed with bunzip2, and files compressed with zip are uncompressed with unzip.

Bzip2 and Bunzip2

To use bzip2 to compress a file, type the following command at a shell prompt:

bzip2 filename

The file will be compressed and saved as filename.bz2.

To expand the compressed file, type the following command:

bunzip2 filename.bz2

The file filename.bz2 is deleted and replaced with filename.
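The round trip can be tried safely in a scratch directory. This is only a sketch: report.txt is a hypothetical file, and the check for bzip2 is there because the tool is not installed on every system.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

if command -v bzip2 >/dev/null 2>&1; then
    echo "quarterly numbers" > report.txt

    bzip2 report.txt              # report.txt is replaced by report.txt.bz2
    after_compress=$(ls)
    echo "after bzip2: $after_compress"

    bunzip2 report.txt.bz2        # report.txt.bz2 is replaced by report.txt
    after_expand=$(ls)
    echo "after bunzip2: $after_expand"

    cat report.txt                # contents are unchanged by the round trip
else
    echo "bzip2 is not installed" >&2
fi
```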

You can bzip2 multiple files at the same time by listing them with a space between each one:

bzip2 file1 file2 file3

The above command compresses each file separately, replacing file1, file2, and file3 with file1.bz2, file2.bz2, and file3.bz2. bzip2 does not combine files into a single file and does not descend into directories. To compress a directory such as /usr/work/school into one file, archive it with tar first, as described in File Archiving.

Tip

For more information, type man bzip2 and man bunzip2 at a shell prompt to read the man pages for bzip2 and bunzip2.

Gzip and Gunzip

To use gzip to compress a file, type the following command at a shell prompt:

gzip filename

The file will be compressed and saved as filename.gz.

To expand the compressed file, type the following command:

gunzip filename.gz

The file filename.gz is deleted and replaced with filename.
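Because gzip and gunzip replace their input file by default, it can be useful to keep the original. One way, sketched below, is the -c option, which writes the result to standard output so it can be redirected to a new file. The file names app.log and restored.log are hypothetical.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "log entry" > app.log

gzip -c app.log > app.log.gz            # app.log is left in place
gunzip -c app.log.gz > restored.log     # app.log.gz is left in place

ls                                      # all three files are present
cmp app.log restored.log && echo "contents match"
```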

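You can gzip multiple files and directories at the same time by listing them with a space between each one:

gzip -r file1 file2 file3 /usr/work/school

The above command compresses file1, file2, file3, and every file in the /usr/work/school directory (assuming this directory exists). The -r option makes gzip descend into directories. Each file is compressed separately and replaced with its own .gz file; gzip does not combine them into a single file. To store them in one compressed file, archive them with tar first, as described in File Archiving.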

Tip

For more information, type man gzip and man gunzip at a shell prompt to read the man pages for gzip and gunzip.

Zip and Unzip

To compress a file with zip, type the following command:

zip -r filename.zip filesdir

In this example, filename.zip represents the file you are creating and filesdir represents the directory you want to put in the new zip file. The -r option specifies that you want to include all files contained in the filesdir directory recursively.

To extract the contents of a zip file, type the following command:

unzip filename.zip

You can zip multiple files and directories at the same time by listing them with a space between each one:

zip -r filename.zip file1 file2 file3 /usr/work/school 

The above command compresses file1, file2, file3, and the contents of the /usr/work/school directory (assuming this directory exists) and puts them in filename.zip.
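The zip commands above can be tried end to end as follows. This is a sketch under a few assumptions: zip and unzip are installed (they are often separate packages), the sample files are hypothetical, and the -q (quiet), -l (list), and -d (extract into a directory) options are used for convenience even though the text above does not cover them.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

if command -v zip >/dev/null 2>&1 && command -v unzip >/dev/null 2>&1; then
    mkdir -p filesdir/sub
    echo "top level" > filesdir/a.txt
    echo "nested" > filesdir/sub/b.txt

    zip -r -q filename.zip filesdir      # -r also picks up filesdir/sub
    unzip -l filename.zip                # list the entries without extracting

    mkdir restored
    unzip -q filename.zip -d restored    # extract under restored/
    cat restored/filesdir/sub/b.txt
else
    echo "zip or unzip is not installed" >&2
fi
```

Unlike gzip and bzip2, zip leaves the original files in place after adding them to the zip file.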

Tip

For more information, type man zip and man unzip at a shell prompt to read the man pages for zip and unzip.

File Archiving

A tar file is a collection of several files and/or directories in one file. This is a good way to create backups and archives.

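Some of the options used with tar are:

-c    create a new archive
-f    use the file name that follows as the archive to create, list, or extract
-t    list the contents of an archive
-v    show each file as it is processed
-x    extract files from an archive
-z    compress or uncompress the archive with gzip
-j    compress or uncompress the archive with bzip2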

To create a tar file, type:

tar -cvf filename.tar files/directories

In this example, filename.tar represents the file you are creating and files/directories represents the files or directories you want to put in the archived file.

You can tar multiple files and directories at the same time by listing them with a space between each one:

tar -cvf filename.tar /home/mine/work /home/mine/school

The above command would place all the files in the work and the school subdirectories of /home/mine in a new file called filename.tar in the current directory.

To list the contents of a tar file, type:

tar -tvf filename.tar

To extract the contents of a tar file, type:

tar -xvf filename.tar

This command does not remove the tar file, but it places copies of its contents in the current working directory.
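The create, list, and extract steps can be combined into one short session. The sketch below uses hypothetical work and school directories inside a temporary directory rather than the /home/mine paths from the text.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

mkdir work school
echo "draft" > work/notes.txt
echo "essay" > school/paper.txt

tar -cvf filename.tar work school    # create; -v prints each name as it is added
tar -tvf filename.tar                # list the contents without extracting

mkdir restore
cd restore || exit 1
tar -xvf ../filename.tar             # extract into the current directory (restore/)
ls                                   # work and school have been recreated here
```

After extraction, filename.tar is still present in the parent directory, and the original work and school directories are untouched.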

Remember, the tar command does not compress the files by default. To create a tarred and bzipped compressed file, use the -j option:

tar -cjvf filename.tbz files/directories

tar files compressed with bzip2 are conventionally given the extension .tbz.

This command creates an archive file and then compresses it as the file filename.tbz. If you uncompress the filename.tbz file with the bunzip2 command, the filename.tbz file is removed and replaced with filename.tar.

You can also expand and unarchive a bzip2 tar file in one command:

tar -xjvf filename.tbz
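As an illustration, the sketch below creates a bzip2-compressed archive of a hypothetical project directory and unpacks it elsewhere. It assumes bzip2 is installed, since tar relies on it for the -j option, and it uses tar's -C option, not covered above, to extract into a different directory.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

if command -v bzip2 >/dev/null 2>&1; then
    mkdir project
    echo "source" > project/main.c

    tar -cjvf filename.tbz project        # archive and compress in one step

    mkdir unpacked
    tar -xjvf filename.tbz -C unpacked    # -C: switch to unpacked/ before extracting
    cat unpacked/project/main.c
else
    echo "bzip2 is not installed" >&2
fi
```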

To create a tarred and gzipped compressed file, use the -z option:

tar -czvf filename.tgz files/directories

tar files compressed with gzip are conventionally given the extension .tgz.

This command creates the archive file filename.tar and then compresses it as the file filename.tgz. (The file filename.tar is not saved.) If you uncompress the filename.tgz file with the gunzip command, the filename.tgz file is removed and replaced with filename.tar.

You can expand a gzip tar file in one command:

tar -xzvf filename.tgz
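The relationship between filename.tgz and filename.tar can be checked directly. In this sketch, which uses a hypothetical project directory, the listing of the compressed archive is compared with the listing of the plain archive that gunzip leaves behind.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

mkdir project
echo "readme" > project/README

tar -czvf filename.tgz project         # archive and gzip-compress in one step
listing_tgz=$(tar -tzf filename.tgz)   # member names read from the compressed archive

gunzip filename.tgz                    # filename.tgz is replaced by filename.tar
ls

listing_tar=$(tar -tf filename.tar)    # member names read from the plain archive
[ "$listing_tgz" = "$listing_tar" ] && echo "same contents"
```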

Tip

Type the command man tar to read the man page for the tar command.