Compression policy attribute

The Compression attribute specifies that backups for the policy use software compression. Select the check box to enable compression. (Default: no compression.)

The degree to which a file can be compressed depends on the data type, and a backup usually involves more than one type of data. Examples include stripped and unstripped binaries, ASCII text, and files that consist of repeating strings. Some data types compress better than others.

Note:

When compression is not used, the server may receive more data than exists on the client. The discrepancy is due to client disk fragmentation and the file headers that the client adds. (To tell how much space a file occupies, run the du command. To tell how much free disk space is available, run the df command.)

Data types that compress well:

Programs, ASCII files, and unstripped binaries (typically 40% of the original size).

Best-case compression:

Files that are composed of repeating strings can sometimes be compressed to 1% of their original size.

Data types that do not compress well:

Stripped binaries (usually 60% of the original size).

Worst-case compression:

Files that are already compressed become slightly larger if they are compressed again. On UNIX clients, if a compressed file has a unique file extension, exclude it from compression by adding the extension under the Client Settings (UNIX) properties.

The UNIX client host property that excludes files from compression corresponds to the COMPRESS_SUFFIX = .suffix option in the bp.conf file.
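The best-case and worst-case behavior described above can be observed with any general-purpose compressor. The following is a minimal sketch that uses Python's zlib module as a stand-in. This topic does not state which algorithm NetBackup uses, so the exact percentages are illustrative only. The sample data and the ratio helper are hypothetical names introduced for this example.

```python
import random
import zlib


def ratio(data: bytes) -> float:
    """Return the compressed size as a fraction of the original size."""
    return len(zlib.compress(data)) / len(data)


# Best case: a file that consists of one string repeated many times.
repetitive = b"backup-record;" * 50_000

# Worst case: pseudo-random bytes stand in for data that is already
# compressed (an assumption made for this illustration).
rng = random.Random(0)
incompressible = bytes(rng.getrandbits(8) for _ in range(200_000))

print(f"repetitive:     {ratio(repetitive):.2%} of original size")
print(f"incompressible: {ratio(incompressible):.2%} of original size")
```

The repetitive sample shrinks to a small fraction of its size, while the incompressible sample comes out slightly larger than it went in. That growth is the reason to exclude already-compressed files from compression.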

Effect of file size:

File size has no effect on the amount of compression. However, it takes longer to compress many small files than a single large one.

Client resources that are required:

Compression requires client CPU time and as much memory as the administrator configures.

Effect on client speed:

Compression uses as much of the CPU as is available and affects other applications that require the CPU. For fast CPUs, however, I/O rather than CPU speed is the limiting factor.

Files that are not compressed:

NetBackup does not compress the following files:

  • Files that are 512 bytes or smaller, because 512 bytes is the tar block size.

  • On UNIX clients, files that end with the suffixes that are specified with the COMPRESS_SUFFIX = .suffix option in the bp.conf file.

  • On UNIX clients, files with the following suffixes:

    .arc        .gz         .iff        .sit.bin
    .arj        .hqx        .pit        .tiff
    .au         .hqx.bin    .pit.bin    .Y
    .cpt        .jpeg       .scf        .zip
    .cpt.bin    .jpg        .sea        .zom
    .F          .lha        .sea.bin    .zoo
    .F3B        .lzh        .sit        .z
    .gif        .pak
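
The three rules above can be summarized in a short sketch. This is an illustration of the documented behavior, not NetBackup code. The function name is_skipped and its parameters are hypothetical, and case-sensitive suffix matching is an assumption based on the mixed-case entries in the list. The extra_suffixes argument models suffixes added through COMPRESS_SUFFIX.

```python
TAR_BLOCK_SIZE = 512  # bytes; files this size or smaller are not compressed

# Built-in suffixes from the list above (UNIX clients).
DEFAULT_SKIP_SUFFIXES = (
    ".arc", ".arj", ".au", ".cpt", ".cpt.bin", ".F", ".F3B", ".gif",
    ".gz", ".hqx", ".hqx.bin", ".jpeg", ".jpg", ".lha", ".lzh", ".pak",
    ".iff", ".pit", ".pit.bin", ".scf", ".sea", ".sea.bin", ".sit",
    ".sit.bin", ".tiff", ".Y", ".zip", ".zom", ".zoo", ".z",
)


def is_skipped(name: str, size: int, extra_suffixes: tuple = ()) -> bool:
    """Return True if a file would be left uncompressed under the rules above.

    extra_suffixes stands for suffixes configured with COMPRESS_SUFFIX.
    Matching is case-sensitive, which is an assumption of this sketch.
    """
    if size <= TAR_BLOCK_SIZE:
        return True
    return name.endswith(DEFAULT_SKIP_SUFFIXES + tuple(extra_suffixes))


print(is_skipped("notes.txt", 512))               # small file
print(is_skipped("archive.zip", 10_000))          # built-in suffix
print(is_skipped("video.mp4", 10_000))            # not in the list
print(is_skipped("video.mp4", 10_000, (".mp4",))) # added by the administrator
```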
    

Compression increases the computing overhead on the client and increases backup time because of the time that is required to compress the files. The lower transfer rate that is associated with compression on the client reduces the ability of some tape devices (notably 8mm) to stream data, which causes additional wear on those devices.

The savings in media and network resources continue to make compression desirable unless total backup time or client computing resources become a problem. If total backup time is a problem, consider multiplexing. The NetBackup multiplexing feature backs up clients in parallel, reducing the total time to back them up.

If compressed data is written to a storage unit that has single-instance store (SIS) capabilities, the storage unit may not be able to use data deduplication on the compressed or the encrypted data. In data deduplication, only one instance of the file is stored. Subsequent instances of the file reference the single file.

Compression reduces the size of a backup by reducing the size of files in the backup. In turn, the smaller backup size decreases the amount of media that is required for storage. Compression also decreases the amount of data that travels over the network as well as the network load.
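
To get a rough idea of the media savings, the typical figures quoted in this topic (about 40% of the original size for data that compresses well, about 60% for stripped binaries) can be applied to the data mix on a client. The following sketch is an estimate only. The function name, the category names, and the sample sizes are hypothetical, and actual results depend on the data.

```python
# Typical compressed size as a fraction of the original size, taken from
# the figures quoted in this topic. Real results vary with the data.
TYPICAL_RATIO = {
    "compresses_well": 0.40,    # programs, ASCII files, unstripped binaries
    "stripped_binaries": 0.60,
}


def estimate_compressed_gb(data_gb: dict) -> float:
    """Estimate the compressed backup size in GB for a mix of data types."""
    return sum(size * TYPICAL_RATIO[kind] for kind, size in data_gb.items())


# Hypothetical client: 100 GB that compresses well, 50 GB of stripped binaries.
mix = {"compresses_well": 100.0, "stripped_binaries": 50.0}
estimate = estimate_compressed_gb(mix)
saved = sum(mix.values()) - estimate
print(f"estimated backup size: {estimate:.1f} GB (saves {saved:.1f} GB)")
```

For this sample mix, the 150 GB of client data is estimated at roughly 70 GB after compression.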

More Information

Client Settings (UNIX) properties

About the Policy attributes