
How to Maximize Compression

Peter Hyman edited this page Nov 24, 2023 · 4 revisions

Maximizing Compression

Normally, level 9 compression yields the best results regardless of compression method. With text files, results may occasionally differ among levels, but that cannot be anticipated. ZPAQ will normally provide the best compression at level 9, but it will be slow. With binary files, using one of the binary filters (x86, arm, etc.), which use Branch Call Jump analysis, will improve results.
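To give a feel for why a Branch Call Jump filter helps, here is a deliberately simplified Python sketch. It is not the filter used by any real tool: it handles only the x86 `CALL rel32` opcode (0xE8), and the function name `bcj_x86_sketch` is hypothetical. The idea is that relative call offsets to the same function differ at every call site, while absolute targets are identical, so rewriting relative to absolute creates repeated byte patterns for the compressor to find.

```python
import struct


def bcj_x86_sketch(data: bytes, encode: bool = True) -> bytes:
    """Toy BCJ transform: rewrite CALL rel32 operands as absolute targets.

    Simplified illustration only; real filters handle more opcodes and
    apply extra checks to avoid converting bytes that are not calls.
    """
    buf = bytearray(data)
    i = 0
    while i + 5 <= len(buf):
        if buf[i] == 0xE8:  # CALL rel32 opcode
            (value,) = struct.unpack_from("<I", buf, i + 1)
            next_instr = i + 5  # rel32 is relative to the next instruction
            if encode:
                value = (value + next_instr) & 0xFFFFFFFF
            else:
                value = (value - next_instr) & 0xFFFFFFFF
            struct.pack_into("<I", buf, i + 1, value)
            i += 5  # skip the operand so encode and decode scan identically
        else:
            i += 1
    return bytes(buf)


# Two calls to the same target (0x100) from different positions.
code = (
    b"\xE8" + struct.pack("<I", 0x100 - 5)     # call at offset 0
    + b"\x90" * 3                              # three NOPs
    + b"\xE8" + struct.pack("<I", 0x100 - 13)  # call at offset 8
)
encoded = bcj_x86_sketch(code, encode=True)
decoded = bcj_x86_sketch(encoded, encode=False)
```

After encoding, both call operands hold the same four bytes, and decoding restores the original input exactly.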

However, one setting above all can really improve results: -p1, which inhibits multi-threading and allows the largest possible block size to be compressed. Using -p1 for decompression is not beneficial and will only slow things down. In the tests below, text file compression was far more influenced by -p1 than binary file compression.
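The block-size effect can be illustrated with a toy Python sketch using only the standard `lzma` module. This is not how any particular tool splits its work: the chunk count simply stands in for a thread count, under the assumption that each thread compresses its own block independently. The helper names are hypothetical.

```python
import lzma
import random


def compressed_size_whole(data: bytes) -> int:
    """Compress the input as one block (analogous to a single thread)."""
    return len(lzma.compress(data))


def compressed_size_chunked(data: bytes, chunks: int) -> int:
    """Compress independent chunks, as if each went to its own thread."""
    step = -(-len(data) // chunks)  # ceiling division
    return sum(
        len(lzma.compress(data[i:i + step]))
        for i in range(0, len(data), step)
    )


# Build input whose redundancy is only visible across chunk boundaries:
# a 64 KiB pseudo-random block repeated four times.
rng = random.Random(0)
block = bytes(rng.getrandbits(8) for _ in range(64 * 1024))
data = block * 4

whole = compressed_size_whole(data)
split = compressed_size_chunked(data, 4)
print(f"one block: {whole} bytes, four blocks: {split} bytes")
```

Compressed as one block, the repeats are matched against the first copy. Split four ways, each chunk looks like incompressible noise, so the total is several times larger. Real data is less extreme, but the direction of the effect is the same.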

Here are some results.

Text Files

Target file: enwik8 100,000,000 bytes
Level: 9
File type: Text

| Method | Compressed Size | Time to Compress | p1 Compressed Size | p1 Time to Compress | p1 Compress % | p1 Time % |
|--------|-----------------|------------------|--------------------|---------------------|---------------|-----------|
| bzip2 | 28,796,694 | 5.230 | 28,738,621 | 10.100 | 0.202% | -93.117% |
| bzip3 | 22,417,476 | 7.130 | 21,107,931 | 14.000 | 5.842% | -96.353% |
| gzip | 35,723,370 | 4.910 | 35,714,597 | 9.210 | 0.025% | -87.576% |
| lzo | 39,797,973 | 5.710 | 39,785,060 | 14.160 | 0.032% | -147.986% |
| lzma | 25,118,871 | 53.450 | 25,118,871 | 53.870 | 0.000% | -0.786% |
| zpaq | 20,332,905 | 129.080 | 19,563,012 | 190.590 | 3.786% | -47.653% |
| zstd | 27,794,089 | 15.730 | 25,663,122 | 68.320 | 7.667% | -334.329% |

zpaq had the best compression overall, followed by bzip3. zstd had the greatest compression benefit using -p1, but it also paid the greatest time penalty in percent: 334% longer, more than 4x the time! bzip3 had the next best benefit, followed by zpaq. For this file, bzip3 and zstd had the fastest and best compression combination.
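The two percentage columns appear to be the relative change of the -p1 run against the default run, so a positive size figure is a saving and a negative time figure is a slowdown. That formula is inferred from the table values rather than stated by the author. A short Python sketch, with a hypothetical helper name, reproduces the zstd row:

```python
def percent_change(baseline: float, p1: float) -> float:
    """Relative saving of the -p1 run versus the default run, in percent.

    Positive means -p1 was smaller/faster; negative means larger/slower.
    """
    return (baseline - p1) / baseline * 100.0


# zstd row from the enwik8 results
size_gain = percent_change(27_794_089, 25_663_122)
time_gain = percent_change(15.730, 68.320)
print(f"{size_gain:.3f}% {time_gain:.3f}%")  # 7.667% -334.329%
```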

Binary Files

Target File: bin.tar 100,000,000 bytes
Level: 9 using --x86 filter
File Type: x86 Binary

| Method | Compressed Size | Time to Compress | p1 Compressed Size | p1 Time to Compress | p1 Compress % | p1 Time % |
|--------|-----------------|------------------|--------------------|---------------------|---------------|-----------|
| bzip2 | 30,357,383 | 4.510 | 30,352,255 | 10.100 | 0.017% | -123.947% |
| bzip3 | 27,643,957 | 6.470 | 27,644,451 | 14.000 | -0.002% | -116.383% |
| gzip | 31,802,035 | 5.750 | 31,809,266 | 9.210 | -0.023% | -60.174% |
| lzo | 34,800,547 | 6.390 | 34,807,911 | 14.160 | -0.021% | -121.596% |
| lzma | 23,973,472 | 22.340 | 23,973,472 | 53.870 | 0.000% | -141.137% |
| zpaq | 20,639,397 | 110.910 | 20,260,994 | 190.590 | 1.833% | -71.842% |
| zstd | 26,436,347 | 9.950 | 25,833,383 | 68.320 | 2.281% | -586.633% |

Here, with the exception of zpaq and zstd, there is little benefit from using -p1 with a binary/random file. From a time perspective, the penalty is significant for a compression benefit of around 2%.