How to Maximize Compression
Normally, using level 9 compression will yield the best results regardless of compression method. With text files, results may differ among levels, but that cannot be anticipated. ZPAQ will normally provide the best compression at level 9, but it will be slow. With binary files, using one of the binary filters -- x86, arm, etc. -- which use Branch/Call/Jump (BCJ) analysis to transform executable code, will improve compression.
However, one setting above all can really improve results: -p1, which will inhibit multi-threading and allow the largest possible block size to be compressed. Using -p1 for decompression is not beneficial and will only slow things down. Text file compression was far more influenced by -p1 than binary file compression.
Here are some results.
Target file: enwik8 100,000,000 bytes
Level: 9
File type: Text
Method | Compressed Size (bytes) | Time to Compress (s) | -p1 Compressed Size (bytes) | -p1 Time to Compress (s) | -p1 Compress % | -p1 Time % |
---|---|---|---|---|---|---|
bzip2 | 28,796,694 | 5.230 | 28,738,621 | 10.100 | 0.202% | -93.117% |
bzip3 | 22,417,476 | 7.130 | 21,107,931 | 14.000 | 5.842% | -96.353% |
gzip | 35,723,370 | 4.910 | 35,714,597 | 9.210 | 0.025% | -87.576% |
lzo | 39,797,973 | 5.710 | 39,785,060 | 14.160 | 0.032% | -147.986% |
lzma | 25,118,871 | 53.450 | 25,118,871 | 53.870 | 0.000% | -0.786% |
zpaq | 20,332,905 | 129.080 | 19,563,012 | 190.590 | 3.786% | -47.653% |
zstd | 27,794,089 | 15.730 | 25,663,122 | 68.320 | 7.667% | -334.329% |
zpaq had the best compression overall, followed by bzip3. zstd had the greatest compression benefit using -p1, but it also paid the greatest time penalty in percent: 334%, more than 4x longer! bzip3 had the next best benefit, followed by zpaq. For this file, bzip3 and zstd had the best combination of speed and compression.
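The two percentage columns follow directly from the raw numbers in the table. A quick sketch of the arithmetic (the function name is illustrative, not part of lrzip-next):

```python
def p1_deltas(size, time, p1_size, p1_time):
    """Percent-change columns as used in the tables:
    positive compress % means -p1 produced a smaller file;
    negative time % means -p1 took longer."""
    compress_pct = (size - p1_size) / size * 100
    time_pct = (time - p1_time) / time * 100
    return round(compress_pct, 3), round(time_pct, 3)

# zstd row from the enwik8 table
print(p1_deltas(27_794_089, 15.730, 25_663_122, 68.320))
```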
Target File: bin.tar 100,000,000 bytes
Level: 9 using --x86 filter
File Type: x86 Binary
Method | Compressed Size (bytes) | Time to Compress (s) | -p1 Compressed Size (bytes) | -p1 Time to Compress (s) | -p1 Compress % | -p1 Time % |
---|---|---|---|---|---|---|
bzip2 | 30,357,383 | 4.510 | 30,352,255 | 10.100 | 0.017% | -123.947% |
bzip3 | 27,643,957 | 6.470 | 27,644,451 | 14.000 | -0.002% | -116.383% |
gzip | 31,802,035 | 5.750 | 31,809,266 | 9.210 | -0.023% | -60.174% |
lzo | 34,800,547 | 6.390 | 34,807,911 | 14.160 | -0.023% | -121.596% |
lzma | 23,973,472 | 22.340 | 23,973,472 | 53.870 | 0.000% | -141.137% |
zpaq | 20,639,397 | 110.910 | 20,260,994 | 190.590 | 1.833% | -71.842% |
zstd | 26,436,347 | 9.950 | 25,833,383 | 68.320 | 2.281% | -586.633% |
Here, with the exception of zpaq and zstd, there is little benefit to using -p1 with a binary file. From a time perspective, there is a significant time penalty for around a 2% compression benefit.
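One way to put a number on that trade-off is the extra seconds paid per percentage point of size reduction gained by -p1 (a rough illustration using the zpaq and zstd rows of the bin.tar table; the function name is ours, not lrzip-next's):

```python
def cost_per_percent(time, p1_time, compress_pct):
    """Extra compression seconds paid per percentage point
    of size reduction gained by -p1."""
    return (p1_time - time) / compress_pct

# zpaq row: ~80 extra seconds for a 1.833% smaller file
print(round(cost_per_percent(110.910, 190.590, 1.833), 1))
# zstd row: ~58 extra seconds for a 2.281% smaller file
print(round(cost_per_percent(9.950, 68.320, 2.281), 1))
```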