lrzip-next vs lrzip vs xz, zpaq, zstd #83

pete4abw (Maintainer) · 2022-08-14T18:53:53Z

Introduction

There are so many ways to compare, dissect, and discuss compression values, but sometimes numbers, not words, tell the tale. One big difference between lrzip-next and the others is the pre-compression rzip hash phase. The original versions of lrzip tie the rzip compression level to the lrzip compression level, so level 4 compression gets level 4 rzip compression, and so on. lrzip-next, however, can set the rzip compression level independently using the -R# option. This can yield better overall compression, as the tables below show.
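To make the independent rzip level concrete, here is a minimal Python sketch that only builds a command line; it does not run anything. The -R# option comes from the post. The -L (compression level) and -z (zpaq backend) flags are taken from lrzip's documented options and are assumed to carry over to lrzip-next, and the helper name `lrzip_next_cmd` is hypothetical. Check `lrzip-next --help` before relying on any of it.

```python
def lrzip_next_cmd(path, level, rzip_level, zpaq=False):
    """Build an argv list for lrzip-next (illustrative only).

    -R<n> is the rzip level option described in the post.
    -L<n> and -z are assumed to behave as they do in lrzip.
    """
    cmd = ["lrzip-next", f"-L{level}", f"-R{rzip_level}"]
    if zpaq:
        cmd.append("-z")  # assumed: select the zpaq backend
    cmd.append(path)
    return cmd


# Level 4 compression with rzip level 9, as in the "4 / 9" rows of the tables.
print(" ".join(lrzip_next_cmd("linux-5.19.tar", 4, 9)))
```

The resulting list could be passed to `subprocess.run` once the flags have been confirmed against the installed version.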

File types can impact performance

For this analysis, I took a set of Linux kernel 5.x source tarballs, from 5.4.x all the way to 5.19. The 5 tarballs total over 5 GB in size. Binary or mixed-content files would compress differently. Since lrzip-next has filter options to preprocess the data to be compressed, additional benefits can be derived on suitable data (e.g. using --x86). The test system had an Intel(R) Core(TM) i7-1065G7 CPU with 16 GB of RAM, compressing on an SSD. The lrzip-next version tested is 0.9.1. The lrzip version tested is 0.651.

Filename	File size
linux-5.4.209.tar	940,759,040
linux-5.10.135.tar	1,021,306,880
linux-5.15.59.tar	1,138,503,680
linux-5.18.16.tar	1,215,651,840
linux-5.19.tar	1,269,299,200
===============	===============
Total:	5,585,520,640
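As a quick sanity check on the total, the sizes can be summed in a few lines of Python. The byte counts are copied from the table above; the GiB conversion (1024^3 bytes) is an added convenience, not a figure from the post.

```python
# Tarball sizes in bytes, copied from the table above.
sizes = {
    "linux-5.4.209.tar": 940_759_040,
    "linux-5.10.135.tar": 1_021_306_880,
    "linux-5.15.59.tar": 1_138_503_680,
    "linux-5.18.16.tar": 1_215_651_840,
    "linux-5.19.tar": 1_269_299_200,
}

total_bytes = sum(sizes.values())
total_gib = total_bytes / 1024**3  # binary gigabytes

print(f"Total: {total_bytes:,} bytes ({total_gib:.2f} GiB)")
```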

Results

Results are presented in two tables: one with actual values and one with computed indexes that compare performance in different ways. The Size index compares each compressed size with the worst (largest) compressed size, and the Time index compares each compression time with the slowest compression time. Best compression and times are bold. Worst compression and times are bold/italic.

Performance table

Program	Level	R-Level	Comp Size	Time	Compression Ratio	BPB	Speed MB/s
lrzip-next:zpaq	9	9	120,706,936	00:20:44.23	46.273	0.173	4.281
lrzip 0.651:zpaq	9	9	129,397,876	00:14:10.17	43.165	0.185	6.266
lrzip-next:zpaq	7	9	140,298,535	00:05:13.11	39.812	0.201	17.016
lrzip 0.651:zpaq	7	7	140,600,852	00:05:28.34	39.726	0.201	16.188
lrzip-next:zpaq	4	9	143,555,321	00:04:57.19	38.908	0.206	17.933
lrzip-next	9	9	170,190,096	00:05:26.73	32.819	0.244	16.287
lrzip 0.651:zpaq	4	4	171,500,714	00:06:50.49	32.568	0.246	12.990
lrzip-next	7	9	174,736,095	00:03:10.28	31.965	0.250	28.032
lrzip-next	7	7	175,541,376	00:02:57.31	31.819	0.251	29.921
lrzip 0.651	9	9	181,141,337	00:02:58.26	30.835	0.259	29.921
lrzip 0.651	7	7	181,246,876	00:02:39.65	30.817	0.260	33.497
lrzip-next	4	9	208,141,679	00:01:21.66	26.835	0.298	65.753
lrzip 0.651	4	4	270,084,676	00:00:55.88	20.681	0.387	95.107
zpaq 7.15	m5		415,925,786	01:12:28.00	13.429	0.596	1.225
zpaq 7.15	m3		595,424,742	00:06:18.86	9.381	0.853	14.055
xz 5.2.5	9		601,080,252	00:10:39.00	9.292	0.861	8.336
xz 5.2.5	6		629,855,900	00:07:31.00	8.868	0.902	11.790
zstd 1.5.2	19		639,735,807	00:10:16.56	8.731	0.916	8.816
zstd 1.5.2	10		740,638,052	00:01:05.35	7.541	1.061	81.511
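The derived columns can be recomputed from the compressed size and elapsed time. The sketch below assumes that Compression Ratio is original size divided by compressed size, that BPB is bits of output per byte of input, and that MB/s uses 2^20 bytes per MB. These definitions are inferred from the numbers, not stated in the post; they reproduce the first row, and other rows may differ slightly in the last digits.

```python
TOTAL_BYTES = 5_585_520_640  # combined size of the five tarballs


def parse_time(hms):
    """Convert an 'HH:MM:SS.ss' string to seconds."""
    hours, minutes, seconds = hms.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def metrics(comp_size, elapsed):
    """Return (compression ratio, bits per byte, speed in MB/s)."""
    ratio = TOTAL_BYTES / comp_size
    bpb = 8 * comp_size / TOTAL_BYTES
    speed = TOTAL_BYTES / 2**20 / parse_time(elapsed)  # assumes 1 MB = 2**20 bytes
    return ratio, bpb, speed


# First row of the table: lrzip-next:zpaq, level 9, rzip level 9.
ratio, bpb, speed = metrics(120_706_936, "00:20:44.23")
print(f"{ratio:.3f} {bpb:.3f} {speed:.3f}")  # 46.273 0.173 4.281
```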

Index table

This table introduces index comparisons. The lower the index value, the better. The worst compressed size and the slowest time are each scored 100. The unweighted overall index is the average of the size and time indexes. Each weighted index halves one of the two indexes before averaging, which gives more weight to the other: the size-weighted index halves the time index, and the time-weighted index halves the size index.

Program	Level	R-Level	Size Index	Time Index	Overall Index Not Weighted	Overall Index Size Weighted	Overall Index Time Weighted
lrzip-next:zpaq	9	9	16.30	28.62	22.46	15.30	18.38
lrzip 0.651:zpaq	9	9	17.47	19.55	18.51	13.62	14.14
lrzip-next:zpaq	7	9	18.94	7.20	13.07	11.27	8.34
lrzip 0.651:zpaq	7	7	18.98	7.55	13.27	11.38	8.52
lrzip-next:zpaq	4	9	19.38	6.84	13.11	11.40	8.26
lrzip-next	9	9	22.98	7.51	15.25	13.37	9.50
lrzip 0.651:zpaq	4	4	23.16	9.44	16.30	13.94	10.51
lrzip-next	7	9	23.59	4.38	13.98	12.89	8.09
lrzip-next	7	7	23.70	4.08	13.89	12.87	7.96
lrzip 0.651	9	9	24.46	4.10	14.28	13.25	8.16
lrzip 0.651	7	7	24.47	3.67	14.07	13.15	7.95
lrzip-next	4	9	28.10	1.88	14.99	14.52	7.96
lrzip 0.651	4	4	36.47	1.29	18.88	18.55	9.76
zpaq	m5		56.16	100.00	78.08	53.08	64.04
zpaq	m3		80.39	8.71	44.55	42.38	24.46
xz	9		81.16	14.70	47.93	44.25	27.64
xz	6		85.04	10.37	47.71	45.11	26.45
zstd	19		86.38	14.18	50.28	46.73	28.68
zstd	10		100.00	1.50	50.75	50.38	25.75
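The index columns can be reproduced from the performance table. The formulas below are inferred from the description and the published values, so treat them as a sketch rather than the exact spreadsheet used. The worst size is taken from the zstd level 10 row and the slowest time from the zpaq m5 row (01:12:28.00, or 4348 seconds).

```python
WORST_SIZE = 740_638_052  # zstd 1.5.2 level 10, largest output
WORST_SECONDS = 4348.0    # zpaq 7.15 m5, slowest run


def indexes(comp_size, seconds):
    """Return (size, time, overall, size-weighted, time-weighted) indexes."""
    size_idx = 100 * comp_size / WORST_SIZE
    time_idx = 100 * seconds / WORST_SECONDS
    overall = (size_idx + time_idx) / 2
    size_weighted = (size_idx + time_idx / 2) / 2  # time index halved
    time_weighted = (size_idx / 2 + time_idx) / 2  # size index halved
    return size_idx, time_idx, overall, size_weighted, time_weighted


# First row: lrzip-next:zpaq 9/9, 120,706,936 bytes in 00:20:44.23 (1244.23 s).
row = indexes(120_706_936, 1244.23)
print(" ".join(f"{value:.2f}" for value in row))  # 16.30 28.62 22.46 15.30 18.38
```

Because every index is a ratio against the worst result in this particular run, adding or removing a program from the comparison changes all of the values.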

Conclusion

Two things are obvious:

1. Both lrzip-next and lrzip outperform the native programs.
2. The -R# option greatly improves lrzip-next compression: lrzip-next produced smaller output than lrzip 0.651 at every level tested, most visibly at level 4.

lukypko · 2022-11-11T14:45:07Z

Hello Peter,
if you have time, would it be possible to add these to the comparison:

bup, which provides fast incremental saves and global deduplication (among and within files, including virtual machine images)
lzip; on some files it has a better compression ratio than lrzip-next+lzma9
dwarfs
zstd, but please use the highest compression level, 22
and please report both compression and decompression times

Thank you

1 reply

pete4abw Nov 11, 2022
Maintainer Author

I'll see. There is such a thing as "analysis/paralysis". There will always be something else to compare against, and always one program or another working better on different files. These analyses take a lot of time and effort. And just because a program is better on one system doesn't mean it will be on another. So many variables can impact performance.
