Commit 18f04e7

Fix: Long input tail in sz_copy_avx512 (#221)
The bytes at the end of large inputs (over 1M) were not properly copied.
1 parent a5176f1 commit 18f04e7

1 file changed: +1 -1 lines changed

include/stringzilla/stringzilla.h

Lines changed: 1 addition & 1 deletion
@@ -4824,7 +4824,7 @@ SZ_PUBLIC void sz_copy_avx512(sz_ptr_t target, sz_cptr_t source, sz_size_t lengt
         __mmask64 tail_mask = _sz_u64_mask_until(tail_length);
         _mm512_mask_storeu_epi8(target, head_mask, _mm512_maskz_loadu_epi8(head_mask, source));
         _mm512_mask_storeu_epi8(target + head_length + body_length, tail_mask,
-                                _mm512_maskz_loadu_epi8(tail_mask, source));
+                                _mm512_maskz_loadu_epi8(tail_mask, source + head_length + body_length));
 
         // Now in the main loop, we can use non-temporal loads and stores,
         // performing the operation in both directions.
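
The effect of this one-line change can be modeled without AVX-512 hardware. The sketch below is a scalar approximation, not the library's code: `masked_copy_64`, `copy_split`, `demo_mismatches`, and the `apply_fix` flag are hypothetical names, and the head, body, and tail lengths are picked arbitrarily rather than derived from alignment as the real function presumably does. It only illustrates that the tail store was already offset by `head_length + body_length`, while the tail load was not.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar stand-in for a masked load + masked store pair:
 * only the first `count` of 64 byte lanes are read and written. */
void masked_copy_64(unsigned char *dst, unsigned char const *src, size_t count) {
    assert(count <= 64);
    for (size_t i = 0; i < count; ++i) dst[i] = src[i];
}

/* Hypothetical model of the head/body/tail split seen in the diff.
 * With apply_fix == 0 the tail is loaded from the start of `source`
 * (the old behavior); with apply_fix != 0 it is loaded from the same
 * offset that the tail store uses (the new behavior). */
void copy_split(unsigned char *target, unsigned char const *source,
                size_t head_length, size_t body_length, size_t tail_length,
                int apply_fix) {
    size_t tail_offset = head_length + body_length;

    /* Head: load and store both start at offset 0. */
    masked_copy_64(target, source, head_length);

    /* Tail: the store address is always offset; the load address
     * is offset only when the fix is applied. */
    masked_copy_64(target + tail_offset,
                   source + (apply_fix ? tail_offset : 0), tail_length);

    /* Body: a plain loop standing in for the main vectorized loop. */
    for (size_t i = 0; i < body_length; ++i)
        target[head_length + i] = source[head_length + i];
}

/* Copies a 154-byte buffer with distinct byte values and returns
 * how many bytes of the result differ from the source. */
size_t demo_mismatches(int apply_fix) {
    enum { head = 16, body = 128, tail = 10, total = head + body + tail };
    unsigned char source[total], target[total];
    for (size_t i = 0; i < total; ++i) {
        source[i] = (unsigned char)i; /* unique values, since total < 256 */
        target[i] = 0xFF;
    }
    copy_split(target, source, head, body, tail, apply_fix);

    size_t mismatches = 0;
    for (size_t i = 0; i < total; ++i) mismatches += (target[i] != source[i]);
    return mismatches;
}
```

Under these assumptions, `demo_mismatches(0)` reports 10 wrong bytes (the entire tail, which receives a copy of the first 10 source bytes), and `demo_mismatches(1)` reports 0.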
