Commit 180f8e4

Fixed undefined behavior trsm ukr bug in bdd46f9.
Details:
- Fixed a bug that manifested any time a configuration was used in which optimized microkernels were registered and the trsm operation (or kernel) was invoked. The bug resulted from the optimized microkernels' register blocksizes conflicting with the hard-coded values (expressed as constant loop bounds) used in the new reference trsm ukernels that were introduced in bdd46f9. The fix was easy: revert to the implementation that uses variable-bound loops, which amounted to changing an #if 1 to #if 0 (since I preserved the older implementation in the file alongside the new code based on constant-bound loops). Note that this fix must be permanent, since the trsm kernel code with constant-bound loops can never work with gemm ukernels that use different register blocksizes.
1 parent bdd46f9 commit 180f8e4
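
To make the failure mode in the commit message concrete, here is a minimal sketch of the two loop styles. It is not the BLIS code: the names `REF_MR`, `REF_NR`, `trsm_l_var`, `trsm_l_const`, `blocksizes_match`, and `demo_solve` are hypothetical, the storage is plain row-major, and the real kernels differ in many details (packing format, context queries, and so on). The point it illustrates is only that a kernel whose trip counts and strides are compile-time constants is valid solely for buffers laid out with those same blocksizes.

```c
#include <stddef.h>

/* Hypothetical blocksizes baked into the constant-bound kernel. */
#define REF_MR ((size_t)4)
#define REF_NR ((size_t)4)

/* Variable-bound kernel: solves L * X = B in place (b becomes X).
   a: mr x mr lower-triangular block, row-major.
   b: mr x nr panel, row-major. */
void trsm_l_var(size_t mr, size_t nr, const double *a, double *b)
{
    for (size_t i = 0; i < mr; ++i)
        for (size_t j = 0; j < nr; ++j) {
            double acc = b[i * nr + j];
            for (size_t l = 0; l < i; ++l)
                acc -= a[i * mr + l] * b[l * nr + j];
            b[i * nr + j] = acc / a[i * mr + i];
        }
}

/* Constant-bound kernel: same solve, but trip counts and strides are
   fixed at REF_MR and REF_NR. Only valid when the buffers were laid
   out with exactly these blocksizes. */
void trsm_l_const(const double *a, double *b)
{
    for (size_t i = 0; i < REF_MR; ++i)
        for (size_t j = 0; j < REF_NR; ++j) {
            double acc = b[i * REF_NR + j];
            for (size_t l = 0; l < i; ++l)
                acc -= a[i * REF_MR + l] * b[l * REF_NR + j];
            b[i * REF_NR + j] = acc / a[i * REF_MR + i];
        }
}

/* Returns nonzero only if the runtime blocksizes equal the baked-in ones. */
int blocksizes_match(size_t mr, size_t nr)
{
    return mr == REF_MR && nr == REF_NR;
}

/* Builds L (2 on the diagonal, 1 below it) and B = L * ones, solves,
   and returns 1 if X is all ones, 0 if it is not, and -1 if the sizes
   exceed 8 or the constant-bound kernel was requested with mismatched
   blocksizes (the call is refused instead of executed). */
int demo_solve(size_t mr, size_t nr, int use_const)
{
    double a[64] = {0}, b[64] = {0};

    if (mr > 8 || nr > 8) return -1;
    if (use_const && !blocksizes_match(mr, nr)) return -1;

    for (size_t i = 0; i < mr; ++i) {
        for (size_t l = 0; l <= i; ++l)
            a[i * mr + l] = (l == i) ? 2.0 : 1.0;
        for (size_t j = 0; j < nr; ++j)
            b[i * nr + j] = (double)i + 2.0;  /* row i of L * ones */
    }

    if (use_const) trsm_l_const(a, b);
    else           trsm_l_var(mr, nr, a, b);

    for (size_t k = 0; k < mr * nr; ++k)
        if (b[k] != 1.0) return 0;
    return 1;
}
```

In this sketch, a 6x8 problem (standing in for an optimized gemm microkernel's register blocksizes) is solved correctly by the variable-bound kernel. The constant-bound kernel would index the same buffers with the wrong strides, so the demo refuses that call outright.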

1 file changed, +1 -1 lines changed

ref_kernels/3/bli_trsm_ref.c

@@ -34,7 +34,7 @@

 #include "blis.h"

-#if 1
+#if 0

 // An implementation that attempts to facilitate emission of vectorized
 // instructions via constant loop bounds + #pragma omp simd directives.