Commit 028be42

Fix problem with clang-14.0.0 and reference gemm ukr. (#854)
Details:
- clang 14.0.0 apparently makes some invalid assumptions about whether or not the AB microtile is initialized in the `gemm` reference microkernel. This leads to the "scale by alpha" part doing something strange (all sorts of random and even NaN values pop up). I do not know why this only manifested for `ztrsm` on `skx` (in `zgemm_skx_ref` via `zgemmtrsm_skx_ref`). See #852.
- Aliasing the AB microtile (in the proper datatype) as a pointer to a raw character array, and then initializing the character array with `= { 0 }` convinces the compiler to do the right thing.
- The problem did not occur in 14.0.6 or 15.0.7. It may only be a narrow band of versions which are problematic.
- This commit adds the char array workaround and fixes #852.
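The workaround described in the message can be sketched in isolation as follows. This is a minimal illustration, not the BLIS source: `gemm_ukr_sketch`, `STACK_BUF_MAX_SIZE`, `STACK_BUF_ALIGN_SIZE`, the `double` stand-in for `ctype`, and the simplified packed layouts of `a` and `b` are all assumptions made for the example. It relies on the GCC/Clang `aligned` attribute, as the commit does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for BLIS_STACK_BUF_MAX_SIZE and
   BLIS_STACK_BUF_ALIGN_SIZE; the real values come from the BLIS config. */
#define STACK_BUF_MAX_SIZE   256
#define STACK_BUF_ALIGN_SIZE 64

/* Stand-in for the macro-templated ctype (assumed double here). */
typedef double ctype;

/* Compute c = alpha * (a * b) for one mr x nr microtile.
   Assumed layouts: a holds k columns of length mr (a[l*mr + i]),
   b holds k rows of length nr (b[l*nr + j]), c is row-major mr x nr. */
void gemm_ukr_sketch(size_t mr, size_t nr, size_t k, ctype alpha,
                     const ctype *a, const ctype *b, ctype *c)
{
    /* Raw, aligned byte buffer that is zeroed at its declaration, so the
       compiler sees an initialized object before any typed access. */
    char ab_[STACK_BUF_MAX_SIZE]
        __attribute__((aligned(STACK_BUF_ALIGN_SIZE))) = { 0 };

    /* Alias the byte buffer as the accumulator microtile. */
    ctype *ab = (ctype *)ab_;
    const size_t rs_ab = nr;
    const size_t cs_ab = 1;

    /* The microtile must fit in the scratch buffer. */
    assert(mr * nr * sizeof(ctype) <= sizeof(ab_));

    /* Accumulate k rank-1 updates into ab. */
    for (size_t l = 0; l < k; ++l)
        for (size_t i = 0; i < mr; ++i)
            for (size_t j = 0; j < nr; ++j)
                ab[i * rs_ab + j * cs_ab] += a[l * mr + i] * b[l * nr + j];

    /* The "scale by alpha" step that misbehaved under clang 14.0.0. */
    for (size_t i = 0; i < mr; ++i)
        for (size_t j = 0; j < nr; ++j)
            c[i * nr + j] = alpha * ab[i * rs_ab + j * cs_ab];
}
```

Calling the sketch with `mr = nr = k = 2` and `alpha = 2.0` on small integer-valued inputs is enough to check that the accumulator starts from zero. Note that ISO C's effective-type rules do not formally cover accessing a declared `char` array through a `ctype` lvalue, so this pattern is best read as a compiler-specific workaround rather than a portable idiom.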
1 parent 5ad37a8 commit 028be42


1 file changed: +9 -10 lines changed

ref_kernels/3/bli_gemm_ref.c

@@ -194,16 +194,15 @@ void PASTEMAC(ch,ch,opname,arch,suf) \
     return; \
     } \
     \
-    ctype ab[ BLIS_STACK_BUF_MAX_SIZE \
-    / sizeof( ctype ) ] \
-    __attribute__((aligned(BLIS_STACK_BUF_ALIGN_SIZE))); \
-    const inc_t rs_ab = nr; \
-    const inc_t cs_ab = 1; \
-    \
-    const inc_t rs_a = PASTECH(BLIS_BBM_,ch); \
-    const inc_t cs_a = PASTECH(BLIS_PACKMR_,ch); \
-    const inc_t rs_b = PASTECH(BLIS_PACKNR_,ch); \
-    const inc_t cs_b = PASTECH(BLIS_BBN_,ch); \
+    char ab_[ BLIS_STACK_BUF_MAX_SIZE ] __attribute__((aligned(BLIS_STACK_BUF_ALIGN_SIZE))) = { 0 }; \
+    ctype* ab = (ctype*)ab_; \
+    const inc_t rs_ab = nr; \
+    const inc_t cs_ab = 1; \
+    \
+    const inc_t rs_a = PASTECH(BLIS_BBM_,ch); \
+    const inc_t cs_a = PASTECH(BLIS_PACKMR_,ch); \
+    const inc_t rs_b = PASTECH(BLIS_PACKNR_,ch); \
+    const inc_t cs_b = PASTECH(BLIS_BBN_,ch); \
     \
     \
     /* Initialize the accumulator elements in ab to zero. */ \
