Fix problem with clang-14.0.0 and reference gemm ukr. #854

Merged: 1 commit, Feb 7, 2025

devinamatthews
Member

Details:

  • clang 14.0.0 apparently makes invalid assumptions about whether the AB microtile is initialized in the gemm reference microkernel. As a result, the "scale by alpha" step produces garbage: random values and even NaNs. I do not know why this only manifested for ztrsm on skx (in zgemm_skx_ref via zgemmtrsm_skx_ref). See "NaN encountered in SKX ztrsm (no 1m)", #852.
  • Declaring the AB microtile storage as a raw character array initialized with = { 0 }, and accessing it through a pointer of the proper datatype, convinces the compiler to do the right thing.
  • The problem did not occur with 14.0.6 or 15.0.7, so only a narrow band of versions may be affected.
  • This commit adds the char array workaround and fixes #852.
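
For readers unfamiliar with the pattern, here is a minimal sketch of the workaround in plain C. It is not the BLIS source: the function name `toy_gemm_ukr_ref`, the tile sizes `MR` and `NR`, the packing layout, and the `double` datatype are all assumptions chosen for illustration. Only the shape of the fix (zero-initialized `char` storage accessed through a typed pointer) follows the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical microtile dimensions, for illustration only. */
enum { MR = 4, NR = 4 };

/* Toy reference microkernel: C := beta*C + alpha*(A*B) on one MR x NR tile.
 * a holds MR values per k step, b holds NR values per k step,
 * and c is row-major with row stride rs_c. */
void toy_gemm_ukr_ref(size_t k, double alpha, const double *a,
                      const double *b, double beta, double *c, size_t rs_c)
{
    assert(a != NULL && b != NULL && c != NULL);

    /* The workaround pattern: raw byte storage, explicitly zeroed ... */
    _Alignas(double) char ab_raw[MR * NR * sizeof(double)] = { 0 };
    /* ... aliased through a pointer of the proper datatype.
     * Note: this mirrors the described fix; it leans on the compiler
     * tolerating typed access to char storage. */
    double *ab = (double *)ab_raw;

    /* Accumulate the AB microtile as a sum of rank-1 updates. */
    for (size_t l = 0; l < k; ++l)
        for (size_t i = 0; i < MR; ++i)
            for (size_t j = 0; j < NR; ++j)
                ab[i * NR + j] += a[l * MR + i] * b[l * NR + j];

    /* The "scale by alpha" step that misbehaved in the report. */
    for (size_t i = 0; i < MR * NR; ++i)
        ab[i] *= alpha;

    /* Write back; beta == 0 overwrites C without reading it. */
    for (size_t i = 0; i < MR; ++i)
        for (size_t j = 0; j < NR; ++j) {
            double *cij = &c[i * rs_c + j];
            *cij = (beta == 0.0) ? ab[i * NR + j]
                                 : beta * (*cij) + ab[i * NR + j];
        }
}
```

In this sketch the zeroing is expressed on the `char` array itself, so the accumulation loop always starts from explicitly initialized storage regardless of how the typed pointer is later used.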

@devinamatthews devinamatthews merged commit 028be42 into master Feb 7, 2025
2 of 3 checks passed
@devinamatthews devinamatthews deleted the fix-ztrsm-skx branch February 7, 2025 05:22
devinamatthews added a commit that referenced this pull request Feb 7, 2025

(cherry picked from commit 028be42)
devinamatthews added a commit that referenced this pull request Feb 7, 2025

(cherry picked from commit 028be42)
(cherry picked from commit a0d7f26)
Development

Successfully merging this pull request may close these issues.

NaN encountered in SKX ztrsm (no 1m)