Fix for issue #13170 #13176

shalinib-ibm · 2025-04-29T08:51:56Z

The build fails with a compilation error on PowerPC.
This patch fixes it.

Tested with unit tests run via
cmake --build <build_dir> && cd <build_dir> && make test


shalinib-ibm · 2025-04-29T09:02:32Z

@slaren After faster kernels for depthwise 2D convolution were added to ggml-cpu in commit c6e8cc2, the build fails with a compilation error on PowerPC. This patch fixes it. Could you please review this PR?

CISC · 2025-04-29T09:29:02Z

@shalinib-ibm Any chance of adding PPC to the CI build to more easily catch this in the future?

shalinib-ibm · 2025-04-29T10:02:16Z

@CISC I agree that it would be nice to have some sort of CI for the Power architecture. I will have to work on it.

slaren

Can you explain why this change is necessary? There are other cases where GGML_F32_VEC is used as a single scalar instead of an array, so it is not clear to me why this is necessary here.

shalinib-ibm · 2025-04-29T16:31:45Z

@slaren, in this piece of code, sum is expected to be a vector:
llama.cpp/ggml/src/ggml-cpu/ops.cpp:6120
6120 | GGML_F32_VEC sum = GGML_F32_VEC_ZERO;
But POWER9 defines GGML_F32_VEC as a vector float and GGML_F32_VEC_ZERO as a scalar
(see lines 343, 344, 371, and 372 of gml/src/ggml-cpu/simd-mappings.h#L344).

So we are making sum an array of vectors with size 1. This way it does not break x86 or ARM.
Code in a similar style can be seen here:

llama.cpp/ggml/src/ggml-cpu/vec.cpp

Line 28 in 5a63980

GGML_F32_VEC sum[GGML_F32_ARR] = { GGML_F32_VEC_ZERO };

Hence this change is necessary to fix the bug on power architectures.
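To illustrate why the array form accepts a scalar initializer, here is a minimal sketch. It assumes a plain aggregate struct (`ArrLane4`, hypothetical) as a stand-in for the platform vector type and a hypothetical array length `DEMO_ARR`; it is not the real ggml mapping, and real SIMD types may behave differently depending on the compiler.

```cpp
#include <cstddef>

// Hypothetical stand-in for a 4-lane SIMD type. It is an aggregate,
// so brace elision applies when it is an array element.
struct ArrLane4 {
    float lanes[4];
};

// Hypothetical array length, mirroring the "size 1" array described above.
constexpr std::size_t DEMO_ARR = 1;

// Builds the array with a scalar initializer and returns the sum of all lanes.
inline float demo_array_of_one_total() {
    // Through brace elision, 0.0f initializes the first lane of sum[0];
    // the remaining lanes are value-initialized to zero.
    ArrLane4 sum[DEMO_ARR] = { 0.0f };

    float total = 0.0f;
    for (std::size_t j = 0; j < 4; ++j) {
        total += sum[0].lanes[j];
    }
    return total;
}
```

In this sketch the scalar only seeds the first lane explicitly, and the zero result relies on the rest being value-initialized.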

slaren · 2025-04-29T16:39:53Z

GGML_F32_VEC sum[GGML_F32_ARR] = { GGML_F32_VEC_ZERO };

With AVX2 this expands to __m256 sum[4] = { _mm256_setzero_ps() };, which does not look right to me: it only initializes the first element with _mm256_setzero_ps().

In any case, it seems to me that the solution would be to define GGML_F32_VEC_ZERO as {0.0f} for this architecture, rather than just 0, since it is meant to initialize entire vectors.
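A minimal sketch of this suggestion follows. It assumes a plain aggregate struct (`DemoVec`, hypothetical) in place of the POWER9 `vector float` type, and the `DEMO_F32_VEC` and `DEMO_F32_VEC_ZERO` macros are illustrative names, not the real definitions in simd-mappings.h.

```cpp
#include <type_traits>

// Hypothetical stand-in for the POWER9 `vector float` type.
struct DemoVec {
    float lanes[4];
};

// Illustrative macros; the real ones live in simd-mappings.h.
#define DEMO_F32_VEC      DemoVec
#define DEMO_F32_VEC_ZERO {0.0f}  // braced initializer instead of a bare scalar

// A bare scalar cannot implicitly convert to the vector-like type,
// which is the shape of the mismatch discussed in this thread.
static_assert(!std::is_convertible<float, DEMO_F32_VEC>::value,
              "a scalar does not implicitly convert to DemoVec");

// Initializes a single vector with the braced zero and returns the sum of its lanes.
inline float demo_zeroed_lane_total() {
    DEMO_F32_VEC sum = DEMO_F32_VEC_ZERO;  // compiles: braced list initializes the aggregate
    float total = 0.0f;
    for (int j = 0; j < 4; ++j) {
        total += sum.lanes[j];
    }
    return total;
}
```

With the braced form, the call site can stay a single vector rather than an array of one.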


shalinib-ibm · 2025-04-30T06:38:44Z

Hi slaren,
{ _mm256_setzero_ps() } is the initializer for the sum[4] array. In C/C++, when an array is partially initialized, the remaining elements are automatically set to zero. With AVX2, sum[0] is explicitly initialized to a 256-bit register with all 8 floats set to 0.0f. The remaining elements sum[1], sum[2], and sum[3] are implicitly initialized to zero.

As per your suggestion, I have defined GGML_F32_VEC_ZERO as {0.0f} for Power. Could you please review?
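The partial-initialization behavior described above can be checked with a small sketch. It assumes a plain struct (`Vec8`, hypothetical) as a stand-in for `__m256` and a hypothetical helper `vec8_zero()` in place of `_mm256_setzero_ps()`, so that it builds without AVX2 support.

```cpp
#include <cstddef>

// Hypothetical stand-in for an 8-lane register type such as __m256.
struct Vec8 {
    float lanes[8];
};

// Hypothetical stand-in for _mm256_setzero_ps(): returns an all-zero value.
inline Vec8 vec8_zero() {
    return Vec8{};
}

// Returns true when every lane of every element is 0.0f after
// initializing only the first array element explicitly.
inline bool partially_initialized_array_is_zero() {
    Vec8 sum[4] = { vec8_zero() };  // sum[1..3] are value-initialized
    for (std::size_t i = 0; i < 4; ++i) {
        for (std::size_t j = 0; j < 8; ++j) {
            if (sum[i].lanes[j] != 0.0f) {
                return false;
            }
        }
    }
    return true;
}
```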

github-actions bot added the ggml changes relating to the ggml tensor library for machine learning label Apr 29, 2025

slaren reviewed Apr 29, 2025


Fix for 13170

f65c87f

Build fails with compilation error on power pc. This patch fixes the same. Tested with unit tests run via cmake --build <build_dir> && cd <build_dir> && make test Signed-off-by: Shalini Salomi Bodapati <[email protected]>

shalinib-ibm force-pushed the main_br branch from 52d32c2 to f65c87f Compare April 30, 2025 06:28

shalinib-ibm requested a review from slaren April 30, 2025 09:29

slaren approved these changes Apr 30, 2025


slaren linked an issue Apr 30, 2025 that may be closed by this pull request

Compile bug: Build fails on ppc64le #13170

Closed

slaren merged commit 4163137 into ggml-org:master Apr 30, 2025
47 checks passed

shalinib-ibm deleted the main_br branch April 30, 2025 11:41
