Skip to content

Fix for issue #13170 #13176

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 30, 2025
Merged

Fix for issue #13170 #13176

merged 1 commit into from
Apr 30, 2025

Conversation

shalinib-ibm
Copy link
Contributor

@shalinib-ibm shalinib-ibm commented Apr 29, 2025

Build fails with compilation error on power pc.
This patch fixes the same.

Tested with unit tests run via
cmake --build <build_dir> && cd <build_dir> && make test

Make sure to read the contributing guidelines before submitting a PR

@github-actions github-actions bot added the ggml changes relating to the ggml tensor library for machine learning label Apr 29, 2025
@shalinib-ibm
Copy link
Contributor Author

shalinib-ibm commented Apr 29, 2025

@slaren After the addition of faster kernels for depth wise 2D convolution in ggml-cpu with this commit, c6e8cc2, we see that build fails with compilation error on power pc . This patch fixes the same. can you please review this PR ?

@CISC
Copy link
Collaborator

CISC commented Apr 29, 2025

@shalinib-ibm Any chance of adding PPC to the CI build to more easily catch this in the future?

@shalinib-ibm
Copy link
Contributor Author

@CISC agree that it would be nice to have some sort of CI for power arch. Will have to work on it.

Copy link
Member

@slaren slaren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain why this change is necessary? There are other cases where GGML_F32_VEC is used as a single scalar instead of an array, so it is not clear to me why this is necessary here.

@shalinib-ibm
Copy link
Contributor Author

shalinib-ibm commented Apr 29, 2025

@slaren, in this piece of code, sum is expected to be a vector.
llama.cpp/ggml/src/ggml-cpu/ops.cpp:6120
6120 | GGML_F32_VEC sum = GGML_F32_VEC_ZERO;
But power9 defines
GGML_F32_VEC as a vector float and GGM_F32_VEC_ZERO as a scalar
defined in lines 371, 372, 343,344 gml/src/ggml-cpu/simd-mappings.h#L344) .

So we are making. a vector of vector with size 1. This way it does not break x86 or arm.
Similar style code can be seen here:

GGML_F32_VEC sum[GGML_F32_ARR] = { GGML_F32_VEC_ZERO };

Hence this change is necessary to fix the bug on power architectures.

@slaren
Copy link
Member

slaren commented Apr 29, 2025

GGML_F32_VEC sum[GGML_F32_ARR] = { GGML_F32_VEC_ZERO };

With AVX2 this expands to __mm256 sum[4] = { _mm256_setzero_ps() };, which does not look right to me, it is only initializing the first element with _mm256_setzero_ps().

In any case, it seems to me that the solution would to define GGML_F32_VEC_ZERO to {0.0f} for this architecture, rather than just 0, since it is meant to initialize entire vectors.

Build fails with compilation error on power pc.
This patch fixes the same.

Tested with unit tests run via
 --build <build_dir> && cd <build_dir> && make test

Signed-off-by: Shalini Salomi Bodapati <[email protected]>
@shalinib-ibm
Copy link
Contributor Author

Hi slaren,
{ __m256_setzero_ps() } : This is the initializer for the sum[4] array. In C/C++, when an array is partially initialised, the remaining elements are automatically set to zero. With AVX2, sum[0] is explicitly initialised to a 256 bit register with all 8 floats set to 0.0f. The remaining elements sum[1], sum[2], sum[3] are implicitly initialized to zero.

As per your suggestion, I have defined GGML_F32_VEC_ZERO to {0.0f} for power. Can you please review ?

@shalinib-ibm shalinib-ibm requested a review from slaren April 30, 2025 09:29
@slaren slaren linked an issue Apr 30, 2025 that may be closed by this pull request
@slaren slaren merged commit 4163137 into ggml-org:master Apr 30, 2025
47 checks passed
@shalinib-ibm shalinib-ibm deleted the main_br branch April 30, 2025 11:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ggml changes relating to the ggml tensor library for machine learning
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Compile bug: Build fails on ppc64le
3 participants