
sycl: Fix conditional enabling following arch checks for ggml-sycl #14504


Merged · 1 commit · Jul 3, 2025

Conversation

@s-Nick (Collaborator) commented on Jul 2, 2025

PR #13973 intended to enable the optimization for Intel devices by default, but due to a small boolean bug it remained disabled even with GGML_SYCL_DISABLE_OPT=0. This PR fixes that.
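A minimal sketch of the kind of inversion described here, assuming the optimization is gated on the GGML_SYCL_DISABLE_OPT environment variable; the function names are illustrative, not the actual ggml-sycl code:

```cpp
#include <cassert>
#include <cstring>

// Hypothetical illustration: a "disable" env var should only disable the
// optimization when it is set to a non-zero value.

static bool opt_disabled_buggy(const char * env) {
    // Bug: an unset variable or an explicit "0" is treated as "disabled",
    // so the optimization never turns on, even with GGML_SYCL_DISABLE_OPT=0.
    return env == nullptr || std::strcmp(env, "0") == 0;
}

static bool opt_disabled_fixed(const char * env) {
    // Fix: only a set, non-zero value disables the optimization;
    // unset or "0" leaves it enabled by default.
    return env != nullptr && std::strcmp(env, "0") != 0;
}
```

With the fixed predicate, GGML_SYCL_DISABLE_OPT=0 (or leaving the variable unset) keeps the optimization on, which matches the intent of PR #13973.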

Performance comparison on Intel B580

| model | size | params | backend | ngl | sm | test | master t/s | 9edb916 t/s |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| qwen2 1.5B Q4_0 | 1013.62 MiB | 1.78 B | SYCL | 99 | none | pp512 | 8545.67 ± 41.26 | 8559.06 ± 39.47 |
| qwen2 1.5B Q4_0 | 1013.62 MiB | 1.78 B | SYCL | 99 | none | tg128 | 110.97 ± 0.14 | 157.52 ± 0.53 |
| qwen2 1.5B Q4_K - Medium | 1.04 GiB | 1.78 B | SYCL | 99 | none | pp512 | 8653.84 ± 31.87 | 8675.45 ± 75.79 |
| qwen2 1.5B Q4_K - Medium | 1.04 GiB | 1.78 B | SYCL | 99 | none | tg128 | 100.36 ± 0.11 | 137.52 ± 0.20 |
| llama 7B Q4_0 | 3.57 GiB | 6.74 B | SYCL | 99 | none | pp512 | 2249.87 ± 3.22 | 2261.56 ± 4.04 |
| llama 7B Q4_0 | 3.57 GiB | 6.74 B | SYCL | 99 | none | tg128 | 41.60 ± 0.17 | 73.40 ± 0.26 |
| llama 7B Q4_K - Medium | 3.80 GiB | 6.74 B | SYCL | 99 | none | pp512 | 2291.22 ± 1.64 | 2310.27 ± 4.83 |
| llama 7B Q4_K - Medium | 3.80 GiB | 6.74 B | SYCL | 99 | none | tg128 | 33.19 ± 0.14 | 59.42 ± 0.54 |
| gemma2 2B Q4_K - Medium | 1.59 GiB | 2.61 B | SYCL | 99 | none | pp512 | 6306.60 ± 17.54 | 6306.17 ± 23.65 |
| gemma2 2B Q4_K - Medium | 1.59 GiB | 2.61 B | SYCL | 99 | none | tg128 | 70.01 ± 0.77 | 103.74 ± 0.12 |
| phi3 3B Q4_0 | 2.03 GiB | 3.82 B | SYCL | 99 | none | pp512 | 3389.80 ± 2.88 | 3412.82 ± 7.16 |
| phi3 3B Q4_0 | 2.03 GiB | 3.82 B | SYCL | 99 | none | tg128 | 66.12 ± 0.43 | 107.76 ± 0.30 |
| phi3 3B Q4_K - Medium | 2.23 GiB | 3.82 B | SYCL | 99 | none | pp512 | 3527.64 ± 7.33 | 3540.96 ± 9.11 |
| phi3 3B Q4_K - Medium | 2.23 GiB | 3.82 B | SYCL | 99 | none | tg128 | 53.77 ± 0.37 | 79.59 ± 0.36 |
| llama 34B Q6_K | 8.20 GiB | 10.73 B | SYCL | 99 | none | pp512 | 1573.21 ± 2.19 | 1575.07 ± 1.99 |
| llama 34B Q6_K | 8.20 GiB | 10.73 B | SYCL | 99 | none | tg128 | 21.06 ± 0.04 | 23.74 ± 0.06 |

@s-Nick s-Nick requested a review from Alcpz July 2, 2025 13:55
@github-actions bot added labels Jul 2, 2025: ggml (changes relating to the ggml tensor library for machine learning), SYCL (GPU programming language, https://en.wikipedia.org/wiki/SYCL)
@s-Nick s-Nick merged commit 7b63a71 into ggml-org:master Jul 3, 2025
47 of 48 checks passed
gabe-l-hart added a commit to gabe-l-hart/llama.cpp that referenced this pull request Jul 3, 2025
* origin/master:
Fix conditional enabling following arch checks for ggml-sycl (ggml-org#14504)
convert : correct gemma 3n conversion (ggml-org#14450)
kv-cache : use ggml_set_rows (ggml-org#14285)
ggml : fix FA mask dim 2 and 3 (ggml-org#14505)
ggml : remove kompute backend (ggml-org#14501)
CUDA: add dynamic shared mem to softmax, refactor general usage (ggml-org#14497)