
Support new arch of GLM4 models #2991


Open: wants to merge 4 commits into main

Conversation

guoqingbao (Contributor)

The latest GLM-4 (0414 version) uses a different architecture. The existing GLM-4 implementation is not compatible with the GLM-4-0414 series. This PR adds support for the new architecture.

Test case

cargo run --example glm4_new --release --features cuda -- --weight-path /home/data/GLM-4-9B-0414 --prompt "How are you today?"
   Compiling candle-examples v0.9.1 (/home/bob/candle/candle-examples)
    Finished `release` profile [optimized] target(s) in 4.31s
     Running `target/release/examples/glm4_new --weight-path /home/data/GLM-4-9B-0414 --prompt 'How are you today?'`
avx: true, neon: false, simd128: false, f16c: true
temp: 0.80 repeat-penalty: 1.20 repeat-last-n: 64
retrieved the files in 159.358088ms
loaded the model in 3.527865914s
starting the inference loop
How are you today?
I'm just a computer program, so I don't have feelings or emotions. But thank you for asking! How can I assist you today?

31 tokens generated (28.97 token/s)

greenrazer (Collaborator)

I think the new example you created should be combined with the existing glm4 example using some switching logic similar to the gemma example here.

Otherwise, it looks good!

guoqingbao (Contributor, Author)

> I think the new example you created should be combined with the existing glm4 example using some switching logic similar to the gemma example here.
>
> Otherwise, it looks good!

Thanks for the feedback, I will revise this.

guoqingbao (Contributor, Author)

> I think the new example you created should be combined with the existing glm4 example using some switching logic similar to the gemma example here.
>
> Otherwise, it looks good!

As suggested, I've integrated both the old and new GLM4 into a single example, using the `which` argument to distinguish between the two architectures. I also fixed issues related to EOS tokens and the chat template for the old GLM4, since that model uses multiple EOS tokens and still requires the chat template to produce correct generation results.
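The switching logic described above could look something like the following sketch. The variant names, CLI strings, and `Which` enum here are illustrative assumptions, not the exact identifiers used in the PR:

```rust
// Hypothetical sketch of a `--which` model selector for the combined glm4
// example. Names are illustrative; the actual PR may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Which {
    Glm4,    // original GLM-4 architecture
    Glm4New, // GLM-4-0414 series
}

impl std::str::FromStr for Which {
    type Err = String;
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "glm4" => Ok(Which::Glm4),
            "glm4-new" | "glm4-0414" => Ok(Which::Glm4New),
            other => Err(format!("unknown model variant: {other}")),
        }
    }
}

fn main() {
    // In the real example this value would come from a CLI argument.
    let which: Which = "glm4-new".parse().unwrap();
    match which {
        Which::Glm4 => println!("loading the original GLM-4 model"),
        Which::Glm4New => println!("loading a GLM-4-0414 model"),
    }
}
```

A plain enum with a `FromStr` impl keeps the dispatch explicit and avoids pulling in extra argument-parsing machinery for the variant switch itself.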

greenrazer (Collaborator) left a comment

The functionality looks good overall.
I noticed this introduces a dependency on the either crate. To keep our dependency tree minimal, please avoid adding new dependencies if possible. A simple custom enum could replace the Either usage here.
Thanks for working on this PR! :)

guoqingbao requested a review from greenrazer on July 4, 2025 at 09:33
guoqingbao (Contributor, Author)

> The functionality looks good overall. I noticed this introduces a dependency on the either crate. To keep our dependency tree minimal, please avoid adding new dependencies if possible. A simple custom enum could replace the Either usage here. Thanks for working on this PR! :)

Thanks for the comments. I have removed the either crate by using a custom EosTokenId type and a matching deserialization pattern.
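A custom enum like the one below could stand in for `either::Either` here, since the `eos_token_id` config field may hold either a single id or a list of ids. This is a minimal sketch; the variant names and the example token ids are assumptions, and the actual PR would pair this with a serde `Deserialize` impl (e.g. via `#[serde(untagged)]`), which is omitted to keep the sketch dependency-free:

```rust
// Hypothetical replacement for Either<u32, Vec<u32>> when parsing
// eos_token_id from a model config; names and ids are illustrative.
#[derive(Debug, Clone, PartialEq)]
enum EosTokenId {
    Single(u32),
    Multiple(Vec<u32>),
}

impl EosTokenId {
    /// Returns true if `token` is one of the EOS token ids,
    /// regardless of which variant the config provided.
    fn contains(&self, token: u32) -> bool {
        match self {
            EosTokenId::Single(id) => *id == token,
            EosTokenId::Multiple(ids) => ids.contains(&token),
        }
    }
}

fn main() {
    // Illustrative ids only, not necessarily GLM-4's real EOS tokens.
    let eos = EosTokenId::Multiple(vec![151_329, 151_336, 151_338]);
    assert!(eos.contains(151_336));
    assert!(!eos.contains(42));
    println!("EOS membership check passed");
}
```

The generation loop can then call a single `contains` check at each step instead of branching on `Either::Left`/`Either::Right`, which keeps the stop-token logic in one place.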
