gemma3 in grpo doesn't work when using lmdeploy #3785

kakao-charlie-cs · 2025-04-07T08:50:58Z

Describe the bug
What the bug is and how to reproduce it, preferably with screenshots

I followed this example script to train gemma3-4b with GRPO:

https://github.com/modelscope/ms-swift/blob/main/examples/train/grpo/full_lmdeploy.sh

I used gemma3-4b, but a no-vision variant, so it is broadly similar to gemma-3-1b-it:

https://huggingface.co/gghfez/gemma-3-4b-novision

An lmdeploy model that is not supported by TurboMind doesn't have a load_weights method.
Therefore, the lines below raised an exception about that missing method.

ms-swift/swift/trainers/rlhf_trainer/grpo_trainer.py

Lines 558 to 565 in 9860d42

    
    if self.infer_rank >= 0:
        if self.args.async_generate:
            self._wait_queue()
        if self.args.use_vllm:
            llm_model = self.engine.inner_model
        else:
            llm_model = self.engine.engine.engine
        llm_model.load_weights(state_dict.items())

qwen2_5-based models work fine (since they are supported by TurboMind), but gemma3 does not work.
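To illustrate the failure mode, here is a minimal sketch of a guard around the weight sync step. It is not the ms-swift implementation: the two model classes are hypothetical stand-ins (one exposing load_weights, one not), and sync_weights is an illustrative helper name, not an ms-swift or lmdeploy API.

```python
class ModelWithLoadWeights:
    """Hypothetical stand-in for an inner model that can reload weights."""

    def __init__(self):
        self.loaded = {}

    def load_weights(self, named_tensors):
        # Accepts an iterable of (name, tensor) pairs, like state_dict.items().
        for name, tensor in named_tensors:
            self.loaded[name] = tensor


class ModelWithoutLoadWeights:
    """Hypothetical stand-in for an inner model that lacks load_weights."""


def sync_weights(llm_model, state_dict):
    """Push the trainer's weights into the inference model, or fail clearly.

    Illustrative helper only: it checks for load_weights up front instead of
    letting the call fail with a bare AttributeError.
    """
    load = getattr(llm_model, "load_weights", None)
    if not callable(load):
        raise NotImplementedError(
            f"{type(llm_model).__name__} has no load_weights method; "
            "this model may not be supported by the selected backend."
        )
    load(state_dict.items())
```

With a guard like this, an unsupported model would fail with a message that points at the backend rather than at a missing attribute.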

Your hardware and system info
Write your system info here, such as CUDA version, OS, GPU model, and torch version

Single node, 8x H200
deepspeed==0.14.5
trl==0.16.1
lmdeploy==0.7.2.post1
torch==2.6.0
CUDA==12.2
ms-swift==3.3.0dev

Additional context
Add any other context about the problem here


RomanticGodVAN · 2025-04-07T11:47:03Z

the same issue

hjh0119 · 2025-04-08T03:13:50Z

The LMDeploy integration currently only works with the TurboMind backend. For models that TurboMind does not support, please use vLLM or the pt backend.

hjh0119 mentioned this issue on Apr 8, 2025 in "grpo lmdeploy warn" #3800 (merged)

kakao-charlie-cs closed this as completed Apr 8, 2025
