GRPO training error #3769

Open
winni0 opened this issue Apr 5, 2025 · 0 comments

winni0 commented Apr 5, 2025

Describe the bug
What the bug is and how to reproduce it, preferably with screenshots.
Command line:
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
NPROC_PER_NODE=8 \
WANDB_API_KEY=XXX \
swift rlhf \
--rlhf_type grpo \
--model /nfs/largemodel/wangjuan/outputs/Qwen2.5-7B-110K-sft5-0403/v2-20250403-092425/checkpoint-7370-merged \
--train_type lora \
--dataset '/nfs/largemodel/wangjuan/data/law_chinese1/DISC-Law-SFT-Pair-QA-released_alpaca.json' \
--torch_dtype bfloat16 \
--num_train_epochs 1 \
--max_length 1024 \
--per_device_train_batch_size 4 \
--per_device_eval_batch_size 4 \
--gradient_accumulation_steps 8 \
--eval_steps 1000 \
--save_steps 1000 \
--learning_rate 1e-6 \
--save_total_limit 2 \
--logging_steps 5 \
--output_dir XXX \
--warmup_ratio 0.05 \
--dataloader_num_workers 4 \
--max_completion_length 1024 \
--reward_funcs format repetition \
--num_generations 4 \
--system '用户和助手之间的每一段对话。用户提出一个问题,助手解决它。助手都要先在脑海中思考推理过程,然后再向用户提供答案。推理过程和答案分别被包裹在<think></think>以及<answer></answer>标签内,即<think>推理过程在此</think><answer>答案在此</answer>' \
--use_vllm true \
--vllm_gpu_memory_utilization 0.5 \
--vllm_max_model_len 2048 \
--deepspeed zero3 \
--temperature 1.0 \
--top_p 1.0 \
--top_k 80 \
--log_completions true \
--num_infer_workers 8 \
--tensor_parallel_size 4 \
--async_generate false \
--move_model_batches 16 \
--offload_optimizer true \
--offload_model true \
--gc_collect_after_offload true \
--report_to 'wandb' \
--sleep_level 1
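
For anyone triaging this, the flags above imply some simple arithmetic that may be worth checking before looking at the stack trace. The sketch below only recomputes numbers from the values in the command. It assumes, as in TRL-style GRPO, that the per-device batch size counts completions and that the global batch should be divisible by `num_generations`. It also assumes `max_length` bounds the prompt, so prompt plus completion should fit in `vllm_max_model_len`. Whether ms-swift enforces either rule in this version is not confirmed here, and the variable names are illustrative only.

```shell
#!/usr/bin/env bash
# Values copied from the command above.
nproc_per_node=8
per_device_train_batch_size=4
gradient_accumulation_steps=8
num_generations=4
max_length=1024
max_completion_length=1024
vllm_max_model_len=2048

# Assumption: batch sizes count completions, not prompts (TRL-style GRPO).
completions_per_step=$((nproc_per_node * per_device_train_batch_size))
completions_per_update=$((completions_per_step * gradient_accumulation_steps))

# Assumption: the global batch must split evenly into groups of num_generations.
prompts_per_step=$((completions_per_step / num_generations))
remainder=$((completions_per_step % num_generations))

# Assumption: max_length bounds the prompt, so prompt + completion is the token budget.
token_budget=$((max_length + max_completion_length))

echo "completions per step:   $completions_per_step"
echo "completions per update: $completions_per_update"
echo "prompts per step:       $prompts_per_step (remainder $remainder)"
echo "token budget:           $token_budget (vllm_max_model_len=$vllm_max_model_len)"
```

Under these assumptions the batch divides evenly and the token budget exactly matches `vllm_max_model_len`, so neither check points to a misconfiguration on its own.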

[Three images attached in the original issue]

Your hardware and system info
Write your system info here, such as CUDA version, OS, GPU model, and torch version.
CUDA 11.5, 8× H100 80 GB

Additional context
Add any other context about the problem here.
