max_grad_norm does not take effect #304

Open
@yiyepiaoling0715

Description

When running SFT with Firefly, grad_norm is always > 1.
In the DeepSpeed config, gradient clipping (`gradient_clipping`) is set to `auto`.
[Screenshots: image, 1, 2]
max_grad_norm=1.0
[Screenshots: 3, 4]
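When pretraining with Firefly using the same DeepSpeed config, clipping takes effect as expected; only in SFT does the grad_norm limit not take effect.
grad_norm log from pretraining:
[Screenshot: 5]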
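
One thing that may be worth ruling out when reading these logs: a clipping routine can report the norm measured before scaling, in which case a logged value above `max_grad_norm` does not by itself show that clipping was skipped. Below is a minimal pure-Python sketch of global-norm clipping that mirrors the return convention of `torch.nn.utils.clip_grad_norm_` (which returns the total norm computed before clipping). Whether the grad_norm shown in the SFT logs is a pre-clip or post-clip value is not established in this issue; the function name `clip_by_global_norm` and the gradient values are hypothetical, for illustration only.

```python
import math


def clip_by_global_norm(grads, max_norm, eps=1e-6):
    """Scale `grads` in place so their global L2 norm is at most `max_norm`.

    Returns the norm measured BEFORE clipping (hypothetical helper that
    mirrors the return convention of torch.nn.utils.clip_grad_norm_).
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    # Scale factor is capped at 1.0 so small gradients are left untouched.
    scale = min(1.0, max_norm / (total_norm + eps))
    for i in range(len(grads)):
        grads[i] *= scale
    return total_norm


grads = [3.0, 4.0]  # hypothetical gradient values with global norm 5.0
pre_clip_norm = clip_by_global_norm(grads, max_norm=1.0)
post_clip_norm = math.sqrt(sum(g * g for g in grads))

print("reported (pre-clip) norm:", pre_clip_norm)
print("norm after clipping:", post_clip_norm)
```

In this sketch the returned value stays above 1 even though the gradients themselves were scaled down to a norm of roughly 1. Comparing the norm recomputed after the clipping step against the logged value in both the pretrain and SFT runs would show whether the two code paths really differ.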
