Evaluation parameter bug #3770

Open
1212wuhu opened this issue Apr 5, 2025 · 1 comment

Comments

@1212wuhu

1212wuhu commented Apr 5, 2025

Describe the bug
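During evaluation, the model's output length parameter is forcibly reset to 2048. Part of the console output is shown below; it appears that the evalscope backend was used but the parameter override was never applied.
Console output (partial):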

2025-04-05 12:30:05,346 - evalscope - INFO - Args: Task config is provided with TaskConfig type.
2025-04-05 12:30:05,351 - evalscope - INFO - Check the OpenCompass environment: OK
2025-04-05 12:30:05,362 - evalscope - INFO - Dump task config to /home/dataset-assist-0/zgy/swift/eval_output/opencompass/20250405_123005/configs/task_config_0da48a.yaml
2025-04-05 12:30:05,372 - evalscope - INFO - {
    "model": null,
    "model_id": null,
    "model_args": {
        "revision": "master",
        "precision": "torch.float16"
    },
    "template_type": null,
    "chat_template": null,
    "datasets": [],
    "dataset_args": {},
    "dataset_dir": "/root/.cache/modelscope/hub/datasets",
    "dataset_hub": "modelscope",
    "generation_config": {
        "max_length": 2048,
        "max_new_tokens": 512,
        "do_sample": false,
        "top_k": 50,
        "top_p": 1.0,
        "temperature": 1.0
    },
    "eval_type": "checkpoint",
    "eval_backend": "OpenCompass",
    "eval_config": {
        "datasets": [
            "math"
        ],
        "batch_size": 16,
        "work_dir": "/home/dataset-assist-0/zgy/swift/eval_output/opencompass",
        "models": [
            {
                "path": "checkpoint-44301-merged",
                "openai_api_base": "http://127.0.0.1:8000/v1/chat/completions",
                "key": "EMPTY",
                "is_chat": true
            }
        ],
        "limit": 100,
        "time_str": "20250405_123005"
    },
    "stage": "all",
    "limit": null,
    "eval_batch_size": 1,
    "mem_cache": false,
    "use_cache": null,
    "work_dir": "/home/dataset-assist-0/zgy/swift/eval_output/opencompass/20250405_123005",
    "outputs": null,
    "debug": false,
    "dry_run": false,
    "seed": 42,
    "api_url": null,
    "api_key": "EMPTY",
    "timeout": null,
    "stream": false,
    "judge_strategy": "auto",
    "judge_worker_num": 8,
    "judge_model_args": {}
}
2025-04-05 12:30:06,039 - evalscope - INFO - *** Run task with config: /tmp/tmpxd7_zkjj.py 

04/05 12:30:06 - OpenCompass - INFO - Current exp folder: /home/dataset-assist-0/zgy/swift/eval_output/opencompass/20250405_123005
04/05 12:30:07 - OpenCompass - WARNING - SlurmRunner is not used, so the partition argument is ignored.
04/05 12:30:07 - OpenCompass - INFO - Partitioned into 1 tasks.

Script used to run the evaluation:

#!/bin/bash
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
swift eval \
    --model /home/dataset-assist-0/zgy/swift/output_sft/v15-20250402-132218/checkpoint-44301-merged \
    --eval_backend OpenCompass \
    --infer_backend vllm \
    --eval_limit 100 \
    --eval_dataset math \
    --max_model_len 27000 \
    --stream true \
    --tensor_parallel_size 4

As shown, even though max_model_len is specified, the config is still forced to "max_length": 2048 and "max_new_tokens": 512.
The evaluation output shows the same behavior.
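
The mismatch described above can be checked mechanically. The following is a minimal sketch, not part of ms-swift or evalscope: it parses an excerpt of the task config printed in the log and reports which generation limits are smaller than the requested length. The helper name `find_overridden_limits` is hypothetical, and comparing `--max_model_len` against `max_length` and `max_new_tokens` follows the reporter's assumption that these values are meant to correspond.

```python
import json

# Excerpt of the task config printed in the log above (relevant keys only).
LOGGED_CONFIG = """
{
    "generation_config": {
        "max_length": 2048,
        "max_new_tokens": 512
    },
    "eval_backend": "OpenCompass"
}
"""


def find_overridden_limits(config_text, requested_len):
    """Return the generation limits that are smaller than the requested length.

    Hypothetical helper for illustration; it only inspects the logged JSON.
    """
    gen_cfg = json.loads(config_text).get("generation_config", {})
    overridden = {}
    for key in ("max_length", "max_new_tokens"):
        value = gen_cfg.get(key)
        if value is not None and value < requested_len:
            overridden[key] = value
    return overridden


if __name__ == "__main__":
    # 27000 is the value passed via --max_model_len in the script above.
    print(find_overridden_limits(LOGGED_CONFIG, 27000))
```

A non-empty result means the logged config carries limits below the requested value, which is the symptom reported here.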

@wnark

wnark commented Apr 7, 2025

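In addition, backends such as OpenCompass and VLMEvalKit cannot be selected; choosing one raises an error.
Update:
The required libraries need to be installed as the error messages indicate. The basic order is: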

pip install ms-swift -U
pip install evalscope
pip install 'evalscope[opencompass]'
pip install vllm==0.8.0 # ms-swift requires an older version of transformers
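
After running the install steps above, it may help to confirm which versions actually ended up in the environment. This is a small sketch using only the standard library's `importlib.metadata`; the package list is taken from the commands above plus `transformers`, and the function name `installed_versions` is made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names from the pip commands above, plus transformers.
PACKAGES = ("ms-swift", "evalscope", "vllm", "transformers")


def installed_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Including this output in a bug report makes it easier to tell whether a backend error comes from a missing or mismatched dependency.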
