How can I specify the split to use for training and validation?
CUDA_VISIBLE_DEVICES=0 MAX_PIXELS=262144 \
swift sft \
    --model LLM-Research/gemma-3-1b-it \
    --train_type full \
    --dataset 'swift/path-vqa#train' \
    --val_dataset 'swift/path-vqa#validation' \
    --torch_dtype bfloat16 \
    --num_train_epochs 3 \
Of course, this fails because #train is treated as a subset, not a split. How can I specify the split?
https://github.com/modelscope/ms-swift/blob/main/swift/llm/dataset/dataset/mllm.py#L174
Use the ':' form, for example 'modelscope/coco_2014_caption:validation'.
Hello @Jintao-Huang, I have to use swift/path-vqa; it is not optional.
https://github.com/modelscope/ms-swift/blob/main/swift/llm/dataset/data/dataset_info.json#L612
You may need to modify the source code to resolve the issue; perhaps the following modification:
https://github.com/modelscope/ms-swift/blob/main/swift/llm/dataset/data/dataset_info.json#L105
{
    "ms_dataset_id": "swift/path-vqa",
    "hf_dataset_id": "flaviagiammarino/path-vqa",
    "subsets": [
        {
            "name": "train",
            "split": ["train"]
        },
        {
            "name": "validation",
            "split": ["validation"]
        }
    ],
    "columns": {
        "question": "query",
        "answer": "response"
    },
    "tags": ["multi-modal", "vqa", "medical"]
},
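To illustrate what the suggested entry is meant to achieve, here is a minimal Python sketch of how a 'dataset_id:subset' string could be mapped to the splits listed under that subset. The helper resolve_splits is hypothetical and written only for this illustration; it is not ms-swift's actual lookup code. If the modification works as intended, the command would presumably pass 'swift/path-vqa:train' to --dataset and 'swift/path-vqa:validation' to --val_dataset, following the ':' form from the earlier reply, but that is an assumption that has not been verified here.

```python
import json

# Entry copied from the suggested dataset_info.json modification
# (with a comma after the "subsets" array so that it parses).
ENTRY = json.loads("""
{
  "ms_dataset_id": "swift/path-vqa",
  "hf_dataset_id": "flaviagiammarino/path-vqa",
  "subsets": [
    {"name": "train", "split": ["train"]},
    {"name": "validation", "split": ["validation"]}
  ],
  "columns": {"question": "query", "answer": "response"},
  "tags": ["multi-modal", "vqa", "medical"]
}
""")


def resolve_splits(spec: str, entry: dict) -> list[str]:
    """Hypothetical helper: map 'dataset_id:subset' to that subset's splits.

    This only mirrors the idea behind the entry; it is NOT ms-swift code.
    """
    dataset_id, sep, subset = spec.partition(":")
    if dataset_id != entry["ms_dataset_id"]:
        raise ValueError(f"unknown dataset id: {dataset_id!r}")
    if not sep or not subset:
        raise ValueError("expected the form 'dataset_id:subset'")
    for item in entry["subsets"]:
        if item["name"] == subset:
            return item["split"]
    raise KeyError(f"subset {subset!r} is not declared for {dataset_id!r}")


if __name__ == "__main__":
    for spec in ("swift/path-vqa:train", "swift/path-vqa:validation"):
        print(spec, "->", resolve_splits(spec, ENTRY))
```

Because each subset name maps to exactly one split in this entry, picking a subset effectively picks the split, which is the workaround the modification relies on.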