Issue when running latest Mistral model #4948


Closed · fciannella opened this issue Mar 31, 2025 · 15 comments

@fciannella commented Mar 31, 2025

I am trying to run mistralai/Mistral-Small-3.1-24B-Instruct-2503

root@batch-block7-00842:~/src# pip list | grep -i trans
hf_transfer                       0.1.9
transformers                      4.50.0
root@batch-block7-00842:~/src# pip list | grep -i sg
msgpack                           1.1.0
msgspec                           0.19.0
sgl-kernel                        0.0.5.post3
sglang                            0.4.4.post2         /root/src/sglang/python 

I am getting this error:

[2025-03-31 08:45:36 TP0] The following error message 'operation scheduled before its operands' can be ignored.
[2025-03-31 08:45:36 TP0] Scheduler hit an exception: Traceback (most recent call last):
  File "/app/sglang/python/sglang/srt/managers/scheduler.py", line 1999, in run_scheduler_process
    scheduler = Scheduler(server_args, port_args, gpu_id, tp_rank, dp_rank)
  File "/app/sglang/python/sglang/srt/managers/scheduler.py", line 249, in __init__
    self.tp_worker = TpWorkerClass(
  File "/app/sglang/python/sglang/srt/managers/tp_worker_overlap_thread.py", line 63, in __init__
    self.worker = TpModelWorker(server_args, gpu_id, tp_rank, dp_rank, nccl_port)
  File "/app/sglang/python/sglang/srt/managers/tp_worker.py", line 74, in __init__
    self.model_runner = ModelRunner(
  File "/app/sglang/python/sglang/srt/model_executor/model_runner.py", line 169, in __init__
    self.initialize(min_per_gpu_memory)
  File "/app/sglang/python/sglang/srt/model_executor/model_runner.py", line 179, in initialize
    self.load_model()
  File "/app/sglang/python/sglang/srt/model_executor/model_runner.py", line 392, in load_model
    self.model = get_model(
  File "/app/sglang/python/sglang/srt/model_loader/__init__.py", line 22, in get_model
    return loader.load_model(
  File "/app/sglang/python/sglang/srt/model_loader/loader.py", line 365, in load_model
    model = _initialize_model(
  File "/app/sglang/python/sglang/srt/model_loader/loader.py", line 144, in _initialize_model
    model_class, _ = get_model_architecture(model_config)
  File "/app/sglang/python/sglang/srt/model_loader/utils.py", line 37, in get_model_architecture
    return ModelRegistry.resolve_model_cls(architectures)
  File "/app/sglang/python/sglang/srt/models/registry.py", line 65, in resolve_model_cls
    return self._raise_for_unsupported(architectures)
  File "/app/sglang/python/sglang/srt/models/registry.py", line 32, in _raise_for_unsupported
    raise ValueError(
ValueError: Model architectures ['Mistral3ForConditionalGeneration'] are not supported for now. Supported architectures: dict_keys(['BaichuanForCausalLM', 'ChatGLMModel', 'CLIPModel', 'CohereForCausalLM', 'Cohere2ForCausalLM', 'DbrxForCausalLM', 'DeepseekForCausalLM', 'MultiModalityCausalLM', 'DeepseekV3ForCausalLMNextN', 'DeepseekV2ForCausalLM', 'DeepseekV3ForCausalLM', 'DeepseekVL2ForCausalLM', 'ExaoneForCausalLM', 'GemmaForCausalLM', 'Gemma2ForCausalLM', 'Gemma2ForSequenceClassification', 'Gemma3ForCausalLM', 'Gemma3ForConditionalGeneration', 'GPT2LMHeadModel', 'GPTBigCodeForCausalLM', 'GraniteForCausalLM', 'Grok1ForCausalLM', 'Grok1ModelForCausalLM', 'InternLM2ForCausalLM', 'InternLM2ForRewardModel', 'LlamaForCausalLM', 'Phi3ForCausalLM', 'InternLM3ForCausalLM', 'LlamaForClassification', 'LlamaForCausalLMEagle', 'LlamaForCausalLMEagle3', 'LlamaEmbeddingModel', 'MistralModel', 'LlamaForSequenceClassification', 'LlamaForSequenceClassificationWithNormal_Weights', 'LlavaLlamaForCausalLM', 'LlavaQwenForCausalLM', 'LlavaMistralForCausalLM', 'LlavaVidForCausalLM', 'MiniCPMForCausalLM', 'MiniCPM3ForCausalLM', 'MiniCPMO', 'MiniCPMV', 'MistralForCausalLM', 'MixtralForCausalLM', 'QuantMixtralForCausalLM', 'MllamaForConditionalGeneration', 'OlmoForCausalLM', 'Olmo2ForCausalLM', 'OlmoeForCausalLM', 'Phi3SmallForCausalLM', 'QWenLMHeadModel', 'Qwen2ForCausalLM', 'Qwen2_5_VLForConditionalGeneration', 'Qwen2ForSequenceClassification', 'Qwen2ForCausalLMEagle', 'Qwen2MoeForCausalLM', 'Qwen2ForRewardModel', 'Qwen2VLForConditionalGeneration', 'StableLmForCausalLM', 'TorchNativeLlamaForCausalLM', 'TorchNativePhi3ForCausalLM', 'XverseForCausalLM', 'XverseMoeForCausalLM', 'YiVLForCausalLM'])

Is this a model-support issue, or is it just a matter of updating some files? The 2501 version was working!

fciannella changed the title from "Issue when running latest Mistal model" to "Issue when running latest Mistral model" on Mar 31, 2025
@adarshxs (Contributor) commented Apr 1, 2025

Hey @fciannella, Mistral Small 3.1 is not yet supported; we are in the process of adding it. You can track updates related to it here: Model coverage

@KivenChen (Contributor)

I'm selecting a VLM to run on low-spec compute for a use case -- happy to help if support is needed.

@KivenChen (Contributor)

Hi @fciannella @adarshxs, I just created a PR branch for Mistral 3.1 support (#5099). It currently works for text generation, single-image input, and tensor-parallel serving; multi-image input is yet to be tested.

If you are interested, feel free to take a look and test it out. We'd greatly appreciate anyone's feedback.

@adarshxs (Contributor) commented Apr 7, 2025

Thanks @KivenChen. Cc @mickqian

@fciannella (Author) commented Apr 7, 2025 via email

Still not working for me. Can you add detailed instructions on how to run it?

I am using python3 -m sglang.launch_server --model-path mistralai/Mistral-Small-3.1-24B-Instruct-2503 --host 0.0.0.0 --port 30000 but getting the same error as before.

@KivenChen (Contributor)

Checking out the dev branch (kivenchen/kgl/kiv__m1stral) and installing from source with pip install -e should do the job. If not, the real cause can be found in the server's debug-level log as "Ignore import error...".
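
For concreteness, the steps would look roughly like this (a sketch: the PR number is #5099 from the comment above, the clone URL assumes the main sglang repo, and the local branch name kiv__m1stral is just illustrative):

# fetch the PR branch and install sglang from source in editable mode
git clone https://github.com/sgl-project/sglang.git
cd sglang
git fetch origin pull/5099/head:kiv__m1stral
git checkout kiv__m1stral
pip install -e "python[all]"

# then launch the server as before
python3 -m sglang.launch_server --model-path mistralai/Mistral-Small-3.1-24B-Instruct-2503 --host 0.0.0.0 --port 30000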

@fciannella (Author) commented Apr 8, 2025 via email

@KivenChen (Contributor) commented Apr 8, 2025

@fciannella It seems the chat template isn't registered successfully.

You can either:

  1. Pass --chat-template llama-2 (it is compatible), or
  2. Convert the JSON template to Jinja: copy the template string from chat_template.json, print it so the \n escapes become real newlines, save it as a .jinja file, and use it with --chat-template xxx.jinja (see the sketch below)
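
Option 2 can be scripted in a couple of lines (a sketch: it assumes chat_template.json sits in the downloaded model directory and stores the template under a chat_template key, as Hugging Face processor configs usually do; the .jinja filename is illustrative):

# extract the template string; json.load already turns the \n escapes into real newlines
python3 -c "import json; print(json.load(open('chat_template.json'))['chat_template'])" > mistral-small-3.1.jinja

# pass the file at launch
python3 -m sglang.launch_server --model-path mistralai/Mistral-Small-3.1-24B-Instruct-2503 --chat-template mistral-small-3.1.jinja --host 0.0.0.0 --port 30000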

@fciannella (Author)

Works for me now (I only use it for text for now).

Thank you so much!

@fciannella (Author) commented Apr 10, 2025 via email

@KivenChen (Contributor)

I haven't tested that yet. Since Mistral has its own standards, did their docs mention anything special?

@justicel

FYI using tools doesn't work with the branch as-is.

@KivenChen (Contributor) commented Apr 16, 2025

@justicel It seems this involves sglang's tool call parsers (same for structured output, @fciannella). I'll be back with details. Meanwhile, I'm working on clearing Mistral 3.1's upstream dependencies: #5084

@KivenChen (Contributor)

> FYI using tools doesn't work with the branch as-is.

SGLang actually has a Mistral tool call parser. Have you tried this tool calling template?
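
For reference, the parser is selected at launch time; a sketch (the --tool-call-parser flag and its mistral option are from sglang's function-calling docs; I haven't verified them on this branch):

python3 -m sglang.launch_server --model-path mistralai/Mistral-Small-3.1-24B-Instruct-2503 --tool-call-parser mistral --host 0.0.0.0 --port 30000

Tools are then passed in the standard tools field of an OpenAI-compatible /v1/chat/completions request.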

@KivenChen (Contributor)

FYI, structured output works as expected; tested with the official example.
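
For anyone following along, a sketch of such a request against sglang's OpenAI-compatible endpoint (the schema below is illustrative, not the official example; it assumes sglang accepts a json_schema response_format on /v1/chat/completions, as its structured-output docs describe):

curl http://localhost:30000/v1/chat/completions -H "Content-Type: application/json" -d '{
  "model": "mistralai/Mistral-Small-3.1-24B-Instruct-2503",
  "messages": [{"role": "user", "content": "Name the capital of France as JSON."}],
  "response_format": {"type": "json_schema", "json_schema": {"name": "capital", "schema": {"type": "object", "properties": {"capital": {"type": "string"}}, "required": ["capital"]}}}
}'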

b8zhong closed this as completed May 22, 2025