Update
Currently, basic text + vision support is already in main. However, there are many known issues across the board.
Known limitations (see Execution Plan below for the full list):
- LoRA / Image quality: Phi4MM depends on LoRA for full image capability, but there are some compatibility issues with the native SGL LoRA solution. We are working on solving this by refactoring / generalizing SGL's LoRA capabilities. (Fixed with "Refactor LoRA handling to support adapter tensors in fused format" #6585, "Fix incorrect LoRA weight loading for fused gate_up_proj" #6734, and "Support LoRA in TestOpenAIVisionServer and fix fused kv_proj loading bug" #6861.) A sketch of the fused adapter layout follows this list.
- Token: Phi4MM supports two types of image token conventions (<|image1|> and <|endoftext10|>); currently we only support the latter. If you use the default chat template, it will automatically pick up the supported one (see the request example after this list).
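To make the "fused format" issue concrete, here is a minimal illustrative sketch of what a gate_up_proj LoRA adapter stored in fused form looks like and how it can be split back into per-projection pieces. This is not SGLang's actual loader code; the dimensions are hypothetical and the gate-first stacking order is an assumption.

```python
# Illustrative sketch only -- not SGLang's actual LoRA loader.
# Shows a "fused" gate_up_proj LoRA adapter and how it can be split,
# assuming gate comes first along the output dimension.
import torch

hidden_size = 3072        # hypothetical model dims for the example
intermediate_size = 8192
rank = 16

# Fused adapter tensors as they might be stored on disk:
#   lora_A projects hidden -> rank (shared by gate and up)
#   lora_B projects rank -> 2 * intermediate (gate and up stacked)
lora_A = torch.randn(rank, hidden_size)
lora_B_fused = torch.randn(2 * intermediate_size, rank)

# Splitting the fused B matrix recovers the per-projection adapters.
lora_B_gate, lora_B_up = lora_B_fused.split(intermediate_size, dim=0)

# Applying the adapter to an activation x of shape [batch, hidden_size]:
x = torch.randn(4, hidden_size)
delta_gate = x @ lora_A.T @ lora_B_gate.T   # [batch, intermediate_size]
delta_up = x @ lora_A.T @ lora_B_up.T       # [batch, intermediate_size]
```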
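For the token convention, the practical takeaway is that you normally do not insert image tokens yourself. Below is a minimal sketch of querying a locally served Phi-4-multimodal-instruct through SGLang's OpenAI-compatible API, assuming the server was launched with something like `python -m sglang.launch_server --model-path microsoft/Phi-4-multimodal-instruct --port 30000`; the port and image URL are placeholders, and the default chat template inserts the supported image token for you.

```python
# Minimal sketch: query a locally served Phi-4-multimodal-instruct via
# SGLang's OpenAI-compatible endpoint. Port and image URL are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:30000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="microsoft/Phi-4-multimodal-instruct",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
                {"type": "text", "text": "Describe this image."},
            ],
        }
    ],
)
print(response.choices[0].message.content)
```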
Motivation
Supporting the Phi4 Multimodal model (https://huggingface.co/microsoft/Phi-4-multimodal-instruct) in SGL.
Execution Plan:
Related resources
No response