Add LLaVA OneVision model support #7693

RyanJDick · 2025-02-26T22:21:17Z

Summary

This PR adds support for the LLaVA OneVision model type:

The recommended model is available under the "Starter Models" list.
The LLaVA OneVision VLLM invocation can be used for inference. It supports 0-3 input images along with an input prompt.

Example

Output:

The image is a digital illustration that depicts a surreal landscape with a prominent water tower in the foreground. The tower is tall and cylindrical, with a platform at the top that has a railing. It is surrounded by a grassy field with small white flowers. The sky is filled with various celestial bodies, including a large moon and several smaller moons, creating a dreamlike atmosphere. The clouds are fluffy and scattered across the sky, and the overall color palette is warm, with shades of orange, pink, and blue dominating the scene. The art style is reminiscent of a science fiction or fantasy genre, with a focus on imaginative and fantastical elements.

Related Issues / Discussions

N/A

QA Instructions

Test model installation via starter model list
Test that installed LLaVA models appear in the model list.
Test inference with 0 images
Test inference with 1 image
Test inference with 2 images

Merge Plan

Checklist

The PR has a short but descriptive title, suitable for a changelog
Tests added / updated (if applicable)
Documentation added / updated (if applicable)
Updated What's New copy (if doing a release after this PR)

invokeai/app/invocations/llava_onevision_vllm.py

…ments.

…image field inputs

github-actions bot added python PRs that change python files invocations PRs that change invocations backend PRs that change backend files labels Feb 26, 2025

jazzhaiku self-requested a review February 27, 2025 20:57

github-actions bot added frontend PRs that change frontend files services PRs that change app services python-tests PRs that change python tests labels Mar 11, 2025

RyanJDick force-pushed the ryan/vllm branch from 1395c81 to 847adfe Compare March 12, 2025 21:53

RyanJDick marked this pull request as ready for review March 12, 2025 21:57

RyanJDick requested review from psychedelicious, blessedcoolant, maryhipp, hipsterusername, lstein and brandonrising as code owners March 12, 2025 21:57

hipsterusername approved these changes Mar 12, 2025

View reviewed changes

psychedelicious reviewed Mar 12, 2025

View reviewed changes

invokeai/app/invocations/llava_onevision_vllm.py Outdated Show resolved Hide resolved

RyanJDick force-pushed the ryan/vllm branch from 22511d4 to 9f0f25d Compare March 14, 2025 16:33

RyanJDick and others added 12 commits March 18, 2025 10:20

Add LlavaOnevision model type and probing logic.

ae42095

Add LLaVA Onevision model loading and inference support.

df2e7df

Make LLaVA Onevision node work with 0 images, and other minor improve…

29dea6d

…ments.

Fix copy-paste errors.

e7563d4

Add a LLaVA OneVision starter model.

6b7f34f

Add LLaVA OneVision to Config dropdown in UI

c4e4d74

WIP - model selection for LLaVA

51f7c7a

WIP - model selection for LLaVA

ce8a3fc

Ruff formatting

a33be49

Formatting

8bcf09f

Add max_length=3 to the LLaVA OneVision image input field.

d3dc304

typegen

eb1e62b

fix(nodes): add validator to vllm node images field to handle single …

c604611

…image field inputs

psychedelicious force-pushed the ryan/vllm branch from 9f0f25d to c604611 Compare March 18, 2025 00:44

psychedelicious enabled auto-merge (rebase) March 18, 2025 00:44

psychedelicious merged commit 1f86320 into main Mar 18, 2025
15 checks passed

psychedelicious deleted the ryan/vllm branch March 18, 2025 00:53

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add LLaVA OneVision model support #7693

Add LLaVA OneVision model support #7693

RyanJDick commented Feb 26, 2025 •

edited

Loading

Add LLaVA OneVision model support #7693

Add LLaVA OneVision model support #7693

Conversation

RyanJDick commented Feb 26, 2025 • edited Loading

Summary

Example

Related Issues / Discussions

QA Instructions

Merge Plan

Checklist

RyanJDick commented Feb 26, 2025 •

edited

Loading