Image Processing Fails with Gemma3 Model via Ollama (Khoj 1.38.0) #1145

consulitsk · 2025-03-31T07:19:42Z

Server

Cloud (https://app.khoj.dev)
Self-Hosted Docker
Self-Hosted Python package
Self-Hosted source code

Clients

OS

Khoj version

1.38.0

Describe the bug

We encountered an issue when using the Khoj 1.38.0 version with the Gemma3 model via Ollama, with vision enabled.

Current Behavior

When an image is submitted, the following errors appear in the logs:

khoj.routers.storage: AWS is not enabled. Skipping image upload
khoj.processor.conversation.openai.utils Error code: 400 -
{'error': {'message': 'invalid image input', 'type': 'invalid_request_error', 'param': None, 'code': None}}

Based on the following issue, this should have been fixed already:
#1112

Is there any additional configuration required to properly handle image inputs with vision enabled? Or could this be a regression?

Expected Behavior

Image inputs should be processed correctly when using the Gemma3 model via Ollama with vision enabled. The image should be accepted and analyzed without errors, and no AWS-related upload warnings should appear if cloud storage is not configured or required.

Reproduction Steps

Run Khoj version 1.38.0.
Use the Gemma3 model via Ollama with vision enabled.
Upload an image in a conversation.

Possible Workaround

No response

Additional Information

No response

Link to Discord or Github discussion

No response

consulitsk added the fix Fix something that isn't working as expected label Mar 31, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Image Processing Fails with Gemma3 Model via Ollama (Khoj 1.38.0) #1145

Image Processing Fails with Gemma3 Model via Ollama (Khoj 1.38.0) #1145

consulitsk commented Mar 31, 2025

Image Processing Fails with Gemma3 Model via Ollama (Khoj 1.38.0) #1145

Image Processing Fails with Gemma3 Model via Ollama (Khoj 1.38.0) #1145

Comments

consulitsk commented Mar 31, 2025

Server

Clients

OS

Khoj version

Describe the bug

Current Behavior

Expected Behavior

Reproduction Steps

Possible Workaround

Additional Information

Link to Discord or Github discussion