fix: Ollama Empty Text Streaming Issue #1294

Open · addypy wants to merge 8 commits into main from fix-ollama-empty-text-streaming
Conversation

@addypy commented Mar 29, 2025

Fix Ollama Empty TextPart Streaming

Description

This PR fixes an issue where streaming responses stopped prematurely when the Ollama model (served through an OpenAI-compatible API) returned empty TextPart responses. The stream incorrectly ended as soon as Ollama emitted an empty text part, before it generated tool calls or additional content.
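For context, the sequence below is a hypothetical, abbreviated sketch of the kind of OpenAI-compatible stream Ollama can produce: a delta with empty content arrives before the tool-call deltas, and that empty delta is what surfaced as an empty TextPart. Field names follow the OpenAI chunk format; the tool name, arguments, and exact payloads are invented:

```python
# Hypothetical, abbreviated chunk sequence from Ollama's OpenAI-compatible
# streaming endpoint. The empty 'content' delta is what pydantic-ai saw as
# an empty TextPart; 'get_weather' and its arguments are invented.
stream_chunks = [
    {'choices': [{'delta': {'role': 'assistant', 'content': ''}}]},
    {'choices': [{'delta': {'tool_calls': [
        {'index': 0, 'function': {'name': 'get_weather',
                                  'arguments': '{"city": "Paris"}'}},
    ]}}]},
    {'choices': [{'delta': {}, 'finish_reason': 'tool_calls'}]},
]
```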

Changes

  • Modified the stream_to_final function in agent.py to be more robust (see the sketch after this list):
    • Only non-empty TextParts are considered as potential final results, using a has_content() check
    • Empty text parts are processed normally instead of ending the stream
    • TextPartDelta events with content are processed properly
  • Updated test snapshots to match the new behavior:
    • Streaming now produces more granular chunks in tests
    • Token counting is more accurate because the complete response is processed
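A minimal sketch of the guard this change introduces, using real pydantic_ai.messages types; the helper below paraphrases the decision stream_to_final now makes rather than reproducing the upstream code verbatim:

```python
from pydantic_ai.messages import PartStartEvent, TextPart

def is_final_text(event: object) -> bool:
    """Treat a TextPart as a final-result candidate only if it has text.

    Previously any TextPart start event was taken as the final result, so
    the empty TextPart Ollama emits before a tool call ended the stream.
    """
    return (
        isinstance(event, PartStartEvent)
        and isinstance(event.part, TextPart)
        and event.part.has_content()  # the check this PR adds
    )

# The empty part no longer looks like a final result:
assert not is_final_text(PartStartEvent(index=0, part=TextPart(content='')))
assert is_final_text(PartStartEvent(index=0, part=TextPart(content='The ')))
```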

Testing

  • Manually tested with an Ollama backend (a reproduction sketch follows this list) to verify:
    • Streaming now continues properly through empty text parts
    • Tool calls work correctly in streaming mode
    • The streaming behavior is consistent with that of other model providers
  • Updated test snapshots to reflect the new behavior:
    • Added an additional text chunk ('The ') in streaming tests
    • Updated token counts (response_tokens: 5→8, total_tokens: 108→111)
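A hedged reproduction sketch along the lines of the manual test: point pydantic-ai at Ollama's OpenAI-compatible endpoint and stream a prompt that triggers a tool call. The model name, URL, and tool are assumptions about a local setup, and the model/provider constructor varies across pydantic-ai versions:

```python
import asyncio

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIModel
from pydantic_ai.providers.openai import OpenAIProvider

# 'llama3.2' and the localhost URL are assumptions about the local setup.
model = OpenAIModel(
    'llama3.2',
    provider=OpenAIProvider(base_url='http://localhost:11434/v1'),
)
agent = Agent(model)

@agent.tool_plain
def get_weather(city: str) -> str:
    """Dummy tool so the model emits a tool call mid-stream."""
    return f'Sunny in {city}'

async def main() -> None:
    async with agent.run_stream('What is the weather in Paris?') as result:
        # Before the fix, the stream ended at Ollama's empty leading
        # TextPart; with it, streaming continues through the tool call.
        async for chunk in result.stream_text(delta=True):
            print(chunk, end='', flush=True)

asyncio.run(main())
```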

Related Issue

Fixes #1292

Additional Notes

This issue primarily affected Ollama model responses, but the fix enhances streaming robustness across all providers by ensuring only meaningful content triggers a final result.

Token counts in the tests increase slightly after this fix. This is expected and correct behavior: previously, empty text chunks caused premature stream termination, so tokens from the rest of the stream went uncounted. With the fix, the complete stream is processed, yielding accurate token counts and proper chunking of the streamed response.
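As a loose illustration of that accounting shift (chunk contents invented): stopping at the first text part, even an empty one, left the remainder of the stream, and its tokens, unprocessed:

```python
# Invented chunks standing in for a streamed response.
chunks = ['', 'The ', 'weather ', 'is ', 'sunny.']

# Old behavior: the first TextPart, even '', was taken as the final
# result, so nothing after it was processed or counted.
old_processed = chunks[:1]

# New behavior: empty parts are no longer final-result candidates,
# so the whole stream is consumed and counted.
new_processed = list(chunks)

print(len(old_processed), 'vs', len(new_processed), 'chunks processed')
```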

@addypy force-pushed the fix-ollama-empty-text-streaming branch 2 times, most recently from 94a56b0 to 4f25359 on Mar 29, 2025 at 16:13
@addypy changed the title from "fix: check TextPart.has_content() before considering it as final result" to "fix: Ollama Empty Text Streaming Issue" on Mar 29, 2025
@addypy force-pushed the fix-ollama-empty-text-streaming branch 3 times, most recently from ccc9733 to b1adec6 on Mar 29, 2025 at 16:25
This fixes an issue with Ollama where empty TextPart responses before
tool calls would prematurely stop the streaming process.

Fixes pydantic#1292
@addypy force-pushed the fix-ollama-empty-text-streaming branch from 48b24a4 to 34a0c04 on Mar 29, 2025 at 16:28
@addypy force-pushed the fix-ollama-empty-text-streaming branch from 1572b0b to 7866a18 on Mar 29, 2025 at 16:42
Successfully merging this pull request may close these issues:

Bug: Streaming stops prematurely after tool call with Ollama due to empty TextPart (#1292)