
Simplify max_output_tokens handling in LLM classes #9296


Merged: 4 commits from simplify-max-output-tokens into main on Jun 23, 2025

Conversation

@neubig (Contributor) commented on Jun 23, 2025

Summary

This PR simplifies the handling of max_output_tokens in OpenHands LLM classes by removing complex default-setting logic and keeping the value as None when not explicitly set.

Changes Made

Core Changes

  • Removed complex default logic: Eliminated the complex logic in init_model_info() that attempted to determine default values for max_output_tokens based on various conditions
  • Simplified parameter handling: Now if max_output_tokens is None, it remains None instead of being set to computed defaults
  • Conditional parameter inclusion: Updated completion function calls in all LLM classes to include max_output_tokens only when it is not None (see the sketch after this list)
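
A minimal sketch of that conditional-inclusion pattern, assuming a config object with model and max_output_tokens attributes (the function and variable names are illustrative, not copied from openhands/llm/llm.py):

    def build_completion_kwargs(config, messages):
        # Parameters that are always forwarded to the completion call.
        kwargs = {'model': config.model, 'messages': messages}
        # Include max_tokens only when it was explicitly configured;
        # when it is None, omit it so the provider applies its own default.
        if config.max_output_tokens is not None:
            kwargs['max_tokens'] = config.max_output_tokens
        return kwargs

A later commit in this PR simplifies this further by dropping the conditional and passing the value straight through, since litellm.completion accepts None (see the commit messages and the sketch below them).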

Files Modified

  • openhands/llm/llm.py: Removed lines 489-507 containing complex default determination logic, added conditional parameter inclusion
  • openhands/llm/async_llm.py: Added conditional max_tokens parameter inclusion
  • openhands/llm/streaming_llm.py: Added conditional max_tokens parameter inclusion
  • tests/unit/test_llm.py: Updated tests to expect None values instead of computed defaults (see the test sketch after this list)
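
To illustrate the updated expectation, a test along these lines would now assert None rather than a computed default; the import paths and constructor usage below are assumptions for this sketch, not lines taken from tests/unit/test_llm.py:

    from openhands.core.config import LLMConfig  # assumed import path
    from openhands.llm.llm import LLM            # module listed above

    def test_max_output_tokens_stays_none():
        # No explicit max_output_tokens in the config...
        llm = LLM(LLMConfig(model='gpt-4o'))
        # ...so the value stays None instead of a model-specific default.
        assert llm.config.max_output_tokens is None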

Testing

  • ✅ All 32 LLM unit tests pass
  • ✅ Pre-commit hooks pass (ruff, mypy, formatting)
  • ✅ Verified headless mode functionality works with the changes

Benefits

  1. Simplified codebase: Removes unnecessary complexity in determining defaults
  2. Clearer behavior: More predictable behavior when max_output_tokens is not set
  3. Better maintainability: Less complex logic to maintain and debug
  4. Consistent handling: Uniform approach across all LLM implementations

Backward Compatibility

This change maintains backward compatibility - the LLM providers will use their own defaults when max_output_tokens is not provided, which is the expected behavior.

References

  • Addresses user request to simplify max_output_tokens handling
  • Follows the principle of keeping None as None instead of complex default computation

Fixes #9227



To run this PR locally, use the following command:

docker run -it --rm \
  -p 3000:3000 \
  -v /var/run/docker.sock:/var/run/docker.sock \
  --add-host host.docker.internal:host-gateway \
  -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:89ec682-nikolaik \
  --name openhands-app-89ec682 \
  docker.all-hands.dev/all-hands-ai/openhands:89ec682

- Remove complex default-setting logic in init_model_info()
- Keep max_output_tokens as None if not explicitly set
- Use conditional parameter inclusion in completion calls
- Update tests to expect None values instead of computed defaults

This simplifies the codebase by removing unnecessary complexity in determining max_output_tokens defaults.

Remove unnecessary conditional dictionary building. Since litellm.completion accepts Optional[int] = None for both max_completion_tokens and max_tokens parameters, we can simply pass None directly instead of conditionally including the parameters. This makes the code much cleaner and easier to understand.

The code is self-explanatory and doesn't need verbose comments.

The assertions are clear without explanatory comments.
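
As a sketch of that pass-None-directly approach (the wrapper function and config attributes here are assumed for illustration), the configured value can simply be forwarded, because litellm.completion declares these parameters as Optional[int] = None and defers to the provider when they are None:

    import litellm

    def call_completion(config, messages):
        # config.max_output_tokens may be None; litellm.completion accepts
        # Optional[int] for max_completion_tokens and max_tokens, so None
        # simply means "use the provider's own output limit".
        return litellm.completion(
            model=config.model,
            messages=messages,
            max_completion_tokens=config.max_output_tokens,
        )
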
@neubig neubig requested a review from xingyaoww June 23, 2025 03:08
@neubig neubig added the needs-review label (The PR author would like someone to review.) on Jun 23, 2025
Review thread on the removed default logic in openhands/llm/llm.py:

        model in self.config.model
        for model in ['claude-3-7-sonnet', 'claude-3.7-sonnet']
    ):
        self.config.max_output_tokens = 64000  # litellm set max to 128k, but that requires a header to be set
Collaborator:

I think at least this one needs a quick test with 3.7. I’m not at my computer, but I can do that in a bit

Collaborator:

I confirm it works well. I think this was a litellm issue at some point in time, it must have been fixed, and we forgot this code here.

@enyst (Collaborator) commented on Jun 23, 2025

Re: the complex logic here.
Some of it is really old and I've wanted to clean it up before. Some of it is newer and was solving an issue (comment inline).

@enyst enyst (Collaborator) left a comment:

I got LLMs to tell me stories, I also tried to look up the history of this code, and I don't find a reason not to clean it out. So let's go with this. If there are issues, we'll find out. 😅

@enyst enyst removed the needs-review label on Jun 23, 2025
@neubig neubig merged commit 1e33624 into main Jun 23, 2025
28 checks passed
@neubig neubig deleted the simplify-max-output-tokens branch June 23, 2025 10:48
@neubig (Contributor, Author) commented on Jun 23, 2025

Thank you!

Labels: run-eval-50 (Runs evaluation with 50 instances)
Projects: None yet
Development: Successfully merging this pull request may close: [Bug]: Context length is hit when asking simple question: what day is today?
3 participants