✨ Replace max_iterations with max_llm_calls #628
base: main
Conversation
fabianvf commented Feb 6, 2025
- Add a counter to the model provider to optionally limit the number of requests that can be made.
- Remove all references to max_iterations (except in the rpc_server params; we can remove that in a few releases).
```diff
@@ -22,7 +22,17 @@
 LOG = get_logger(__name__)


 class LLMCallBudgetReached(Exception):
     def __init__(
         self, message: str = "The defined LLM call budget has been reached"
```
Should we say the budget that was set?
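A minimal sketch of what that suggestion could look like: the exception accepts the configured budget and interpolates it into the message. The `budget` parameter and attribute are assumptions for illustration, not the PR's actual signature.

```python
class LLMCallBudgetReached(Exception):
    """Raised when the configured number of LLM calls has been exhausted."""

    def __init__(self, budget: int = -1):
        # Include the configured budget in the message so logs show
        # which limit was hit (hypothetical shape, per the review comment).
        message = f"The defined LLM call budget ({budget}) has been reached"
        super().__init__(message)
        self.budget = budget
```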
```diff
 class ModelProvider:

     llm_call_budget: int = -1
```
Is this approach going to cause issues with Jonah's async PR?
I don't think so. Each LLM call can still check whether the budget has been reached before actually making the call. Asyncio is cooperatively multitasked.
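The reasoning above can be sketched as follows. This is a hypothetical outline, not the PR's implementation: the names (`invoke`, `llm_calls_made`, `_call_model`) are assumptions. The key point is that the check-then-increment has no `await` between its two steps, so under asyncio's cooperative scheduling no other task can interleave and corrupt the count.

```python
import asyncio


class LLMCallBudgetReached(Exception):
    """Raised when the configured number of LLM calls is exhausted."""


class ModelProvider:
    def __init__(self, llm_call_budget: int = -1):
        self.llm_call_budget = llm_call_budget  # -1 disables the limit
        self.llm_calls_made = 0

    async def invoke(self, prompt: str) -> str:
        # No await point between the check and the increment, so
        # cooperative asyncio tasks cannot interleave here.
        if 0 <= self.llm_call_budget <= self.llm_calls_made:
            raise LLMCallBudgetReached()
        self.llm_calls_made += 1
        return await self._call_model(prompt)

    async def _call_model(self, prompt: str) -> str:
        # Stand-in for the real model request.
        return f"response to: {prompt}"
```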
I don't know. Wouldn't multiple calls to get_code_plan_solution cause the budget for ALL calls in the system to be reset? (AFAICT the model provider is shared.)
Are we supporting multiple calls? The whole system kind of breaks down in that case anyway, doesn't it?
It is, but my other thought was that it might be nice for this not to be one more thing we have to remember if/when we do support multiple requests.
But maybe we'll have to re-architect the initialization of task managers/agents at that point anyway.
```diff
@@ -463,7 +456,9 @@ class GetCodeplanAgentSolutionParams(BaseModel):
     file_path: Path
     incidents: list[ExtendedIncident]

     # Deprecated in favor of llm_call_budget
```
https://docs.pydantic.dev/latest/concepts/fields/#deprecated-fields You can make this a bona fide deprecated field if you want
Or you could alias the other field to the new one
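The aliasing idea could look something like this with Pydantic v2's `AliasChoices`: the old `max_iterations` key is accepted as an input alias that populates the new `llm_call_budget` field. This wiring is a sketch of the suggestion, not code from the PR.

```python
from pydantic import AliasChoices, BaseModel, Field


class GetCodeplanAgentSolutionParams(BaseModel):
    # Accept both the new name and the deprecated max_iterations key
    # as input; both populate llm_call_budget.
    llm_call_budget: int = Field(
        default=-1,
        validation_alias=AliasChoices("llm_call_budget", "max_iterations"),
    )
```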
+1 that's a good idea