[Eval,Arch] Update GPTQ eval and add headless_mode for Controller #2994
Changes from 5 commits
1953579
3706ef9
0d62fba
5ab4904
272120f
ed52fe7
cd5fb63
5261489
1f79140
fd3ae43
ef12cd8
@@ -68,6 +68,7 @@ def __init__(
         max_budget_per_task: float | None = MAX_BUDGET_PER_TASK,
         initial_state: State | None = None,
         is_delegate: bool = False,
+        headless_mode: bool = False,
     ):
         """Initializes a new instance of the AgentController class.
 
@@ -79,10 +80,12 @@ def __init__(
             max_budget_per_task: The maximum budget (in USD) allowed per task, beyond which the agent will stop.
             initial_state: The initial state of the controller.
             is_delegate: Whether this controller is a delegate.
+            headless_mode: Whether the agent is run in headless mode.
         """
         self._step_lock = asyncio.Lock()
         self.id = sid
         self.agent = agent
+        self.headless_mode = headless_mode
 
         # subscribe to the event stream
         self.event_stream = event_stream
@@ -291,7 +294,16 @@ async def _step(self):
             logger.debug(f'[Agent Controller {self.id}] Delegate step done')
             assert self.delegate is not None
             delegate_state = self.delegate.get_agent_state()
-            if delegate_state == AgentState.ERROR:
+            logger.debug(
+                f'[Agent Controller {self.id}] Delegate state: {delegate_state}'
+            )
+            if delegate_state == AgentState.ERROR or (
+                self.headless_mode and delegate_state == AgentState.PAUSED
+            ):
+                # consider PAUSED state as an error if running in headless mode
+                # (since user cannot resume on the web interface so agent will hang forever)
+                # otherwise, PAUSED state is fine
+
                 # close the delegate upon error
                 await self.delegate.close()
                 self.delegate = None

Review comments on the line `self.headless_mode and delegate_state == AgentState.PAUSED`:
I suggest we change
Good idea! Pushed the changes in the latest commits.

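
The delegate check in the `_step` hunk can be read as a small predicate. The following is a minimal, self-contained sketch of that condition, not the project's code: the `AgentState` enum here is a stand-in with only three members, and `should_close_delegate` is a hypothetical helper name introduced for illustration.

```python
from enum import Enum


class AgentState(Enum):
    """Stand-in for the project's AgentState; only the members used here."""

    RUNNING = 'running'
    PAUSED = 'paused'
    ERROR = 'error'


def should_close_delegate(delegate_state: AgentState, headless_mode: bool) -> bool:
    """Hypothetical helper mirroring the condition added in the diff.

    ERROR always closes the delegate. PAUSED closes it only in headless
    mode, where nobody can resume the agent from the web interface.
    """
    if delegate_state == AgentState.ERROR:
        return True
    return headless_mode and delegate_state == AgentState.PAUSED
```

With the web interface available (`headless_mode=False`), a paused delegate is left alone so the user can resume it; in headless mode the same state is treated like an error so the run does not hang.
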
@@ -40,6 +40,7 @@ async def run_agent_controller(
     sandbox: Sandbox | None = None,
     runtime_tools_config: dict | None = None,
     sid: str | None = None,
+    headless_mode: bool = False,
 ) -> State | None:
     """Main coroutine to run the agent controller with task input flexibility.
     It's only used when you launch opendevin backend directly via cmdline.

Review comments on the line `headless_mode: bool = False,`:
Would it be a good idea to default this to
Yes that makes sense. When running OpenDevin via CLI, there's no way to "continue" once the application is paused.
Also agree that this is probably OK.
Done in the latest commit to reduce the amount of changes in this PR.

@@ -49,6 +50,7 @@ async def run_agent_controller(
         exit_on_message: quit if agent asks for a message from user (optional)
         fake_user_response_fn: An optional function that receives the current state (could be None) and returns a fake user response.
         sandbox: An optional sandbox to run the agent in.
+        headless_mode: Whether the agent is run in headless mode.
     """
     # Logging
     logger.info(
@@ -75,6 +77,7 @@ async def run_agent_controller(
         max_budget_per_task=max_budget_per_task,
         event_stream=event_stream,
         initial_state=initial_state,
+        headless_mode=headless_mode,
     )
 
     # runtime and tools
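
The hunks for `run_agent_controller` only thread the new flag from the command-line entry point into the controller's constructor. A rough sketch of that pass-through pattern follows; `SketchController` and `run_sketch` are hypothetical stand-ins, not the real `AgentController` or `run_agent_controller`, and the agent loop is replaced by a placeholder.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class SketchController:
    """Hypothetical stand-in for AgentController; stores only the flag."""

    sid: str
    headless_mode: bool = False


async def run_sketch(sid: str = 'default', headless_mode: bool = False) -> SketchController:
    """Hypothetical stand-in for run_agent_controller.

    The entry point accepts the flag and passes it through unchanged,
    so the controller is the only place that interprets it.
    """
    controller = SketchController(sid=sid, headless_mode=headless_mode)
    await asyncio.sleep(0)  # placeholder for the real agent loop
    return controller
```

A caller without a web interface would then opt in explicitly, for example `asyncio.run(run_sketch(sid='cli', headless_mode=True))`.
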
This duplicates the codeact user response above; could we deduplicate?
Because this is the code we used to run the experiments in our paper, maybe we should leave it as is for reproducibility and improve it in future iterations?
We could, but because the strings are the same, wouldn't deduplicating be functionally equivalent?
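
One way to read the deduplication suggestion is to keep a single module-level string and have both response functions return it. This is only a sketch under stated assumptions: the duplicated code under discussion is not shown in this thread, so the constant name, the message text, and both function names below are hypothetical.

```python
# Hypothetical message text; the real string is not shown in this thread.
SHARED_FAKE_USER_RESPONSE = 'Please continue working on the task.'


def codeact_user_response(state=None) -> str:
    """Hypothetical response function for one agent."""
    return SHARED_FAKE_USER_RESPONSE


def other_agent_user_response(state=None) -> str:
    """Hypothetical second response function that reuses the same string."""
    return SHARED_FAKE_USER_RESPONSE
```

Because both functions return the same string as before, the text seen by the agent is unchanged, which is the sense in which the refactor would be functionally equivalent.
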