Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(Hotfix): Track reason for Error AgentState #7584

Merged
merged 12 commits into from
Mar 31, 2025
12 changes: 6 additions & 6 deletions frontend/__tests__/components/chat/action-suggestions.test.tsx
Original file line number Diff line number Diff line change
Expand Up @@ -22,11 +22,11 @@ vi.mock("#/context/auth-context", () => ({
describe("ActionSuggestions", () => {
// Setup mocks for each test
vi.clearAllMocks();

(useAuth as any).mockReturnValue({
githubTokenIsSet: true,
});

(useSelector as any).mockReturnValue({
selectedRepository: "test-repo",
});
Expand Down Expand Up @@ -66,16 +66,16 @@ describe("ActionSuggestions", () => {
it("should have different prompts for 'Push to Branch' and 'Push & Create PR' buttons", () => {
// This test verifies that the prompts are different in the component
const component = render(<ActionSuggestions onSuggestionsClick={() => {}} />);

// Get the component instance to access the internal values
const pushBranchPrompt = "Please push the changes to a remote branch on GitHub, but do NOT create a pull request. Please use the exact SAME branch name as the one you are currently on.";
const createPRPrompt = "Please push the changes to GitHub and open a pull request. Please create a meaningful branch name that describes the changes.";

// Verify the prompts are different
expect(pushBranchPrompt).not.toEqual(createPRPrompt);

// Verify the PR prompt mentions creating a meaningful branch name
expect(createPRPrompt).toContain("meaningful branch name");
expect(createPRPrompt).not.toContain("SAME branch name");
});
});
});
1 change: 1 addition & 0 deletions frontend/src/i18n/translation.json
Original file line number Diff line number Diff line change
Expand Up @@ -2078,6 +2078,7 @@
"tr": "Ajan hız sınırına ulaştı",
"ja": "エージェントがレート制限中"
},

"CHAT_INTERFACE$AGENT_PAUSED_MESSAGE": {
"en": "Agent has paused.",
"de": "Agent pausiert.",
Expand Down
25 changes: 22 additions & 3 deletions openhands/controller/agent_controller.py
Original file line number Diff line number Diff line change
Expand Up @@ -228,11 +228,14 @@ async def _react_to_exception(
e: Exception,
):
"""React to an exception by setting the agent state to error and sending a status message."""
await self.set_agent_state_to(AgentState.ERROR)
# Store the error reason before setting the agent state
self.state.last_error = f'{type(e).__name__}: {str(e)}'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love the solution overall, we just may want to check if this works with evals, because they use state.last_error with some particular expectations for some errors:
https://github.com/search?q=repo%3AAll-Hands-AI%2FOpenHands%20last_error&type=code

At a look, I'd say it's fine - there's a function that cares about the exception name, and it seems we're setting it here anyway, and we're changing it only for a handful below?

Copy link
Contributor Author

@malhotra5 malhotra5 Mar 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm good point, it looks like we're just saving the last error msg for eval outputs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's a function that cares about the exception name

I can't seem to find this; I only see the following

  1. Callback functions that want to set the last_error attrib to agent state
  2. Evals + resolver that want to store last_error in their outputs

In this case I think this PR should be good to go

Copy link
Collaborator

@enyst enyst Mar 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, and this one:

def is_fatal_evaluation_error(error: str | None) -> bool:

Seems okay, the PR doesn't change those?

Edited to add: it comes from here, using last_error:

if is_fatal_evaluation_error(state.last_error):

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah thanks a bunch for linking this, it gives me peace of mind!

Yes this PR doesn't change those error messages

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You actually make a very good point, this is not straightforward to find, and since we start using last_error, we will modify it sooner or later. Maybe a comment on the eval usage would help in the future?


if self.status_callback is not None:
err_id = ''
if isinstance(e, AuthenticationError):
err_id = 'STATUS$ERROR_LLM_AUTHENTICATION'
self.state.last_error = 'LLM authentication error'
elif isinstance(
e,
(
Expand All @@ -242,14 +245,24 @@ async def _react_to_exception(
),
):
err_id = 'STATUS$ERROR_LLM_SERVICE_UNAVAILABLE'
self.state.last_error = 'LLM service is unavailable'
elif isinstance(e, InternalServerError):
err_id = 'STATUS$ERROR_LLM_INTERNAL_SERVER_ERROR'
self.state.last_error = 'LLM internal server error'
elif isinstance(e, BadRequestError) and 'ExceededBudget' in str(e):
err_id = 'STATUS$ERROR_LLM_OUT_OF_CREDITS'
# Set error reason for budget exceeded
self.state.last_error = 'budget exceeded (out of credits)'
# Use ERROR state with reason instead of separate state
await self.set_agent_state_to(AgentState.ERROR)
return
elif isinstance(e, RateLimitError):
await self.set_agent_state_to(AgentState.RATE_LIMITED)
return
self.status_callback('error', err_id, type(e).__name__ + ': ' + str(e))
self.status_callback('error', err_id, self.state.last_error)

# Set the agent state to ERROR after storing the reason
await self.set_agent_state_to(AgentState.ERROR)

def step(self):
asyncio.create_task(self._step_with_exception_handling())
Expand Down Expand Up @@ -582,8 +595,14 @@ async def set_agent_state_to(self, new_state: AgentState) -> None:
self.event_stream.add_event(self._pending_action, EventSource.AGENT)

self.state.agent_state = new_state

# Create observation with reason field if it's an error state
reason = ''
if new_state == AgentState.ERROR:
reason = self.state.last_error

self.event_stream.add_event(
AgentStateChangedObservation('', self.state.agent_state),
AgentStateChangedObservation('', self.state.agent_state, reason),
EventSource.ENVIRONMENT,
)

Expand Down
3 changes: 2 additions & 1 deletion openhands/events/observation/agent.py
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@ class AgentStateChangedObservation(Observation):
"""This data class represents the result from delegating to another agent"""

agent_state: str
reason: str = ''
observation: str = ObservationType.AGENT_STATE_CHANGED

@property
Expand Down Expand Up @@ -113,7 +114,7 @@ def __str__(self) -> str:
f'repo_instructions={self.repo_instructions[:20]}...',
f'runtime_hosts={self.runtime_hosts}',
f'additional_agent_instructions={self.additional_agent_instructions[:20]}...',
f'date={self.date}'
f'date={self.date}',
]
)
else:
Expand Down
2 changes: 1 addition & 1 deletion openhands/runtime/plugins/vscode/settings.json
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
{
"workbench.colorTheme": "Default Dark Modern",
"workbench.startupEditor": "none"
}
}
30 changes: 30 additions & 0 deletions tests/unit/test_agent_controller.py
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@
from openhands.events.action.agent import RecallAction
from openhands.events.event import RecallType
from openhands.events.observation import (
AgentStateChangedObservation,
ErrorObservation,
)
from openhands.events.observation.agent import RecallObservation
Expand Down Expand Up @@ -216,9 +217,17 @@ def on_event_memory(event: Event):
print(f'state: {state}')
events = list(test_event_stream.get_events())
print(f'event_stream: {events}')
error_observations = test_event_stream.get_matching_events(
reverse=True, limit=1, event_types=(AgentStateChangedObservation)
)
assert len(error_observations) == 1
error_observation = error_observations[0]
assert state.iteration == 3
assert state.agent_state == AgentState.ERROR
assert state.last_error == 'AgentStuckInLoopError: Agent got stuck in a loop'
assert (
error_observation.reason == 'AgentStuckInLoopError: Agent got stuck in a loop'
)
assert len(events) == 11


Expand Down Expand Up @@ -621,6 +630,17 @@ def on_event_memory(event: Event):
state.last_error
== 'RuntimeError: Agent reached maximum iteration in headless mode. Current iteration: 3, max iteration: 3'
)
error_observations = test_event_stream.get_matching_events(
reverse=True, limit=1, event_types=(AgentStateChangedObservation)
)
assert len(error_observations) == 1
error_observation = error_observations[0]

assert (
error_observation.reason
== 'RuntimeError: Agent reached maximum iteration in headless mode. Current iteration: 3, max iteration: 3'
)

assert (
state.metrics.accumulated_cost == 10.0 * 3
), f'Expected accumulated cost to be 30.0, but got {state.metrics.accumulated_cost}'
Expand Down Expand Up @@ -837,6 +857,16 @@ def on_event_memory(event: Event):
== 'LLMContextWindowExceedError: Conversation history longer than LLM context window limit. Consider turning on enable_history_truncation config to avoid this error'
)

error_observations = test_event_stream.get_matching_events(
reverse=True, limit=1, event_types=(AgentStateChangedObservation)
)
assert len(error_observations) == 1
error_observation = error_observations[0]
assert (
error_observation.reason
== 'LLMContextWindowExceedError: Conversation history longer than LLM context window limit. Consider turning on enable_history_truncation config to avoid this error'
)

# Check that the context window exceeded error was raised during the run
assert step_state.has_errored

Expand Down
Loading