Revamp AgentRejectAction and allow ManagerAgent to handle rejection (#1735)
* Fix AgentRejectAction handling
* Add ManagerAgent to integration tests
* Fix regenerate.sh
* Fix merge
* Update README for micro-agents
* Add test reject to regenerate.sh
* regenerate.sh: Add support for running a specific test and/or agent
* Refine reject schema, and allow ManagerAgent to handle reject
* Add test artifacts for test_simple_task_rejection
* Fix manager agent tests
* Fix README
* test_simple_task_rejection: check final agent state
* Integration test: exit if mock prompt not found
* Update test_simple_task_rejection tests
* Fix test_edits test artifacts after prompt update
* Fix ManagerAgent test_edits
* WIP
* Fix tests
* update test_edits for ManagerAgent
* Skip local sandbox for reject test
* Fix test comparison
tests/integration/mock/ManagerAgent/test_edits/prompt_001.log (+15)
@@ -11,6 +11,15 @@ can do the actual work. A description of each agent is provided below. You MUST
 select one of the delegates below to move towards accomplishing the task, and you MUST
 provide the correct inputs for the delegate you select.
 
+Note: the delegated agent either returns "finish" or "reject".
+- If the action is "finish", but the full task is not done yet, you should
+continue to delegate to one of the agents below to until the full task is finished.
+- If the action is "reject", it means the delegated agent is not capable of the
+task you send to. You should revisit the input you send to the delegate, and consider
+whether any other delegate would be able to solve the task. If you cannot find
+a proper delegate agent, or the delegate attempts keep failing, call the `reject`
+action.
+
 ## Agents
 
 ### CoderAgent
@@ -62,6 +71,9 @@ They are time-ordered, with your most recent action at the bottom.
 
 [[{"source": "user", "action": "message", "args": {"content": "Fix typos in bad.txt. Do not ask me for confirmation at any point.", "wait_for_response": false}}, {"observation": "null", "content": "", "extras": {}}]]
 
+If the last item in the history is an error, you should try to fix it. If you
+cannot fix it, call the `reject` action.
+
 ## Available Actions
 * `delegate` - send a task to another agent from the list provided. Arguments:
 * `agent` - the agent to which the task is delegated. MUST match a name in the list of agents provided.
@@ -70,6 +82,9 @@ They are time-ordered, with your most recent action at the bottom.
 * `finish` - if you're absolutely certain that you've completed your task, use the finish action to stop working. Arguments:
 * `outputs` - a dictionary representing the outputs of your task, if any
 
+* `reject` - reject the task. Arguments:
+* `outputs` - a dictionary with only a `reason` attribute
+
 
 ## Format
 Your response MUST be in JSON format. It must be an object, and it must contain two fields:
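
Review note: the `reject` action added to this prompt takes an `outputs` dictionary with only a `reason` attribute. The sketch below shows what such a response could look like and how a test might check it. It is an illustration under assumptions, not code from this commit: the top-level field names `action` and `args` are inferred from the history entries shown in the prompt (the prompt's own list of fields is cut off in this excerpt), and `parse_reject` is a hypothetical helper name.

```python
import json


def parse_reject(raw: str) -> str:
    """Return the rejection reason from a JSON response, or raise ValueError.

    Hypothetical helper: assumes the response is an object with exactly
    the two fields "action" and "args", as in the prompt's history entries.
    """
    response = json.loads(raw)
    if not isinstance(response, dict) or set(response) != {"action", "args"}:
        raise ValueError("response must be an object with 'action' and 'args'")
    if response["action"] != "reject":
        raise ValueError("not a reject action")
    args = response["args"]
    if not isinstance(args, dict):
        raise ValueError("'args' must be an object")
    outputs = args.get("outputs")
    # The prompt asks for a dictionary with only a `reason` attribute.
    if not isinstance(outputs, dict) or set(outputs) != {"reason"}:
        raise ValueError("'outputs' must contain only a 'reason' attribute")
    return outputs["reason"]


# Example reject response; the reason text is made up for illustration.
example = json.dumps(
    {
        "action": "reject",
        "args": {"outputs": {"reason": "No delegate can edit binary files."}},
    }
)
print(parse_reject(example))
```

Checking that `outputs` holds nothing besides `reason` mirrors the wording of the prompt; a looser check that only requires `reason` to be present would also be a reasonable choice.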
tests/integration/mock/ManagerAgent/test_edits/prompt_004.log (+1 -1)
@@ -52,7 +52,7 @@ as well as observations you've made. This only includes the MOST RECENT
 actions and observations--more may have happened before that.
 They are time-ordered, with your most recent action at the bottom.
 
-[[{"source": "agent", "action": "read", "args": {"path": "./bad.txt", "start": 0, "end": -1, "thought": ""}}, {"source": "agent", "observation": "read", "content": "This is a stupid typoo.\nReally?\nNo mor typos!\nEnjoy!\n", "extras": {"path": "./bad.txt"}}], [{"source": "agent", "action": "message", "args": {"content": "The content of 'bad.txt' has been read. The next step is to identify and fix the typos.", "wait_for_response": false}}, {"observation": "null", "content": "", "extras": {}}]]
+[[{"source": "agent", "action": "read", "args": {"path": "./bad.txt", "start": 0, "end": -1, "thought": ""}}, {"source": "agent", "observation": "read", "content": "This is a stupid typoo.\nReally?\nNo mor typos!\nEnjoy!\n", "extras": {"path": "./bad.txt"}}], [{"source": "agent", "action": "message", "args": {"content": "The content of 'bad.txt' has been read. The following typos have been identified:\n1. 'typoo' should be 'typo'\n2. 'mor' should be 'more'\nI will now proceed to fix these typos and overwrite the file with the corrected content.", "wait_for_response": false}}, {"observation": "null", "content": "", "extras": {}}]]
 
 ## Format
 Your response MUST be in JSON format. It must be an object, and it must contain two fields:
tests/integration/mock/ManagerAgent/test_edits/prompt_005.log (+1 -1)
@@ -52,7 +52,7 @@ as well as observations you've made. This only includes the MOST RECENT
 actions and observations--more may have happened before that.
 They are time-ordered, with your most recent action at the bottom.
 
-[[{"source": "agent", "action": "read", "args": {"path": "./bad.txt", "start": 0, "end": -1, "thought": ""}}, {"source": "agent", "observation": "read", "content": "This is a stupid typoo.\nReally?\nNo mor typos!\nEnjoy!\n", "extras": {"path": "./bad.txt"}}], [{"source": "agent", "action": "message", "args": {"content": "The content of 'bad.txt' has been read. The next step is to identify and fix the typos.", "wait_for_response": false}}, {"observation": "null", "content": "", "extras": {}}], [{"source": "agent", "action": "write", "args": {"path": "./bad.txt", "content": "This is a stupid typo.\nReally?\nNo more typos!\nEnjoy!\n", "start": 0, "end": -1, "thought": ""}}, {"source": "agent", "observation": "write", "content": "", "extras": {"path": "./bad.txt"}}]]
+[[{"source": "agent", "action": "read", "args": {"path": "./bad.txt", "start": 0, "end": -1, "thought": ""}}, {"source": "agent", "observation": "read", "content": "This is a stupid typoo.\nReally?\nNo mor typos!\nEnjoy!\n", "extras": {"path": "./bad.txt"}}], [{"source": "agent", "action": "message", "args": {"content": "The content of 'bad.txt' has been read. The following typos have been identified:\n1. 'typoo' should be 'typo'\n2. 'mor' should be 'more'\nI will now proceed to fix these typos and overwrite the file with the corrected content.", "wait_for_response": false}}, {"observation": "null", "content": "", "extras": {}}], [{"source": "agent", "action": "write", "args": {"path": "./bad.txt", "content": "This is a stupid typo.\nReally?\nNo more typos!\nEnjoy!\n", "start": 0, "end": -1, "thought": ""}}, {"source": "agent", "observation": "write", "content": "", "extras": {"path": "./bad.txt"}}]]
 
 ## Format
 Your response MUST be in JSON format. It must be an object, and it must contain two fields:

-"content": "The content of 'bad.txt' has been read. The next step is to identify and fix the typos.",
+"content": "The content of 'bad.txt' has been read. The following typos have been identified:\n1. 'typoo' should be 'typo'\n2. 'mor' should be 'more'\nI will now proceed to fix these typos and overwrite the file with the corrected content.",