
[Agent, LLM] Make sure CodeAct agent produces messages in u/a/u/a order #3193


Merged
merged 7 commits into from
Aug 1, 2024
8 changes: 7 additions & 1 deletion agenthub/codeact_agent/codeact_agent.py
@@ -222,7 +222,13 @@ def _get_messages(self, state: State) -> list[dict[str, str]]:

 # add regular message
 if message:
-    messages.append(message)
+    # handle error if the message is the SAME role as the previous message
+    # litellm.exceptions.BadRequestError: litellm.BadRequestError: OpenAIException - Error code: 400 - {'detail': 'Only supports u/a/u/a/u...'}
+    # there should not have two consecutive messages from the same role
+    if messages and messages[-1]['role'] == message['role']:
Collaborator

I didn't test, but it looks like this will also merge the example message (in_context_sample) with the first user message. If that's the case, will it have some effect on evals, since the first user message is the task?

Collaborator Author

I think the original implementation of CodeAct actually merged these two messages together, and they got separated at some point. I think we had a "-- end of example --" message there, so I think we should be fine?

Collaborator

@enyst Jul 31, 2024

Definitely possible; it looks separated, and we also have a hard-coded "NOW, LET'S START!" after that.

Sorry, I'm running SWE-bench on another branch at the moment; as soon as it's done I can try this a little. But I think it's okay to merge this before then; it doesn't have to wait. Up to you; this is a small thing we can tweak if needed, IMHO.

+        messages[-1]['content'] += '\n\n' + message['content']
+    else:
+        messages.append(message)

 # the latest user message is important:
 # we want to remind the agent of the environment constraints
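To make the review question above concrete, here is a minimal standalone sketch of the same merge rule, pulled out of `_get_messages` into a helper. The helper name `append_or_merge` and the sample message contents are assumptions made for illustration; they are not part of this PR. It shows that an in-context example message and the first task message, both with role `user`, end up as a single `user` message joined by a blank line.

```python
def append_or_merge(messages: list[dict[str, str]], message: dict[str, str]) -> None:
    """Append `message`, or fold it into the previous one when the roles match.

    Hypothetical helper: mirrors the branch added in this diff.
    """
    if messages and messages[-1]['role'] == message['role']:
        # Same role twice in a row: concatenate instead of appending.
        messages[-1]['content'] += '\n\n' + message['content']
    else:
        messages.append(message)


# Illustrative contents only; the real prompts are much longer.
messages: list[dict[str, str]] = [{'role': 'system', 'content': 'You are a helpful agent.'}]
append_or_merge(messages, {'role': 'user', 'content': "Example task...\n\nNOW, LET'S START!"})
append_or_merge(messages, {'role': 'user', 'content': 'Fix typos in bad.txt.'})
append_or_merge(messages, {'role': 'assistant', 'content': 'Opening bad.txt.'})

for m in messages:
    print(m['role'], repr(m['content']))
```

With these inputs the list holds three messages (system, user, assistant), and the task text follows "NOW, LET'S START!" inside the same user message, which is the behavior the reviewers discuss.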
8 changes: 7 additions & 1 deletion agenthub/codeact_swe_agent/codeact_swe_agent.py
@@ -173,7 +173,13 @@ def _get_messages(self, state: State) -> list[dict[str, str]]:

 # add regular message
 if message:
-    messages.append(message)
+    # handle error if the message is the SAME role as the previous message
+    # litellm.exceptions.BadRequestError: litellm.BadRequestError: OpenAIException - Error code: 400 - {'detail': 'Only supports u/a/u/a/u...'}
+    # there should not have two consecutive messages from the same role
+    if messages and messages[-1]['role'] == message['role']:
+        messages[-1]['content'] += '\n\n' + message['content']
+    else:
+        messages.append(message)

 # the latest user message is important:
 # we want to remind the agent of the environment constraints
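Both agents now apply the same rule, so a small check like the following could be used to confirm that a message list never has two consecutive messages from the same role. This is a sketch only: `first_repeated_role` is a hypothetical name, and the `before` and `after` lists are made-up stand-ins for what `_get_messages` might have returned before and after the change.

```python
from typing import Optional


def first_repeated_role(messages: list[dict[str, str]]) -> Optional[int]:
    """Return the index of the first message whose role equals its predecessor's.

    Returns None when no two consecutive messages share a role.
    """
    for i in range(1, len(messages)):
        if messages[i]['role'] == messages[i - 1]['role']:
            return i
    return None


# Assumed shape before the change: example and task are separate user messages.
before = [
    {'role': 'user', 'content': 'in-context example'},
    {'role': 'user', 'content': 'task'},
    {'role': 'assistant', 'content': 'action'},
]
# Assumed shape after the change: the two user messages are merged.
after = [
    {'role': 'user', 'content': 'in-context example\n\ntask'},
    {'role': 'assistant', 'content': 'action'},
]

print(first_repeated_role(before))
print(first_repeated_role(after))
```

The check only looks at adjacent pairs, which matches the constraint quoted in the error message ('Only supports u/a/u/a/u...') as far as same-role repetition goes; it does not verify which role comes first.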
Expand Up @@ -396,8 +396,6 @@ The server is running on port 5000 with PID 126. You can access the list of numb

NOW, LET'S START!

----------

Browse localhost:8000, and tell me the ultimate answer to life. Do not ask me for confirmation at any point.

ENVIRONMENT REMINDER: You have 14 turns left to complete the task. When finished reply with <finish></finish>
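The two deleted lines in each mock prompt log (a `----------` line and the blank line after it) are consistent with the example and the task no longer being separate messages. The sketch below assumes that the logs are produced by joining message contents with a `----------` separator; that format is inferred from these diffs, and `render_prompt_log` and `SEPARATOR` are hypothetical names, not the project's actual test code.

```python
# Assumed log format, inferred from the diffs: messages joined by a dashed line.
SEPARATOR = '\n\n----------\n\n'


def render_prompt_log(messages: list[dict[str, str]]) -> str:
    """Join message contents the way the mock logs appear to be laid out."""
    return SEPARATOR.join(m['content'] for m in messages)


# Before: example and task are separate user messages, so a separator appears.
before_log = render_prompt_log([
    {'role': 'user', 'content': "NOW, LET'S START!"},
    {'role': 'user', 'content': 'Fix typos in bad.txt.'},
])
# After: one merged user message, joined by '\n\n' as in the diff.
after_log = render_prompt_log([
    {'role': 'user', 'content': "NOW, LET'S START!" + '\n\n' + 'Fix typos in bad.txt.'},
])

print(before_log)
print('=====')
print(after_log)
```

Under that assumption, the only difference between the two renderings is the separator between "NOW, LET'S START!" and the task, matching the "0 additions & 2 deletions" shown for each log file.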
Expand Up @@ -396,8 +396,6 @@ The server is running on port 5000 with PID 126. You can access the list of numb

NOW, LET'S START!

----------

Browse localhost:8000, and tell me the ultimate answer to life. Do not ask me for confirmation at any point.

----------
2 changes: 0 additions & 2 deletions tests/integration/mock/CodeActAgent/test_edits/prompt_001.log
Expand Up @@ -396,8 +396,6 @@ The server is running on port 5000 with PID 126. You can access the list of numb

NOW, LET'S START!

----------

Fix typos in bad.txt. Do not ask me for confirmation at any point.

ENVIRONMENT REMINDER: You have 14 turns left to complete the task. When finished reply with <finish></finish>
2 changes: 0 additions & 2 deletions tests/integration/mock/CodeActAgent/test_edits/prompt_002.log
Expand Up @@ -396,8 +396,6 @@ The server is running on port 5000 with PID 126. You can access the list of numb

NOW, LET'S START!

----------

Fix typos in bad.txt. Do not ask me for confirmation at any point.

----------
2 changes: 0 additions & 2 deletions tests/integration/mock/CodeActAgent/test_edits/prompt_003.log
Expand Up @@ -396,8 +396,6 @@ The server is running on port 5000 with PID 126. You can access the list of numb

NOW, LET'S START!

----------

Fix typos in bad.txt. Do not ask me for confirmation at any point.

----------
2 changes: 0 additions & 2 deletions tests/integration/mock/CodeActAgent/test_edits/prompt_004.log
Expand Up @@ -396,8 +396,6 @@ The server is running on port 5000 with PID 126. You can access the list of numb

NOW, LET'S START!

----------

Fix typos in bad.txt. Do not ask me for confirmation at any point.

----------
2 changes: 0 additions & 2 deletions tests/integration/mock/CodeActAgent/test_edits/prompt_005.log
Expand Up @@ -396,8 +396,6 @@ The server is running on port 5000 with PID 126. You can access the list of numb

NOW, LET'S START!

----------

Fix typos in bad.txt. Do not ask me for confirmation at any point.

----------
Expand Up @@ -396,8 +396,6 @@ The server is running on port 5000 with PID 126. You can access the list of numb

NOW, LET'S START!

----------

Use Jupyter IPython to write a text file containing 'hello world' to '/workspace/test.txt'. Do not ask me for confirmation at any point.

ENVIRONMENT REMINDER: You have 14 turns left to complete the task. When finished reply with <finish></finish>
Expand Up @@ -396,8 +396,6 @@ The server is running on port 5000 with PID 126. You can access the list of numb

NOW, LET'S START!

----------

Use Jupyter IPython to write a text file containing 'hello world' to '/workspace/test.txt'. Do not ask me for confirmation at any point.

----------
Expand Up @@ -396,8 +396,6 @@ The server is running on port 5000 with PID 126. You can access the list of numb

NOW, LET'S START!

----------

Use Jupyter IPython to write a text file containing 'hello world' to '/workspace/test.txt'. Do not ask me for confirmation at any point.

----------
Expand Up @@ -396,8 +396,6 @@ The server is running on port 5000 with PID 126. You can access the list of numb

NOW, LET'S START!

----------

Install and import pymsgbox==1.0.9 and print it's version in /workspace/test.txt. Do not ask me for confirmation at any point.

ENVIRONMENT REMINDER: You have 14 turns left to complete the task. When finished reply with <finish></finish>
Expand Up @@ -396,8 +396,6 @@ The server is running on port 5000 with PID 126. You can access the list of numb

NOW, LET'S START!

----------

Install and import pymsgbox==1.0.9 and print it's version in /workspace/test.txt. Do not ask me for confirmation at any point.

----------
Expand Up @@ -396,8 +396,6 @@ The server is running on port 5000 with PID 126. You can access the list of numb

NOW, LET'S START!

----------

Install and import pymsgbox==1.0.9 and print it's version in /workspace/test.txt. Do not ask me for confirmation at any point.

----------
Expand Up @@ -396,8 +396,6 @@ The server is running on port 5000 with PID 126. You can access the list of numb

NOW, LET'S START!

----------

Install and import pymsgbox==1.0.9 and print it's version in /workspace/test.txt. Do not ask me for confirmation at any point.

----------
Expand Up @@ -396,8 +396,6 @@ The server is running on port 5000 with PID 126. You can access the list of numb

NOW, LET'S START!

----------

Write a shell script 'hello.sh' that prints 'hello'. Do not ask me for confirmation at any point.

ENVIRONMENT REMINDER: You have 14 turns left to complete the task. When finished reply with <finish></finish>
Expand Up @@ -396,8 +396,6 @@ The server is running on port 5000 with PID 126. You can access the list of numb

NOW, LET'S START!

----------

Write a shell script 'hello.sh' that prints 'hello'. Do not ask me for confirmation at any point.

----------
Expand Up @@ -396,8 +396,6 @@ The server is running on port 5000 with PID 126. You can access the list of numb

NOW, LET'S START!

----------

Write a shell script 'hello.sh' that prints 'hello'. Do not ask me for confirmation at any point.

----------
Expand Up @@ -396,8 +396,6 @@ The server is running on port 5000 with PID 126. You can access the list of numb

NOW, LET'S START!

----------

Write a shell script 'hello.sh' that prints 'hello'. Do not ask me for confirmation at any point.

----------
Expand Up @@ -587,8 +587,6 @@ exit

NOW, LET'S START!

----------

Fix typos in bad.txt. Do not ask me for confirmation at any point.

ENVIRONMENT REMINDER: You have 14 turns left to complete the task.
Expand Up @@ -587,8 +587,6 @@ exit

NOW, LET'S START!

----------

Fix typos in bad.txt. Do not ask me for confirmation at any point.

----------
Expand Up @@ -587,8 +587,6 @@ exit

NOW, LET'S START!

----------

Fix typos in bad.txt. Do not ask me for confirmation at any point.

----------
Expand Up @@ -587,8 +587,6 @@ exit

NOW, LET'S START!

----------

Fix typos in bad.txt. Do not ask me for confirmation at any point.

----------
60 changes: 29 additions & 31 deletions tests/integration/mock/CodeActSWEAgent/test_edits/prompt_005.log
Expand Up @@ -587,8 +587,6 @@ exit

NOW, LET'S START!

----------

Fix typos in bad.txt. Do not ask me for confirmation at any point.

----------
Expand All @@ -601,13 +599,13 @@ open_file("bad.txt")
----------

OBSERVATION:
[File: /workspace/bad.txt (4 lines total)]
(this is the beginning of the file)
1|This is a stupid typoo.
2|Really?
3|No mor typos!
4|Enjoy!
(this is the end of the file)
[File: /workspace/bad.txt (4 lines total)]
(this is the beginning of the file)
1|This is a stupid typoo.
2|Really?
3|No mor typos!
4|Enjoy!
(this is the end of the file)


----------
Expand All @@ -629,12 +627,12 @@ edit_file_by_replace(
----------

OBSERVATION:
[No exact match found in bad.txt for
```
This is a stupid typoo.
No mor typos!
```
]
[No exact match found in bad.txt for
```
This is a stupid typoo.
No mor typos!
```
]


----------
Expand All @@ -656,14 +654,14 @@ edit_file_by_replace(
----------

OBSERVATION:
[File: /workspace/bad.txt (4 lines total after edit)]
(this is the beginning of the file)
1|This is a stupid typo.
2|Really?
3|No mor typos!
4|Enjoy!
(this is the end of the file)
[File updated (edited at line 1). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
[File: /workspace/bad.txt (4 lines total after edit)]
(this is the beginning of the file)
1|This is a stupid typo.
2|Really?
3|No mor typos!
4|Enjoy!
(this is the end of the file)
[File updated (edited at line 1). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]


----------
Expand All @@ -680,14 +678,14 @@ edit_file_by_replace(
----------

OBSERVATION:
[File: /workspace/bad.txt (4 lines total after edit)]
(this is the beginning of the file)
1|This is a stupid typo.
2|Really?
3|No more typos!
4|Enjoy!
(this is the end of the file)
[File updated (edited at line 3). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]
[File: /workspace/bad.txt (4 lines total after edit)]
(this is the beginning of the file)
1|This is a stupid typo.
2|Really?
3|No more typos!
4|Enjoy!
(this is the end of the file)
[File updated (edited at line 3). Please review the changes and make sure they are correct (correct indentation, no duplicate lines, etc). Edit the file again if necessary.]


ENVIRONMENT REMINDER: You have 10 turns left to complete the task.
Expand Up @@ -587,8 +587,6 @@ exit

NOW, LET'S START!

----------

Use Jupyter IPython to write a text file containing 'hello world' to '/workspace/test.txt'. Do not ask me for confirmation at any point.

ENVIRONMENT REMINDER: You have 14 turns left to complete the task.
Expand Up @@ -587,8 +587,6 @@ exit

NOW, LET'S START!

----------

Use Jupyter IPython to write a text file containing 'hello world' to '/workspace/test.txt'. Do not ask me for confirmation at any point.

----------
Expand Up @@ -587,8 +587,6 @@ exit

NOW, LET'S START!

----------

Install and import pymsgbox==1.0.9 and print it's version in /workspace/test.txt. Do not ask me for confirmation at any point.

ENVIRONMENT REMINDER: You have 14 turns left to complete the task.
Expand Up @@ -587,8 +587,6 @@ exit

NOW, LET'S START!

----------

Install and import pymsgbox==1.0.9 and print it's version in /workspace/test.txt. Do not ask me for confirmation at any point.

----------
Expand Up @@ -587,8 +587,6 @@ exit

NOW, LET'S START!

----------

Install and import pymsgbox==1.0.9 and print it's version in /workspace/test.txt. Do not ask me for confirmation at any point.

----------
Expand Up @@ -587,8 +587,6 @@ exit

NOW, LET'S START!

----------

Write a shell script 'hello.sh' that prints 'hello'. Do not ask me for confirmation at any point.

ENVIRONMENT REMINDER: You have 14 turns left to complete the task.
Expand Up @@ -587,8 +587,6 @@ exit

NOW, LET'S START!

----------

Write a shell script 'hello.sh' that prints 'hello'. Do not ask me for confirmation at any point.

----------
Expand Up @@ -587,8 +587,6 @@ exit

NOW, LET'S START!

----------

Write a shell script 'hello.sh' that prints 'hello'. Do not ask me for confirmation at any point.

----------