-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Agent, LLM] Make sure codeact agent produce message in u/a/u/a order #3193
Conversation
# handle error if the message is the SAME role as the previous message | ||
# litellm.exceptions.BadRequestError: litellm.BadRequestError: OpenAIException - Error code: 400 - {'detail': 'Only supports u/a/u/a/u...'} | ||
# there should not have two consecutive messages from the same role | ||
if messages and messages[-1]['role'] == message['role']: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't test, but it looks like this will also merge the example message (in_context_sample) with the first user message. If that's the case, will it have some effect on evals, since the first user message is the task?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the original implementation of CodeAct is actually merging this two message together - then it get separated at some moment. I think we had a "-- end of example --" message there, so i think we should be fine?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely possible, it looks separated, and we have also a hard-coded "NOW, LET'S START!" after that.
Sorry, I'm running SWE-bench at the moment on another branch, as soon as it's done I can try this a little bit. But I think merging this is okay before, it doesn't have to wait. Up to you, this is a small thing we can tweak if needed, IMHO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aha, this makes a lot of sense, and I'm happy to see this! I have been watching this exact issue, but I wasn't sure if or how much models actually care.
(Good to keep in mind for the memory work, too, since currently we're playing with an extra user message.)
Unintended early approval, but it's okay. I always notice this, so I'll test it anyway. 😅 |
@enyst We can leave this open a bit longer and potentially run some eval / quick tests |
@xingyaoww It works, the agent doesn't seem confused. We could do the same for CodeActSWE? |
@enyst good point! Let's see if the tests will pass |
What is the problem that this fixes or functionality that this introduces? Does it fix any open issues?
Serving framework for OSS model typically uses a template that require messages comes in user/assistant/user/assitant order, but not user/user or assistant/assistant:
# litellm.exceptions.BadRequestError: litellm.BadRequestError: OpenAIException - Error code: 400 - {'detail': 'Only supports u/a/u/a/u...'}
Give a summary of what the PR does, explaining any non-trivial design decisions
This PR concatenates the message with the same role.
Other references