OpenDevin checkout to reproduce CodeActAgent 1.3 (25% accuracy on SWE-Bench Lite) #2319
Comments
@mihaela-bornea are you looking for the exact OpenDevin version running that CodeAct version? Or would just running the latest 0.6 work for you?
No, this issue can be closed. The question was actually about running CodeAct as a standalone feature, without OpenDevin. It was answered in Discord: that is not supported or possible.
@mamoodi That is correct. I was looking for the OpenDevin version running that CodeAct version. More precisely, the OpenDevin code for CodeAct 1.3, the one you used to obtain the 25%. Right now, if I clone the repo, I will run CodeAct v1.5. Which past version do I need to check out for CodeAct 1.3?
Version 1.3 came out with this commit: If you click through the files you're interested in, you can use, e.g., the "Blame" feature on GitHub to find when a line, like the one containing VERSION, last changed. From there you can traverse back. That is just one way to find it.
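For anyone who prefers doing the same search locally instead of through the "Blame" view, here is a minimal sketch, assuming a local clone with `git` on the PATH. It uses git's pickaxe search (`git log -S`) to list the commits that added or removed a given string in a given file. The function names are made up for illustration, and the search string and file path are placeholders you would need to supply yourself; nothing here is taken from the OpenDevin repo.

```python
import subprocess
from typing import List, Tuple


def pickaxe_command(needle: str, path: str) -> List[str]:
    """Build a `git log -S` command for commits that added or removed `needle` in `path`."""
    return [
        "git", "log",
        f"-S{needle}",            # pickaxe: commits where the count of `needle` changed
        "--format=%h %ad %s",     # short hash, author date, subject
        "--date=short",
        "--", path,
    ]


def parse_log(output: str) -> List[Tuple[str, str, str]]:
    """Turn the log output into (short_hash, date, subject) tuples."""
    commits = []
    for line in output.splitlines():
        if not line.strip():
            continue
        parts = line.split(" ", 2)
        parts += [""] * (3 - len(parts))  # tolerate an empty subject
        commits.append((parts[0], parts[1], parts[2]))
    return commits


def find_commits(repo_dir: str, needle: str, path: str) -> List[Tuple[str, str, str]]:
    """Run the search inside `repo_dir` (a local clone) and return the parsed commits."""
    result = subprocess.run(
        pickaxe_command(needle, path),
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return parse_log(result.stdout)


# Hypothetical usage (placeholders, not real repo paths):
# for short_hash, date, subject in find_commits("/path/to/clone", "<version string>", "<agent file>"):
#     print(short_hash, date, subject)
```

With git's default ordering the newest commit comes first, so the last entry is normally the one that introduced the string and the entry before it is the one that replaced it. Any commit in between can then be checked out with `git checkout <hash>`.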
OK, thanks. I was not sure whether I should use the first or the last commit with CodeAct 1.3. Just to confirm: the 25% result was obtained with the commit you posted above (a84d19f)?
Hmm... actually, I think I had a misunderstanding:
OK, so which commit do you recommend I use to get as close as possible to your 25% result?
If you intend to run evaluations, then the CodeActSWEAgent commits are the ones to use.
I actually intend to run inference with OpenDevin CodeAct and obtain the same patches as the ones in your 25% experiment. When I evaluate these patches, I expect to obtain 25%. I am not concerned about the evaluation script, as I understand how to run the evaluation.
Hi @tobitege (cc @neubig) - what do we need to do to reproduce the We want to do the following:
Thanks!
I see. And the 26.3 is with hints?
@avisil Yes, that's correct! I'm going to close this for now. Feel free to re-open if there are any further questions!
Thanks so much for your responses!
BTW, feel free to join our Slack (https://bit.ly/OpenDevin-Slack) if you have any other questions! We have a #swe-bench-eval channel for quick discussion there :)
Describe your question
Hello. Can you please clarify which version of the code I need to check out to run CodeAct 1.3?
I would like to reproduce the results here
Thanks