Integration test github action #5076
A potential fix has been generated and a draft PR #5077 has been created. Please review the changes.
enyst pushed a commit to enyst/playground that referenced this issue on Nov 23, 2024
enyst added a commit to enyst/playground that referenced this issue on Nov 25, 2024
* Fix issue All-Hands-AI#5076: Integration test github action
* Update integration-runner.yml
* Update integration-runner.yml
* update variables
* use haiku
* use base url
* fix report name
* Fix pr #8: Integration tests (openhands fix issue 5076)
* Revert "Fix pr #8: Integration tests (openhands fix issue 5076)"
  This reverts commit dcd4681.
* Fix pr #8: Integration tests (openhands fix issue 5076)
* use haiku explicitly, in results too
* remove duplicate
* Update .github/workflows/integration-runner.yml
* Revert "Update .github/workflows/integration-runner.yml"
  This reverts commit 7e7200e.
* funny space
* Fix pr #8: Integration tests (openhands fix issue 5076)
* artifact fix
* clean up remote runtimes
* clean up runtimes more aggressively - a bit unexpected though
* Fix pr #8: Integration tests (openhands fix issue 5076)
* fix type issue that was preventing checking results
* try with waiting time
* add eval notes
* increase timeouts
* try with CI local builds
* fix eval output
* set debug
* fix tests!
* fix outputs
* keep details in logs, not github comment
* tweak schedule
* lint-y

---------

Co-authored-by: openhands <[email protected]>
enyst added a commit that referenced this issue on Nov 27, 2024
Co-authored-by: Engel Nyst <[email protected]>
The eval-runner workflow from the .github/workflows directory is too big. Read it all carefully, and note how it's doing two different things: integration test evaluation and SWE-Bench evaluation. Let's split it:
IMPORTANT: