Description
What problem or use case are you trying to solve?
Right now, the backend mainly relies on `ssh` to communicate with the `sandbox`, which does not fit well with our current Event Stream-based communication. Requiring the backend to know about `ssh` makes it much harder to support different runtimes/docker images (#1387; e.g., we need to automatically install `sshd` if users bring their own docker images without it, and this gets tricky very quickly since `sshd` may need to be installed differently across Linux distributions) and to create a hosted version (#1086).
Describe the UX of the solution you'd like
I imagine the next step in our architecture is to support arbitrary docker images as the sandbox/runtime by creating one piece of software called `od-runtime-client` and automatically installing it into the user-provided sandbox (if it isn't installed already; this is partially done in #2101).
Then, at the entry point of each docker sandbox, `od-runtime-client` will be started:
- It will connect to the backend's event stream (originally planned via WebSocket; it is currently implemented as a REST API) and handle event stream communication to/from other sources. Resolved by Add websocket runtime and od-client-runtime #2603
  - For example, a `CmdRunAction` can be streamed directly from the agent and received by `od-runtime-client`, and `od-runtime-client` can maintain a `pexpect` session that interacts directly with `/bin/bash` to execute that command.
  - This will allow us to get rid of the SSH requirement altogether.
- Be docker-image agnostic to support arbitrary user-provided sandbox images
  - Partially done in Make plugins sandbox-agnostic #2101
  - Partially done in [Arch] Implement EventStream Runtime Client with Jupyter Support using Agnostic Sandbox #2879; requires further testing
  - We need to shrink the set of dependencies that must be installed in the `runtime-client` (right now we install everything opendevin needs), so we can build any user-provided image faster (Remove torch dependency if possible #690 is related): [Runtime] Reduce dependency to speed up CI and reduce image size #3195
- A browser can be maintained as a plugin for `od-runtime-client`, so we can get rid of the current browsing running on the backend (cc @frankxu2004). Done in [Arch] `EventStreamRuntime` supports browser #2899
  - It will also make it easier for a (multimodal) agent to use the browser to view local multimedia files (e.g., go to the URL `file://image.jpg`).
  - It will also be more natural to integrate browser primitives (e.g., `goto_url`) into the `agentskills` library when they are in the same container.
- Maintain an IPython + `agentskills` library with the `od-runtime-client`. Done in [Arch] Implement EventStream Runtime Client with Jupyter Support using Agnostic Sandbox #2879
  - We might also consider merging all the existing plugins into `od-runtime-client`. [Arch] Implement EventStream Runtime Client with Jupyter Support using Agnostic Sandbox #2879
- [ ] Replace `<execute_bash>` with an `execute_bash` skill in `agentskills` (no plan for me).
- Extensively test the EventStream runtime to make sure it has feature parity with our existing `ServerRuntime`:
  - Add tests for sandbox: Migrate multi-line-bash-related sandbox tests into runtime tests and fix multi-line issue #3128
  - Add integration tests: [Arch] Support integration tests using EventStream Runtime #3184
- Make sure `EventStreamRuntime` works across all the evaluation suites. Run SWE-Bench to ensure there's no performance degradation: [Refactor, Evaluation] Refactor and clean up evaluation harness to remove global config and use EventStreamRuntime #3230
- Deprecate and remove `ServerRuntime`, and switch to `EventStreamRuntime` by default: (arch) Switch default runtime to EventStream Runtime #3271
Do you have thoughts on the technical implementation?
Due to the diversity of user-provided docker images we might need to support, I propose we use a package manager like miniforge, which is already multi-platform, to maintain the environment and dependencies of `od-runtime-client` (miniforge can install different Python versions and even maintains its own `glibc` version to circumvent some system-level restrictions, e.g., a `glibc` version that is too old). So the workflow for a user-provided docker sandbox would be (essentially what we implemented in #2101):
- Detect whether the user-provided image already comes with `od-runtime-client` installed
- If not, we create a temporary `Dockerfile` with `FROM user-provided-docker`, then build that image with the suffix `_od`
- Then we use `${SANDBOX_CONTAINER_IMAGE}_od` to start the sandbox, and assume every dependency is met.
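The workflow above can be sketched as a small planning function that decides which image to start and, if needed, what the temporary `Dockerfile` should contain. This is only a sketch under assumptions: the function names, the `has_client` flag (how detection actually works is not specified here), and the placeholder install command are all hypothetical, and the actual `docker build` step is left out.

```python
from typing import Optional, Tuple

OD_SUFFIX = "_od"  # suffix for the rebuilt image, as described above

# Hypothetical placeholder: the real install steps for od-runtime-client
# (e.g. via miniforge) are not specified in this issue.
PLACEHOLDER_INSTALL_CMD = "echo 'install od-runtime-client here'"


def od_image_name(user_image: str) -> str:
    """Return the image name used for the sandbox after the rebuild."""
    return f"{user_image}{OD_SUFFIX}"


def render_dockerfile(user_image: str, install_cmd: str) -> str:
    """Render a temporary Dockerfile that layers the client on the user image."""
    return "\n".join([f"FROM {user_image}", f"RUN {install_cmd}", ""])


def plan_sandbox_image(
    user_image: str,
    has_client: bool,
    install_cmd: str = PLACEHOLDER_INSTALL_CMD,
) -> Tuple[str, Optional[str]]:
    """Return (image_to_start, dockerfile_text or None if no build is needed)."""
    if has_client:
        # The image already ships the client, so it can be used as-is.
        return user_image, None
    return od_image_name(user_image), render_dockerfile(user_image, install_cmd)


if __name__ == "__main__":
    image, dockerfile = plan_sandbox_image("ubuntu:22.04", has_client=False)
    print(image)
    print(dockerfile)
```

Keeping the planning logic separate from the build itself makes it easy to test without a Docker daemon; the caller would write the returned text to a temporary directory and build it under the returned image name.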
Describe alternatives you've considered
Additional context