Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(fix) EventStreamRuntime: fix config passing on init (fixes test_runtime.py errors) #3233

Merged
merged 1 commit into from
Aug 3, 2024

Conversation

tobitege
Copy link
Collaborator

@tobitege tobitege commented Aug 2, 2024

What is the problem that this fixes or functionality that this introduces? Does it fix any open issues?

  • EventStreamRuntime in its initialization did not copy over the passed config values to its own self.config, thus the base runtime did not get e.g. a changed value like run_as_devin as it happens in test_runtime.py (see: test_ipython_multi_user).
  • Added more logging to runtime
  • Changed max retry wait from 600 to 60 seconds as it otherwise would wait a painfully long time

Give a summary of what the PR does, explaining any non-trivial design decisions

In test_runtime.py several tests will be called twice for run_as_devin of True and False (root).
Even though the _load_runtime fixture instantiates a new EventStreamRuntime each time, the underlying base Runtime is only instantiated once and thus kept the same config values after the first instantiation.

Example: if test test_ipython_multi_user was first called with run_as_devin being True, the second run with fixture value of False would still cause the command in the container to be run with opendevin instead of root.

@tobitege tobitege added backend Related to backend test labels Aug 2, 2024
@tobitege tobitege requested a review from xingyaoww August 2, 2024 22:04
@tobitege
Copy link
Collaborator Author

tobitege commented Aug 2, 2024

Example log to illustrate the error in test_runtime.py (added extra line breaks for readability):
EventStreamRuntime config is output for each test, but Runtime config: only for the first test, thus retaining original values and the 2nd test is still run internally with opendevin instead of root and failing.

Running test: test_ipython_multi_user[EventStreamRuntime-True]
########################################################################
23:24:38 - opendevin:WARNING: test_runtime.py:138 - `ghcr.io/opendevin/sandbox:main` is not an od_runtime image. Will use `ubuntu:22.04` as the container image for testing.

23:24:38 - opendevin:INFO: test_runtime.py:141 - EventStreamRuntime config: AppConfig(llms={'llm': LLMConfig(model='gpt-4o', api_key='******', base_url=None, api_version=None, embedding_model='local', 
embedding_base_url=None, embedding_deployment_name=None, aws_access_key_id='******', 
aws_secret_access_key='******', aws_region_name=None, num_retries=10, retry_multiplier=2, retry_min_wait=3, 
retry_max_wait=300, timeout=None, max_message_chars=10000, temperature=0, top_p=0.5, custom_llm_provider=None, 
max_input_tokens=None, max_output_tokens=None, input_cost_per_token=None, output_cost_per_token=None, 
ollama_base_url=None, drop_params=None, on_cancel_requested_fn=None)}, agents={'agent': 
AgentConfig(memory_enabled=False, memory_max_threads=2, llm_config=None)}, default_agent='CodeActAgent', 
sandbox=SandboxConfig(env={}, box_type='ssh', container_image='ghcr.io/opendevin/sandbox:main', user_id=1000, 
timeout=120, enable_auto_lint=False, use_host_network=True, initialize_plugins=True, update_source_code=False), 
runtime='server', file_store='memory', file_store_path='/tmp/file_store', 
workspace_base='/tmp/pytest-of-tobias/pytest-18/test_runtime0', 
workspace_mount_path='/tmp/pytest-of-tobias/pytest-18/test_runtime0', 
workspace_mount_path_in_sandbox='/workspace', workspace_mount_rewrite=None, cache_dir='/tmp/cache', 
run_as_devin=True, confirmation_mode=False, max_iterations=100, max_budget_per_task=None, e2b_api_key='******', 
use_host_network=True, ssh_hostname='localhost', disable_color=False, persist_sandbox=False, ssh_port=63710, 
ssh_password='******', jwt_secret='******', debug=False, enable_cli_session=False, file_uploads_max_file_size_mb=0, 
file_uploads_restrict_file_types=False, file_uploads_allowed_extensions=['.*']

23:24:38 - opendevin:INFO: runtime.py:81 - Runtime config: AppConfig(llms={'llm': LLMConfig(model='gpt-4o', 
api_key='******', base_url=None, api_version=None, embedding_model='local', embedding_base_url=None, 
embedding_deployment_name=None, aws_access_key_id='******', aws_secret_access_key='******', aws_region_name=None, 
num_retries=10, retry_multiplier=2, retry_min_wait=3, retry_max_wait=300, timeout=None, max_message_chars=10000, 
temperature=0, top_p=0.5, custom_llm_provider=None, max_input_tokens=None, max_output_tokens=None, 
input_cost_per_token=None, output_cost_per_token=None, ollama_base_url=None, drop_params=None, 
on_cancel_requested_fn=None)}, agents={'agent': AgentConfig(memory_enabled=False, memory_max_threads=2, 
llm_config=None)}, default_agent='CodeActAgent', sandbox=SandboxConfig(env={}, box_type='ssh', 
container_image='ghcr.io/opendevin/sandbox:main', user_id=1000, timeout=120, enable_auto_lint=False, 
use_host_network=True, initialize_plugins=True, update_source_code=False), runtime='server', file_store='memory', 
file_store_path='/tmp/file_store', workspace_base='/tmp/pytest-of-tobias/pytest-18/test_runtime0', 
workspace_mount_path='/tmp/pytest-of-tobias/pytest-18/test_runtime0', workspace_mount_path_in_sandbox='/workspace', 
workspace_mount_rewrite=None, cache_dir='/tmp/cache', run_as_devin=True, confirmation_mode=False, max_iterations=100, 
max_budget_per_task=None, e2b_api_key='******', use_host_network=True, ssh_hostname='localhost', disable_color=False, 
persist_sandbox=False, ssh_port=63710, ssh_password='******', jwt_secret='******', debug=False, enable_cli_session=False, 
file_uploads_max_file_size_mb=0, file_uploads_restrict_file_types=False, file_uploads_allowed_extensions=['.*']

23:24:38 - opendevin:INFO: runtime_build.py:199 - New image name: od_runtime:od_v0.8.3_image_ubuntu_tag_22.04
23:24:39 - opendevin:INFO: runtime_build.py:211 - Cannot pull image od_runtime:od_v0.8.3_image_ubuntu_tag_22.04 directly
23:24:39 - opendevin:INFO: runtime_build.py:216 - Image od_runtime:od_v0.8.3_image_ubuntu_tag_22.04 exists
23:24:39 - opendevin:INFO: runtime_build.py:223 - No image build done (not updating source code)
23:24:39 - opendevin:INFO: runtime.py:118 - Starting container with image: od_runtime:od_v0.8.3_image_ubuntu_tag_22.04 and name: opendevin-sandbox-test1af3039d-0101-4d30-a330-c7e38a9e896f
23:24:39 - opendevin:WARNING: runtime.py:131 - Using host network mode. If you are using MacOS, please make sure you have the latest version of Docker Desktop and enabled host network feature: https://docs.docker.com/network/drivers/host/#docker-desktop
23:24:39 - opendevin:INFO: runtime.py:142 - Mount dir: /workspace
23:24:39 - opendevin:INFO: runtime.py:149 - run_as_devin: `True`
23:24:39 - opendevin:INFO: runtime.py:169 - Container started. Server url: http://localhost:51079
23:24:39 - opendevin:INFO: runtime.py:92 - Container initialized with plugins: ['agent_skills', 'jupyter']
23:24:39 - opendevin:INFO: runtime.py:95 - Container initialized with env vars: None
23:24:40 - ACTION
**IPythonRunCellAction**
CODE:
import os; print(os.environ['USER'])
23:24:40 - opendevin:INFO: runtime.py:285 - Awaiting session
23:24:41 - opendevin:INFO: runtime.py:192 - Reconnecting session
23:24:45 - opendevin:INFO: runtime.py:192 - Reconnecting session
23:24:49 - opendevin:INFO: runtime.py:192 - Reconnecting session
23:24:57 - opendevin:INFO: runtime.py:192 - Reconnecting session
23:24:57 - opendevin:INFO: runtime.py:291 - Executing command
23:24:57 - OBSERVATION
**IPythonRunCellObservation**
opendevin

23:24:57 - ACTION
**IPythonRunCellAction**
CODE:
import os; print(os.getcwd())
23:24:57 - opendevin:INFO: runtime.py:285 - Awaiting session
23:24:58 - opendevin:INFO: runtime.py:192 - Reconnecting session
23:24:58 - opendevin:INFO: runtime.py:291 - Executing command
23:24:59 - OBSERVATION
**IPythonRunCellObservation**
/workspace

23:24:59 - ACTION
**IPythonRunCellAction**
CODE:
with open('test.txt', 'w') as f: f.write('Hello, world!')
23:24:59 - opendevin:INFO: runtime.py:285 - Awaiting session
23:25:00 - opendevin:INFO: runtime.py:192 - Reconnecting session
23:25:00 - opendevin:INFO: runtime.py:291 - Executing command
23:25:00 - OBSERVATION
**IPythonRunCellObservation**
[Code executed successfully with no output]
23:25:00 - ACTION
**CmdRunAction (source=None)**
COMMAND:
ls -alh test.txt
23:25:00 - opendevin:INFO: runtime.py:285 - Awaiting session
23:25:01 - opendevin:INFO: runtime.py:192 - Reconnecting session
23:25:01 - opendevin:INFO: runtime.py:291 - Executing command
23:25:01 - OBSERVATION
**CmdOutputObservation (source=None, exit code=0)**
-rw-r--r-- 1 opendevin root 13 Aug  2 21:25 test.txt

opendevin@docker-desktop:/workspace $
23:25:01 - ACTION
**CmdRunAction (source=None)**
COMMAND:
rm -rf test
23:25:01 - opendevin:INFO: runtime.py:285 - Awaiting session
23:25:02 - opendevin:INFO: runtime.py:192 - Reconnecting session
23:25:02 - opendevin:INFO: runtime.py:291 - Executing command
23:25:02 - OBSERVATION
**CmdOutputObservation (source=None, exit code=0)**

opendevin@docker-desktop:/workspace $
PASSED
tests/unit/test_runtime.py::test_ipython_multi_user[EventStreamRuntime-False]
########################################################################
Running test: test_ipython_multi_user[EventStreamRuntime-False]
########################################################################
23:25:04 - opendevin:WARNING: test_runtime.py:138 - `ghcr.io/opendevin/sandbox:main` is not an od_runtime image. Will use `ubuntu:22.04` as the container image for testing.
23:25:04 - opendevin:INFO: test_runtime.py:141 - EventStreamRuntime config: AppConfig(llms={'llm': 
LLMConfig(model='gpt-4o', api_key='******', base_url=None, api_version=None, embedding_model='local', 
embedding_base_url=None, embedding_deployment_name=None, aws_access_key_id='******', 
aws_secret_access_key='******', aws_region_name=None, num_retries=10, retry_multiplier=2, retry_min_wait=3, 
retry_max_wait=300, timeout=None, max_message_chars=10000, temperature=0, top_p=0.5, custom_llm_provider=None, 
max_input_tokens=None, max_output_tokens=None, input_cost_per_token=None, output_cost_per_token=None, 
ollama_base_url=None, drop_params=None, on_cancel_requested_fn=None)}, agents={'agent': 
AgentConfig(memory_enabled=False, memory_max_threads=2, llm_config=None)}, default_agent='CodeActAgent', 
sandbox=SandboxConfig(env={}, box_type='ssh', container_image='ghcr.io/opendevin/sandbox:main', user_id=1000, 
timeout=120, enable_auto_lint=False, use_host_network=True, initialize_plugins=True, update_source_code=False), 
runtime='server', file_store='memory', file_store_path='/tmp/file_store', 
workspace_base='/tmp/pytest-of-tobias/pytest-18/test_runtime1', 
workspace_mount_path='/tmp/pytest-of-tobias/pytest-18/test_runtime1', 
workspace_mount_path_in_sandbox='/workspace', workspace_mount_rewrite=None, cache_dir='/tmp/cache', 
run_as_devin=False, confirmation_mode=False, max_iterations=100, max_budget_per_task=None, e2b_api_key='******', 
use_host_network=True, ssh_hostname='localhost', disable_color=False, persist_sandbox=False, ssh_port=63710, 
ssh_password='******', jwt_secret='******', debug=False, enable_cli_session=False, file_uploads_max_file_size_mb=0, 
file_uploads_restrict_file_types=False, file_uploads_allowed_extensions=['.*']

23:25:04 - opendevin:INFO: runtime_build.py:199 - New image name: od_runtime:od_v0.8.3_image_ubuntu_tag_22.04
23:25:05 - opendevin:INFO: runtime_build.py:211 - Cannot pull image od_runtime:od_v0.8.3_image_ubuntu_tag_22.04 directly
23:25:05 - opendevin:INFO: runtime_build.py:216 - Image od_runtime:od_v0.8.3_image_ubuntu_tag_22.04 exists
23:25:05 - opendevin:INFO: runtime_build.py:223 - No image build done (not updating source code)
23:25:05 - opendevin:INFO: runtime.py:118 - Starting container with image: od_runtime:od_v0.8.3_image_ubuntu_tag_22.04 and name: opendevin-sandbox-testbf54ba0e-9c69-4ef6-8e13-fc75e30c395c
23:25:05 - opendevin:WARNING: runtime.py:131 - Using host network mode. If you are using MacOS, please make sure you have the latest version of Docker Desktop and enabled host network feature: https://docs.docker.com/network/drivers/host/#docker-desktop
23:25:05 - opendevin:INFO: runtime.py:142 - Mount dir: /workspace
23:25:05 - opendevin:INFO: runtime.py:149 - run_as_devin: `True`
23:25:05 - opendevin:INFO: runtime.py:169 - Container started. Server url: http://localhost:38383
23:25:05 - opendevin:INFO: runtime.py:92 - Container initialized with plugins: ['agent_skills', 'jupyter']
23:25:05 - opendevin:INFO: runtime.py:95 - Container initialized with env vars: None
23:25:06 - ACTION
**IPythonRunCellAction**
CODE:
import os; print(os.environ['USER'])
23:25:06 - opendevin:INFO: runtime.py:285 - Awaiting session
23:25:07 - opendevin:INFO: runtime.py:192 - Reconnecting session
23:25:11 - opendevin:INFO: runtime.py:192 - Reconnecting session
23:25:15 - opendevin:INFO: runtime.py:192 - Reconnecting session
23:25:15 - opendevin:INFO: runtime.py:291 - Executing command
23:25:15 - OBSERVATION
**IPythonRunCellObservation**
opendevin

FAILED

Copy link
Collaborator

@xingyaoww xingyaoww left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise LGTM

@@ -49,6 +50,7 @@ def __init__(
plugins: list[PluginRequirement] | None = None,
container_image: str | None = None,
):
self.config = copy.deepcopy(config)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is actually done it is parent constructor super().__init__() if I'm not remembering wrong?

Copy link
Collaborator

@xingyaoww xingyaoww Aug 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh nvm i think i got it? Very good catch!!

EventStreamRuntime config is output for each test, but Runtime config: only for the first test, thus retaining original values and the 2nd test is still run internally with opendevin instead of root and failing.

I just don't get where does the second run get the config from...

@xingyaoww xingyaoww merged commit 1166b0e into All-Hands-AI:main Aug 3, 2024
2 checks passed
@tobitege tobitege deleted the client-runtime-init branch August 3, 2024 04:07
tobitege added a commit to tobitege/OpenHands that referenced this pull request Aug 3, 2024
xingyaoww pushed a commit that referenced this pull request Aug 4, 2024
* ServerRuntime: config copy in init

* revert #3233 but more logging

* get_box_classes: reset order back to previous version

* 3 logging commands switched to debug (were info)

* runtimes debug output of config on initialization

* removed unneeded logger message from _init_container
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend Related to backend
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants