Migrate multi-line-bash-related sandbox tests into runtime tests and fix multi-line issue #3128

Merged
merged 44 commits into from
Jul 27, 2024
Commits
7085512
Remove global config from memory
neubig Jul 24, 2024
1eba89b
Remove runtime global config
neubig Jul 24, 2024
fc9ff0c
Remove from storage
neubig Jul 24, 2024
ef3acfd
Remove global config
neubig Jul 24, 2024
26bf203
Fix event stream tests
neubig Jul 24, 2024
4a21346
Fix sandbox issue
neubig Jul 25, 2024
ac20826
Change config
neubig Jul 25, 2024
dba4790
Removed transferred tests
neubig Jul 25, 2024
444653e
Add swe env box
neubig Jul 26, 2024
048637d
Merge branch 'main' of https://github.com/OpenDevin/OpenDevin into ne…
neubig Jul 26, 2024
2d938f4
Fixes on testing
neubig Jul 26, 2024
2fc9e46
Fixed some tests
neubig Jul 26, 2024
d2f1e62
Merge with stashed changes
xingyaoww Jul 26, 2024
25510b2
Fix typing
neubig Jul 26, 2024
c15b4ad
Merge commit '2fc9e46aa2a6dd4068b637a00091a7f0cd713f8d' into xw/migra…
xingyaoww Jul 26, 2024
001db22
Fix ipython test
neubig Jul 26, 2024
5130598
Revive function
neubig Jul 26, 2024
3fbd4b2
Make temp_dir fixture
neubig Jul 26, 2024
532dc31
Merge commit '3fbd4b2e95b7b97d816ecfe15eccf2756bbec061' into xw/migra…
xingyaoww Jul 26, 2024
f3dd4c9
Remove test to avoid circular import
neubig Jul 26, 2024
5bfb9fb
Merge commit '1f6e86c932e16914b94a673f22fc791cd8d4d58c' into xw/migra…
xingyaoww Jul 26, 2024
00a265a
Merge branch 'main' of https://github.com/OpenDevin/OpenDevin into ne…
neubig Jul 26, 2024
0472e21
Merge commit '00a265a04295387ec356cdfba6058f294840848b' into xw/migra…
xingyaoww Jul 26, 2024
1e4b5e1
Merge commit '6c4cce01a7e825536a94009f09c7f54f0fa15cfc' into xw/migra…
xingyaoww Jul 26, 2024
af9c67f
fix eventstream filestore for test_runtime
xingyaoww Jul 26, 2024
006e64e
fix parse arg issue that cause integration test to fail
xingyaoww Jul 26, 2024
db81b08
Merge branch 'main' into neubig/remove_remaining_global_config
xingyaoww Jul 26, 2024
0aa6f93
support swebench pull from custom namespace
xingyaoww Jul 26, 2024
5e96ffc
Merge commit '0aa6f931af019fed827babdf402cccc7ddf10f09' into xw/migra…
xingyaoww Jul 26, 2024
ddb1cc5
Merge commit 'db81b08aceb6f2e08dbf0176bd9efc75e0133ed1' into xw/migra…
xingyaoww Jul 26, 2024
403eb17
Merge commit '3301beffecb25548c4c4cf72738b1bbb145b98a3' into xw/migra…
xingyaoww Jul 26, 2024
be4696f
add back simple tests for runtime
xingyaoww Jul 26, 2024
1cb22e0
move multi-line bash tests to test_runtime;
xingyaoww Jul 26, 2024
5a04b14
add testcase to handle PS2 prompt
xingyaoww Jul 26, 2024
e69c041
use bashlex for bash parsing to handle multi-line commands;
xingyaoww Jul 27, 2024
94cfe89
Merge commit 'f07280153aafe3531edef3d63cbe4d108ca9933a' into xw/migra…
xingyaoww Jul 27, 2024
01773d1
revert ghcr runtime change
xingyaoww Jul 27, 2024
4054fd7
fix url test
xingyaoww Jul 27, 2024
fe2e216
split_bash_commands rewrite; added test_bash_parsing.py unit test
tobitege Jul 27, 2024
235b87f
now including the test_bash_parsing.py file...
tobitege Jul 27, 2024
cf11399
fix wrong import in unit test
tobitege Jul 27, 2024
c2dd6e0
Merge commit 'cf1139950dd27d19a20420c4005fe84ef1af5426' into xw/migra…
xingyaoww Jul 27, 2024
13e70eb
move split commands test to `test_bash_parsing`
xingyaoww Jul 27, 2024
7fddfad
fix jupyter here doc for server runtime;
xingyaoww Jul 27, 2024
2 changes: 1 addition & 1 deletion opendevin/core/logger.py
@@ -9,7 +9,7 @@
 from termcolor import colored

 DISABLE_COLOR_PRINTING = False
-DEBUG = False
+DEBUG = os.getenv('DEBUG', 'False').lower() in ['true', '1', 'yes']

 ColorType = Literal[
     'red',
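The new `DEBUG` line treats only a few spellings as true. A minimal sketch of the same check as a reusable helper follows; `env_flag` and the `EXAMPLE_DEBUG` variable are hypothetical names used for illustration and are not part of this PR.

```python
import os

# Spellings treated as "true", matching the list used in the diff above.
TRUTHY = ('true', '1', 'yes')


def env_flag(name: str, default: str = 'False') -> bool:
    """Return True when the environment variable holds a truthy spelling."""
    return os.getenv(name, default).lower() in TRUTHY


# EXAMPLE_DEBUG is a made-up variable name used only for this demonstration.
os.environ['EXAMPLE_DEBUG'] = 'YES'
enabled = env_flag('EXAMPLE_DEBUG')       # case-insensitive match on 'yes'

os.environ['EXAMPLE_DEBUG'] = 'on'
unrecognized = env_flag('EXAMPLE_DEBUG')  # 'on' is not in the accepted list
```

Values outside the accepted list, such as `on`, fall through to `False` without any warning, which may be worth keeping in mind when setting `DEBUG`.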
30 changes: 26 additions & 4 deletions opendevin/runtime/client/client.py
@@ -46,6 +46,7 @@
     Plugin,
 )
 from opendevin.runtime.server.files import insert_lines, read_lines
+from opendevin.runtime.utils import split_bash_commands

 app = FastAPI()

@@ -79,14 +80,23 @@ def _init_bash_shell(self, work_dir: str) -> None:
             r'\[PEXPECT_BEGIN\] ([a-z0-9_-]*)@([a-zA-Z0-9.-]*):(.+) \[PEXPECT_END\]'
         )

-        self.shell.sendline(f'export PS1="{self.__bash_PS1}"')
+        self.shell.sendline(f'export PS1="{self.__bash_PS1}"; export PS2=""')
         self.shell.expect(self.__bash_expect_regex)

         self.shell.sendline(f'cd {work_dir}')
         self.shell.expect(self.__bash_expect_regex)

     def _get_bash_prompt(self):
         ps1 = self.shell.after
+
+        # begin at the last occurrence of '[PEXPECT_BEGIN]'.
+        # In multi-line bash commands, the prompt will be repeated
+        # and the matched regex captures all of them
+        # - we only want the last one (newest prompt)
+        _begin_pos = ps1.rfind('[PEXPECT_BEGIN]')
+        if _begin_pos != -1:
+            ps1 = ps1[_begin_pos:]
+
         # parse the ps1 to get username, hostname, and working directory
         matched = re.match(self.__bash_expect_regex, ps1)
         assert (
@@ -102,7 +112,7 @@ def _get_bash_prompt(self):
             prompt += '$'
         return prompt + ' '

-    def _execute_bash(self, command, keep_prompt: bool = True) -> tuple[str, int]:
+    def _execute_bash(self, command: str, keep_prompt: bool = True) -> tuple[str, int]:
         logger.debug(f'Executing command: {command}')
         self.shell.sendline(command)
         self.shell.expect(self.__bash_expect_regex)
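The change to `_get_bash_prompt` keeps only the newest prompt when the captured text contains several. A minimal standalone sketch of that idea follows, using the same regex as the diff. The `last_prompt` helper and the sample `captured` string are hypothetical; the exact text pexpect leaves in `shell.after` may differ.

```python
import re

# Same pattern as __bash_expect_regex in the diff above.
PROMPT_RE = re.compile(
    r'\[PEXPECT_BEGIN\] ([a-z0-9_-]*)@([a-zA-Z0-9.-]*):(.+) \[PEXPECT_END\]'
)


def last_prompt(captured: str) -> tuple[str, str, str]:
    """Return (username, hostname, working_dir) from the newest prompt."""
    # Drop everything before the last '[PEXPECT_BEGIN]' so that repeated
    # prompts from a multi-line command do not end up in the match.
    begin = captured.rfind('[PEXPECT_BEGIN]')
    if begin != -1:
        captured = captured[begin:]
    matched = PROMPT_RE.match(captured)
    if matched is None:
        raise ValueError(f'Failed to parse bash prompt: {captured!r}')
    return matched.group(1), matched.group(2), matched.group(3)


# Hypothetical capture in which the prompt was printed twice.
captured = (
    '[PEXPECT_BEGIN] root@box:/workspace [PEXPECT_END]'
    '[PEXPECT_BEGIN] root@box:/workspace/src [PEXPECT_END]'
)
user, host, cwd = last_prompt(captured)
```

Without the `rfind` step, the greedy `(.+)` group would capture everything from the first working directory through the second one, so the parsed directory would contain prompt markers.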
@@ -129,10 +139,22 @@ async def run_action(self, action) -> Observation:

     async def run(self, action: CmdRunAction) -> CmdOutputObservation:
         try:
-            output, exit_code = self._execute_bash(action.command)
+            commands = split_bash_commands(action.command)
+            all_output = ''
+            for command in commands:
+                output, exit_code = self._execute_bash(command)
+                if all_output:
+                    # previous output already ends with the prompt "user@hostname:working_dir #";
+                    # we need to add the command to the previous output,
+                    # so the model knows that what follows is the output of another command
+                    all_output = all_output.rstrip() + ' ' + command + '\r\n'
+
+                all_output += str(output) + '\r\n'
+                if exit_code != 0:
+                    break
             return CmdOutputObservation(
                 command_id=-1,
-                content=str(output),
+                content=all_output.rstrip('\r\n'),
                 command=action.command,
                 exit_code=exit_code,
             )
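The loop added to `run` executes the split commands one at a time, stitches their output into a single transcript, and stops at the first non-zero exit code. The sketch below reproduces that loop outside the class so it can be run without a shell. `run_all`, `fake_execute`, and the canned outputs are hypothetical stand-ins for `_execute_bash`, under the assumption that each output ends with the prompt.

```python
from typing import Callable


def run_all(
    commands: list[str], execute: Callable[[str], tuple[str, int]]
) -> tuple[str, int]:
    """Run commands in order, merging output and stopping on failure."""
    all_output = ''
    exit_code = 0
    for command in commands:
        output, exit_code = execute(command)
        if all_output:
            # The previous output ends with a prompt; echo the command after
            # it so the transcript reads like an interactive session.
            all_output = all_output.rstrip() + ' ' + command + '\r\n'
        all_output += str(output) + '\r\n'
        if exit_code != 0:
            break
    return all_output.rstrip('\r\n'), exit_code


# Canned results standing in for a real shell (made up for this example).
CANNED = {
    'echo a': ('a\r\nuser@host:/w $ ', 0),
    'false': ('user@host:/w $ ', 1),
    'echo b': ('b\r\nuser@host:/w $ ', 0),
}
executed: list[str] = []


def fake_execute(command: str) -> tuple[str, int]:
    executed.append(command)
    return CANNED[command]


transcript, code = run_all(['echo a', 'false', 'echo b'], fake_execute)
```

Here `echo b` is never executed because `false` returns exit code 1, and the reported exit code is that of the last command that ran.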
14 changes: 11 additions & 3 deletions opendevin/runtime/client/runtime.py
@@ -58,7 +58,7 @@ def __init__(
         # TODO: We can switch to aiodocker when `get_od_sandbox_image` is updated to use aiodocker
         self.docker_client: docker.DockerClient = self._init_docker_client()
         self.container_image = (
-            config.sandbox.container_image
+            self.config.sandbox.container_image
             if container_image is None
             else container_image
         )
@@ -103,7 +103,7 @@ def _init_docker_client() -> docker.DockerClient:
     async def _init_container(
         self,
         sandbox_workspace_dir: str,
-        mount_dir: str,
+        mount_dir: str | None = None,
         plugins: list[PluginRequirement] | None = None,
     ):
         try:
@@ -124,6 +124,14 @@ async def _init_container(
             else:
                 port_mapping = {f'{self._port}/tcp': self._port}

+            if mount_dir is not None:
+                volumes = {mount_dir: {'bind': sandbox_workspace_dir, 'mode': 'rw'}}
+            else:
+                logger.warn(
+                    'Mount dir is not set, will not mount the workspace directory to the container.'
+                )
+                volumes = None
+
             container = self.docker_client.containers.run(
                 self.container_image,
                 command=(
@@ -139,7 +147,7 @@ async def _init_container(
                 name=self.container_name,
                 detach=True,
                 environment={'DEBUG': 'true'} if self.config.debug else None,
-                volumes={mount_dir: {'bind': sandbox_workspace_dir, 'mode': 'rw'}},
+                volumes=volumes,
             )
             logger.info(f'Container started. Server url: {self.api_url}')
             return container
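With `mount_dir` now optional, the volume mapping is built only when a host directory is given. The sketch below isolates that decision in a hypothetical helper, `build_volumes`, which is not part of the PR. The dictionary shape mirrors the one passed to `containers.run` in the diff, and the paths are made up.

```python
from typing import Optional


def build_volumes(
    mount_dir: Optional[str], sandbox_workspace_dir: str
) -> Optional[dict[str, dict[str, str]]]:
    """Return a read-write bind mount mapping, or None when nothing is mounted."""
    if mount_dir is None:
        # No host directory: the container starts without a workspace mount.
        return None
    return {mount_dir: {'bind': sandbox_workspace_dir, 'mode': 'rw'}}


# Hypothetical paths for illustration.
mounted = build_volumes('/home/me/project', '/workspace')
unmounted = build_volumes(None, '/workspace')
```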
4 changes: 2 additions & 2 deletions opendevin/runtime/runtime.py
@@ -33,13 +33,13 @@
 from opendevin.storage import FileStore


-def _default_env_vars(config: SandboxConfig) -> dict[str, str]:
+def _default_env_vars(sandbox_config: SandboxConfig) -> dict[str, str]:
     ret = {}
     for key in os.environ:
         if key.startswith('SANDBOX_ENV_'):
             sandbox_key = key.removeprefix('SANDBOX_ENV_')
             ret[sandbox_key] = os.environ[key]
-    if config.enable_auto_lint:
+    if sandbox_config.enable_auto_lint:
         ret['ENABLE_AUTO_LINT'] = 'true'
     return ret

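`_default_env_vars` forwards every host variable prefixed with `SANDBOX_ENV_` into the sandbox with the prefix removed. The sketch below shows the same filtering on an explicit mapping rather than `os.environ`, so the result is easy to inspect. `sandbox_env_vars` and the sample variables are hypothetical, and `str.removeprefix` requires Python 3.9 or later.

```python
from collections.abc import Mapping

PREFIX = 'SANDBOX_ENV_'


def sandbox_env_vars(environ: Mapping[str, str], enable_auto_lint: bool) -> dict[str, str]:
    """Collect prefixed variables, dropping the prefix from each key."""
    ret: dict[str, str] = {}
    for key, value in environ.items():
        if key.startswith(PREFIX):
            ret[key.removeprefix(PREFIX)] = value
    if enable_auto_lint:
        ret['ENABLE_AUTO_LINT'] = 'true'
    return ret


# Made-up host environment for the demonstration.
host_env = {
    'SANDBOX_ENV_GITHUB_TOKEN': 'abc123',
    'PATH': '/usr/bin',
}
forwarded = sandbox_env_vars(host_env, enable_auto_lint=True)
```

Only the prefixed variable is forwarded; unrelated variables such as `PATH` stay on the host.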
2 changes: 1 addition & 1 deletion opendevin/runtime/server/runtime.py
@@ -115,7 +115,7 @@ async def run(self, action: CmdRunAction) -> Observation:

     async def run_ipython(self, action: IPythonRunCellAction) -> Observation:
         self._run_command(
-            ("cat > /tmp/opendevin_jupyter_temp.py <<'EOL'\n" f'{action.code}\n' 'EOL'),
+            f"cat > /tmp/opendevin_jupyter_temp.py <<'EOL'\n{action.code}\nEOL"
         )

         # run the code
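The heredoc command is now built as a single f-string, so the whole command is one string literal. The sketch below builds the same kind of command with a hypothetical helper, `heredoc_write_command`, and a made-up target path, and checks that the single f-string yields the same text as the earlier concatenated form.

```python
def heredoc_write_command(
    code: str, path: str = '/tmp/example_cell.py', delimiter: str = 'EOL'
) -> str:
    """Build a bash command that writes `code` to `path` through a heredoc."""
    # Quoting the delimiter ('EOL') stops bash from expanding $variables
    # or backticks inside the heredoc body.
    return f"cat > {path} <<'{delimiter}'\n{code}\n{delimiter}"


code = 'print("$HOME")\nprint(1 + 1)'
new_style = heredoc_write_command(code)

# The earlier form: adjacent string literals joined by implicit concatenation.
old_style = "cat > /tmp/example_cell.py <<'EOL'\n" f'{code}\n' 'EOL'
same = new_style == old_style
```

One caveat with any heredoc: if the code itself contains a line consisting solely of the delimiter, bash ends the heredoc early, so the delimiter should be a string unlikely to appear in the payload.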
128 changes: 45 additions & 83 deletions opendevin/runtime/utils/bash.py
@@ -1,87 +1,49 @@
-def split_bash_commands(commands):
-    # States
-    NORMAL = 0
-    IN_SINGLE_QUOTE = 1
-    IN_DOUBLE_QUOTE = 2
-    IN_HEREDOC = 3
-
-    state = NORMAL
-    heredoc_trigger = None
-    result = []
-    current_command: list[str] = []
-
-    i = 0
-    while i < len(commands):
-        char = commands[i]
-
-        if state == NORMAL:
-            if char == "'":
-                state = IN_SINGLE_QUOTE
-            elif char == '"':
-                state = IN_DOUBLE_QUOTE
-            elif char == '\\':
-                # Check if this is escaping a newline
-                if i + 1 < len(commands) and commands[i + 1] == '\n':
-                    i += 1  # Skip the newline
-                    # Continue with the next line as part of the same command
-                    i += 1  # Move to the first character of the next line
-                    continue
-            elif char == '\n':
-                if not heredoc_trigger and current_command:
-                    result.append(''.join(current_command).strip())
-                    current_command = []
-            elif char == '<' and commands[i : i + 2] == '<<':
-                # Detect heredoc
-                state = IN_HEREDOC
-                i += 2  # Skip '<<'
-                while commands[i] == ' ':
-                    i += 1
-                start = i
-                while commands[i] not in [' ', '\n']:
-                    i += 1
-                heredoc_trigger = commands[start:i]
-                current_command.append(commands[start - 2 : i])  # Include '<<'
-                continue  # Skip incrementing i at the end of the loop
-            current_command.append(char)
-
-        elif state == IN_SINGLE_QUOTE:
-            current_command.append(char)
-            if char == "'" and commands[i - 1] != '\\':
-                state = NORMAL
+import bashlex

-        elif state == IN_DOUBLE_QUOTE:
-            current_command.append(char)
-            if char == '"' and commands[i - 1] != '\\':
-                state = NORMAL
+from opendevin.core.logger import opendevin_logger as logger

-        elif state == IN_HEREDOC:
-            current_command.append(char)
-            if (
-                char == '\n'
-                and heredoc_trigger
-                and commands[i + 1 : i + 1 + len(heredoc_trigger) + 1]
-                == heredoc_trigger + '\n'
-            ):
-                # Check if the next line starts with the heredoc trigger followed by a newline
-                i += (
-                    len(heredoc_trigger) + 1
-                )  # Move past the heredoc trigger and newline
-                current_command.append(
-                    heredoc_trigger + '\n'
-                )  # Include the heredoc trigger and newline
-                result.append(''.join(current_command).strip())
-                current_command = []
-                heredoc_trigger = None
-                state = NORMAL
-                continue
-
-        i += 1
-
-    # Add the last command if any
-    if current_command:
-        result.append(''.join(current_command).strip())
-
-    # Remove any empty strings from the result
-    result = [cmd for cmd in result if cmd]

+def split_bash_commands(commands):
+    try:
+        parsed = bashlex.parse(commands)
+    except bashlex.errors.ParsingError as e:
+        logger.error(
+            f'Failed to parse bash commands\n[input]: {commands}\n[error]: {e}'
+        )
+        # If parsing fails, return the original commands
+        return [commands]
+
+    result: list[str] = []
+    last_end = 0
+
+    for node in parsed:
+        start, end = node.pos
+
+        # Include any text between the last command and this one
+        if start > last_end:
+            between = commands[last_end:start]
+            logger.debug(f'BASH PARSING between: {between}')
+            if result:
+                result[-1] += between.rstrip()
+            elif between.strip():
+                # THIS SHOULD NOT HAPPEN
+                result.append(between.rstrip())
+
+        # Extract the command, preserving original formatting
+        command = commands[start:end].rstrip()
+        logger.debug(f'BASH PARSING command: {command}')
+        result.append(command)
+
+        last_end = end
+
+    # Add any remaining text after the last command to the last command
+    remaining = commands[last_end:].rstrip()
+    logger.debug(f'BASH PARSING remaining: {remaining}')
+    if last_end < len(commands) and result:
+        result[-1] += remaining
+        logger.debug(f'BASH PARSING result[-1] += remaining: {result[-1]}')
+    elif last_end < len(commands):
+        if remaining:
+            result.append(remaining)
+            logger.debug(f'BASH PARSING result.append(remaining): {result[-1]}')
     return result
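The rewritten `split_bash_commands` slices the original text using the `(start, end)` offsets reported for each parsed node, and attaches anything the parser skipped (such as trailing comments) to the preceding command. The sketch below isolates that stitching step so it runs without `bashlex` installed. `stitch_commands` is a hypothetical helper, and the spans are written by hand to stand in for the `node.pos` values a parser would report for this input.

```python
def stitch_commands(text: str, spans: list[tuple[int, int]]) -> list[str]:
    """Slice `text` by spans, attaching skipped text to the previous command."""
    result: list[str] = []
    last_end = 0
    for start, end in spans:
        if start > last_end:
            between = text[last_end:start]
            if result:
                # Text the parser skipped belongs to the command before it.
                result[-1] += between.rstrip()
            elif between.strip():
                result.append(between.rstrip())
        # Keep the command exactly as written, minus trailing whitespace.
        result.append(text[start:end].rstrip())
        last_end = end
    remaining = text[last_end:].rstrip()
    if remaining:
        if result:
            result[-1] += remaining
        else:
            result.append(remaining)
    return result


script = 'ls -l\necho "hi"  # trailing comment\n'
# Hand-written spans (an assumption): 'ls -l' is text[0:5] and
# 'echo "hi"' is text[6:15]; the comment lies outside both spans.
spans = [(0, 5), (6, 15)]
commands = stitch_commands(script, spans)
```

Slicing the original string instead of re-serializing the parse tree preserves quoting and spacing exactly as the user typed them, which appears to be the motivation for working with offsets.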
15 changes: 13 additions & 2 deletions poetry.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 3 additions & 0 deletions pyproject.toml
@@ -39,6 +39,7 @@ pathspec = "^0.12.1"
 google-cloud-aiplatform = "*"
 grep-ast = "0.3.2"
 tree-sitter = "0.21.3"
+bashlex = "^0.18"

 [tool.poetry.group.llama-index.dependencies]
 llama-index = "*"
@@ -72,6 +73,7 @@ reportlab = "*"
 [tool.coverage.run]
 concurrency = ["gevent"]

+
 [tool.poetry.group.runtime.dependencies]
 jupyterlab = "*"
 notebook = "*"
@@ -105,6 +107,7 @@ ignore = ["D1"]
 [tool.ruff.lint.pydocstyle]
 convention = "google"

+
 [tool.poetry.group.evaluation.dependencies]
 streamlit = "*"
 whatthepatch = "*"