
Enable CodeAct agents with browsing, and also enable arbitrary BrowserGym action support #1807


Merged
merged 8 commits into from
May 15, 2024

frankxu2004 commented May 15, 2024

Example run log:

$ poetry run python ./opendevin/core/main.py -i 50 -t "Browse https://www.google.com and show me the contents. Do not ask me for confirmation at any point." -c CodeActAgent
Running agent CodeActAgent (model: gpt-3.5-turbo-1106), with task: "Browse https://www.google.com and show me the contents. Do not ask me for confirmation at any point."
06:53:59 - opendevin:INFO: llm.py:115 - Initializing LLM with model: gpt-3.5-turbo-1106
06:53:59 - opendevin:INFO: ssh_box.py:141 - SSHBox is running as opendevin user with USER_ID=1020 in the sandbox
06:53:59 - opendevin:INFO: ssh_box.py:554 - Container stopped
06:53:59 - opendevin:WARNING: ssh_box.py:566 - Using port forwarding for Mac OS. Server started by OpenDevin will not be accessible from the host machine at the moment. See https://github.com/OpenDevin/OpenDevin/issues/897 for more information.
06:53:59 - opendevin:INFO: ssh_box.py:539 - Mounting workspace directory: /projects/ogma3/fangzhex/OpenDevin
06:53:59 - opendevin:INFO: ssh_box.py:575 - Mounting volumes: {'/projects/ogma3/fangzhex/OpenDevin': {'bind': '/workspace', 'mode': 'rw'}, '/tmp/cache': {'bind': '/home/opendevin/.cache', 'mode': 'rw'}}
06:53:59 - opendevin:INFO: ssh_box.py:539 - Mounting workspace directory: /projects/ogma3/fangzhex/OpenDevin
06:54:01 - opendevin:INFO: ssh_box.py:586 - Container started
06:54:02 - opendevin:INFO: ssh_box.py:602 - waiting for container to start: 1, container status: running
06:54:04 - opendevin:INFO: ssh_box.py:301 - Connecting to opendevin@localhost via ssh. If you encounter any issues, you can try `ssh -v -p 39025 opendevin@localhost` with the password '882094a8-9585-461a-b378-b568692afdf8' and report the issue on GitHub. If you started OpenDevin with `docker run`, you should try `ssh -v -p 39025 opendevin@localhost` with the password '882094a8-9585-461a-b378-b568692afdf8 on the host machine (where you started the container).
06:54:06 - opendevin:INFO: browser_env.py:38 - Starting browser env...
06:54:06 - opendevin:INFO: mixin.py:28 - Copied files from [/projects/ogma3/fangzhex/OpenDevin/opendevin/runtime/plugins/jupyter] to [/opendevin/plugins/jupyter] inside sandbox.
06:54:06 - opendevin:INFO: mixin.py:36 - Initializing plugin [jupyter] by executing [/opendevin/plugins/jupyter/setup.sh] in the sandbox.
06:54:16 - opendevin:INFO: browser_env.py:51 - Browser env started.
06:54:45 - opendevin:INFO: mixin.py:44 - Plugin jupyter initialized successfully.
06:54:45 - opendevin:INFO: mixin.py:28 - Copied files from [/projects/ogma3/fangzhex/OpenDevin/opendevin/runtime/plugins/swe_agent_commands] to [/opendevin/plugins/swe_agent_commands] inside sandbox.
06:54:45 - opendevin:INFO: mixin.py:36 - Initializing plugin [swe_agent_commands] by executing [/opendevin/plugins/swe_agent_commands/setup_default.sh] in the sandbox.
06:54:48 - opendevin:INFO: mixin.py:44 - Plugin swe_agent_commands initialized successfully.
06:54:48 - opendevin:INFO: mixin.py:52 - Sourced ~/.bashrc successfully
06:54:48 - opendevin:INFO: agent_controller.py:148 - Setting agent(CodeActAgent) state from AgentState.LOADING to AgentState.RUNNING
06:54:48 - opendevin:INFO: agent_controller.py:148 - Setting agent(CodeActAgent) state from AgentState.RUNNING to AgentState.RUNNING


==============
STEP 0

06:54:49 - opendevin:INFO: codeact_agent.py:281 - Cost: 0.00 USD | Accumulated Cost: 0.00 USD
06:54:49 - ACTION
BrowseInteractiveAction(browser_actions='goto("https://www.google.com")', thought='Sure! Let me browse https://www.google.com and show you the contents:', action='browse_interactive')


==============
STEP 1

06:54:57 - opendevin:INFO: codeact_agent.py:281 - Cost: 0.00 USD | Accumulated Cost: 0.01 USD
06:54:57 - ACTION
MessageAction(content='The contents of https://www.google.com include various options such as Gmail, Images, Sign in, and different "I\'m Feeling" options. It also includes links to "Privacy," "Terms," "Settings," "Search settings," "Advanced search," "Your data in Search," "Search history," "Search help," and "Send feedback." Additionally, there are options for "Dark theme" and "Google apps." If you have any further requests or questions, feel free to ask!', wait_for_response=True, action='message')
06:54:57 - opendevin:INFO: agent_controller.py:148 - Setting agent(CodeActAgent) state from AgentState.RUNNING to AgentState.AWAITING_USER_INPUT
Request user input >> 
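
The ACTION line in STEP 0 shows the fields the new action carries. A minimal sketch of a dataclass that reproduces that repr, assuming a stand-in `Action` base class (the project's real base class is not shown in this log) and assuming the default values used below:

```python
from dataclasses import dataclass


@dataclass
class Action:
    """Hypothetical stand-in for the project's real Action base class."""


@dataclass
class BrowseInteractiveAction(Action):
    # Field names mirror the repr printed in the run log above.
    # The default values are assumptions made for this sketch.
    browser_actions: str
    thought: str = ''
    action: str = 'browse_interactive'


step0 = BrowseInteractiveAction(
    browser_actions='goto("https://www.google.com")',
    thought='Sure! Let me browse https://www.google.com and show you the contents:',
)
print(step0)
```

Printing `step0` should give the same shape as the logged ACTION line, with `browser_actions` holding the call string that the browser environment is expected to run.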


@dataclass
class BrowseInteractiveAction(Action):
    browser_actions: str
Contributor

Hi, I only see 'goto' used in browser_actions in this PR. Are there any other action types? If 'goto' is the only one, then BrowseInteractiveAction is effectively the same as BrowseURLAction.
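
To make the overlap concrete: with only goto available, a plain URL maps one-to-one onto a browser_actions string. A small sketch of that mapping, where `url_to_goto` is a hypothetical helper and not part of this PR:

```python
import json


def url_to_goto(url: str) -> str:
    """Build a goto(...) action string for a URL (hypothetical helper)."""
    # json.dumps yields a double-quoted, escaped literal, which matches the
    # quoting seen in the run log for ordinary URLs.
    return f'goto({json.dumps(url)})'


print(url_to_goto('https://www.google.com'))
# → goto("https://www.google.com")
```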

Contributor

We're planning on adding more actions later, but want to do it a little bit at a time as we validate that this doesn't hurt accuracy.

Collaborator Author

For now, this is true: the prompts only show the model how to use goto(). However, many other actions can be supported: https://github.com/ServiceNow/BrowserGym/blob/main/core/src/browsergym/core/action/highlevel.py

This PR lays the groundwork for supporting them in the future.
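
As a rough illustration of how a free-form browser_actions string could be checked before more actions are exposed, here is a sketch that parses Python-style calls, one per line, and validates them against an allowlist. Everything in it is an assumption for illustration: `parse_browser_actions` and `ALLOWED_ACTIONS` are hypothetical names, and `click` and `fill` are placeholders for future actions (the linked BrowserGym file is the authority on what is actually available).

```python
import ast

# Assumption: an illustrative allowlist. goto() is the only action this PR
# shows to the model; the other names are placeholders for future actions.
ALLOWED_ACTIONS = {'goto', 'click', 'fill'}


def parse_browser_actions(browser_actions: str) -> list[tuple[str, list]]:
    """Parse call strings (one per line) into (name, positional_args) pairs."""
    parsed = []
    for stmt in ast.parse(browser_actions).body:
        call = stmt.value if isinstance(stmt, ast.Expr) else None
        if not (isinstance(call, ast.Call) and isinstance(call.func, ast.Name)):
            raise ValueError('each action must be a plain function call')
        if call.func.id not in ALLOWED_ACTIONS:
            raise ValueError(f'unsupported action: {call.func.id}')
        if call.keywords:
            raise ValueError('keyword arguments are not handled in this sketch')
        # literal_eval only accepts literals, so no model-written code runs here.
        args = [ast.literal_eval(arg) for arg in call.args]
        parsed.append((call.func.id, args))
    return parsed


print(parse_browser_actions('goto("https://www.google.com")'))
# → [('goto', ['https://www.google.com'])]
```

Parsing with `ast` instead of calling `eval` keeps the check side-effect free, which fits the goal of adding actions gradually while validating each one.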

Contributor

OK, got it.

Contributor

@neubig left a comment

LGTM, thanks for this!

@frankxu2004
Collaborator Author

#1469 solved


3 participants