feat (backend): Add support for MCP servers natively via CodeActAgent #7637

Merged (40 commits, Apr 10, 2025)

Conversation

ducphamle2
Contributor

@ducphamle2 ducphamle2 commented Apr 1, 2025

  • This change is worth documenting at https://docs.all-hands.dev/
  • [ ] Include this change in the Release Notes. If checked, you must provide an end-user-friendly description for your change below

End-user friendly description of the problem this fixes or functionality that this introduces.

Add support for MCP servers via CodeActAgent

TODOs:

  • Write more tests
  • Fix to pass tests
  • Add documentation

Give a summary of what the PR does, explaining any non-trivial design decisions.

This PR enables arbitrary MCP server usage (mainly focused on, and tested with, SSE MCP servers) by:

Providing MCP servers in config.toml like below:

[mcp-sse]
mcp_servers = ["http://localhost:8000/sse"]

[mcp-stdio]
commands = ["npx"]
args = [["-y", "@oraichain/orai-mcp"]]
  1. When an agent is initialized, it will first pick up the list of MCPs, and integrate them into its list of tools.
  2. When a user request is received, the LLM picks the appropriate tool and forwards it to the runtime via the /execute_action endpoint.
  3. The runtime server will connect to the matching MCP server and call it with the action.arguments.
  4. The MCP server's response will be turned into an MCPObservation instance and fed back into the LLM messages for the next step.
  • This design will allow arbitrary MCP servers to be integrated into OpenHands seamlessly without breaking the existing architecture. It will co-exist with the microagents, as the two features are different by design.

  • We can customize how we handle MCP Observations, especially for browser-related MCPs like playwright-mcp.

  • Some potential limitations (to be addressed in separate PRs):

  • Potential overlap with built-in tools. However, MCP servers can also have overlapping tool names, so this problem ultimately lies with the MCP server developers rather than with us.

  • Need more tests.

  • No MCP authentication yet. This can be added via a different PR.

  • Hasn't utilized the multi-agentic flow yet (different PR).
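The four-step request flow described above can be sketched end to end with plain functions standing in for the MCP servers. Everything below (the tool registries, the execute_action helper, the observation dict) is hypothetical and only illustrates the shape of the flow, not the real OpenHands classes:

```python
# Sketch of the agent -> runtime -> MCP flow. All names are illustrative.

# Step 1: at agent init, MCP tool schemas are merged into the tool list.
BUILTIN_TOOLS = {'execute_bash': lambda **kw: 'ran: ' + kw['command']}
MCP_TOOLS = {'search_token': lambda **kw: {'price': 1.23, 'symbol': kw['symbol']}}
ALL_TOOLS = {**BUILTIN_TOOLS, **MCP_TOOLS}

def execute_action(tool_name: str, arguments: dict) -> dict:
    """Steps 2-3: the runtime routes the LLM's chosen tool call to the
    matching MCP server (mocked here as a plain function) with action.arguments."""
    result = ALL_TOOLS[tool_name](**arguments)
    # Step 4: wrap the server response as an MCPObservation-like record
    # that is fed back into the LLM message history for the next step.
    return {'observation': 'mcp', 'tool': tool_name, 'content': result}

obs = execute_action('search_token', {'symbol': 'ORAI'})
print(obs['content'])  # {'price': 1.23, 'symbol': 'ORAI'}
```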


Link of any specific issues this addresses.

#5781
#7547

This is a different approach compared to this PR: #7620. I am more than happy to collaborate with @ryanhoangt to merge and re-use each other's code if possible.

@ducphamle2 ducphamle2 changed the title Add support for MCP servers via CodeActAgent Add support for MCP servers natively via CodeActAgent Apr 1, 2025
@ducphamle2 ducphamle2 force-pushed the feat/pr-mcp-upstream branch 2 times, most recently from 93ea7a2 to 82e2500 Compare April 2, 2025 05:48
@ducphamle2
Contributor Author

Updating to pass the workflow tests

@ducphamle2 ducphamle2 changed the title Add support for MCP servers natively via CodeActAgent feat: Add support for MCP servers natively via CodeActAgent Apr 2, 2025
@ducphamle2 ducphamle2 changed the title feat: Add support for MCP servers natively via CodeActAgent feat (backend): Add support for MCP servers natively via CodeActAgent Apr 2, 2025
Collaborator

@ryanhoangt ryanhoangt left a comment


Thanks a lot for the contribution! I took a look at the PR and left some questions; the approach does make sense to me. The main concern from my side is that a lot of new types are being created; it'd be great if we could find a way to simplify them.

Collaborator

@xingyaoww xingyaoww left a comment


This is awesome to see! Let me know if you need anything from my side to merge this PR!

Would love to use MCP directly to test sequential thinking: #7643

Contributor

@neubig neubig left a comment


Hello! First, thanks a bunch for the contribution!

Just to clarify, this requires us to have an additional method for customization for MCP servers, right? This contrasts with the approach in #7620, which allows us to re-use our current method for customization, namely micro-agents.

Personally I would prefer keeping the user experience simple and not introduce an additional method of customization. But if there are benefits of this approach over the approach in #7620, I'd be happy to discuss them.

@ducphamle2
Contributor Author

This is awesome to see! Let me know if you need anything from my side to merge this PR!

Would love to use MCP directly to test sequential thinking: #7643

@xingyaoww I'm not entirely familiar with the config flow from the frontend -> backend yet. If possible, users could choose their MCP servers via the frontend, like the LLM API key, while we keep the MCP config in the backend as a fallback. If you could help with that, it would be great!

@xingyaoww
Collaborator

@ducphamle2 sounds good! I think we can focus on getting the backend code that uses "config.toml" to work first, and then we can start adding it into the settings page.

I think this PR might be in good shape for merge once the comments from @ryanhoangt are addressed ❤️

@xingyaoww
Collaborator

Personally I would prefer keeping the user experience simple and not introduce an additional method of customization. But if there are benefits of this approach over the approach in #7620, I'd be happy to discuss them.

@neubig Based on my understanding, I think this PR lays down the foundation work for getting the MCP server configured and having those actions executed. I think it is still feasible for us to do something similar to #7620, where we put some MCP config into a microagent and pass that config along to the backend settings, so we are not actually losing anything. WDYT?

@ducphamle2
Contributor Author

Hello! First, thanks a bunch for the contribution!

Just to clarify, this requires us to have an additional method for customization for MCP servers, right? This contrasts with the approach in #7620, which allows us to re-use our current method for customization, namely micro-agents.

Personally I would prefer keeping the user experience simple and not introduce an additional method of customization. But if there are benefits of this approach over the approach in #7620, I'd be happy to discuss them.

More than happy to contribute! Yes, it is a different approach compared to using micro-agents.

I believe the main benefit of this approach is that the agents will decide when to use the MCP tools instead of explicitly stating how and when to trigger them via microagent knowledge.
We also don't need to have a mapping between microagents and MCP Servers.

We can also do what @xingyaoww said and add specific MCP servers for the microagent to use, which is a win-win imo

@xingyaoww
Collaborator

I asked OpenHands to summarize some offline discussion I had with @neubig - hope it provides some context :)

Hi @ducphamle2 and everyone,

Thanks for the great work on this PR! I wanted to provide some context from our Slack discussion about the MCP integration approaches.

We've been discussing two different approaches for MCP integration:

  1. @ryanhoangt's PR (Add support for MCP servers #7620) - Using microagents to call MCP tools via a custom meta-tool
  2. This PR (feat (backend): Add support for MCP servers natively via CodeActAgent #7637) - Exposing MCP tools directly to the LLM as native tools

After discussion, we've decided to move forward with this PR's approach as it:

  • Treats MCP tools similar to built-in tools, which is how other MCP clients like Claude Desktop or Cursor use MCP servers
  • Makes it easier to add new tools like sequential thinking without changing much code
  • Gives the LLM more direct access to tools, potentially improving its ability to use them effectively

@neubig raised a concern about introducing an additional method of customization. I think we can address this by:

  1. First getting the backend code for MCP working with minimal changes
  2. Later, we can integrate this with the microagent system by having microagent configs feed into the MCP config

For next steps, I suggest we:

  1. Clean up the code based on the review comments:

    • Simplify the types and classes (use MCP SDK types where possible)
    • Rename MCPAgent to avoid confusion with OpenHands agents
    • Remove unused methods and files
    • Add more robust error handling for function calls
    • Remove MCP config from config.template.toml to avoid confusion (as @neubig suggested)
  2. Keep the changes minimal for this initial integration

  3. After this PR is merged, we can work on:

    • Adding a frontend UI for configuring MCP servers
    • Integrating with the microagent system
    • Adding support for more MCP features like authentication

I'm excited to see this merged so we can start using tools like sequential thinking in OpenHands! Let me know if you need any help with the changes.

@ducphamle2 ducphamle2 force-pushed the feat/pr-mcp-upstream branch from 2a76ff9 to eda6369 Compare April 2, 2025 19:21
@ducphamle2
Contributor Author

ducphamle2 commented Apr 2, 2025


Thank you for the acknowledgement and all the comments! We'll resolve the issues asap (in the next couple of days probably) so we can merge this PR, will let you know if I need help from your end!

@ducphamle2 ducphamle2 force-pushed the feat/pr-mcp-upstream branch from f66cf4d to a4fe4fe Compare April 3, 2025 05:23
Collaborator

@xingyaoww xingyaoww left a comment


Awesome work! It is already SO much cleaner 🙏 Left a few nits / questions

@ducphamle2 ducphamle2 force-pushed the feat/pr-mcp-upstream branch 2 times, most recently from a472380 to f78727b Compare April 6, 2025 02:51
@bensheed

bensheed commented Apr 6, 2025

I am so excited for this feature!

@enyst
Collaborator

enyst commented Apr 6, 2025

It's an awesome feature and a lot of work going into it, thank you for doing this @ducphamle2 !

Is the PR ready for review again?

@ducphamle2
Contributor Author

@enyst no problem, glad I could help!

Yes, it is ready for another round. I'm eager to get this PR merged, as my team and I are also actively implementing custom features using MCP on the repo.

Collaborator

@xingyaoww xingyaoww left a comment


Hey great work!

I took a look at this PR this afternoon and tried to:

  • revert some unnecessary changes
  • merge from main
  • refactor MCPStdioConfig so we can define new tools in a cleaner way in config.toml
  • break mcp.py into multiple files so hopefully it makes things cleaner

This new change would allow us to define tools like this in config.toml, which looks slightly cleaner:

[mcp-sse]
mcp_servers = ["http://localhost:8000/sse"]

[mcp-stdio.tools]
filesystem = { command = "npx", args = ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/dir"] }
brave-search = {command = "docker", args = ["run", "-i", "--rm", "-e", "BRAVE_API_KEY", "mcp/brave-search"], env = {"BRAVE_API_KEY" = "xxx"}}

I put those changes into this PR: oraichain#8, which will merge into your branch (this PR). Feel free to take a look and LMK if you have any concerns!


One primary concern I have with the existing implementation is that for stdio MCP tools, the backend assumes we have the environment set up for them, but often we don't (e.g., see the log below: the docker/npx commands failed, so tool initialization also failed).

17:57:28 - openhands:ERROR: client.py:119 - Error connecting to npx: [Errno 2] No such file or directory: 'npx'
Traceback (most recent call last):
  File "/Users/xingyaow/Projects/All-Hands-AI/OpenHands/openhands/mcp/client.py", line 107, in connect_stdio
    await asyncio.wait_for(connection_task, timeout=timeout)
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/oh/lib/python3.12/asyncio/tasks.py", line 520, in wait_for
    return await fut
           ^^^^^^^^^
  File "/Users/xingyaow/Projects/All-Hands-AI/OpenHands/openhands/mcp/client.py", line 129, in _connect_stdio_internal
    stdio_transport = await self.exit_stack.enter_async_context(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/oh/lib/python3.12/contextlib.py", line 659, in enter_async_context
    result = await _enter(cm)
             ^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/oh/lib/python3.12/contextlib.py", line 210, in __aenter__
    return await anext(self.gen)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xingyaow/Library/Caches/pypoetry/virtualenvs/openhands-ai-wqN7FI_O-py3.12/lib/python3.12/site-packages/mcp/client/stdio.py", line 100, in stdio_client
    process = await anyio.open_process(
              ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xingyaow/Library/Caches/pypoetry/virtualenvs/openhands-ai-wqN7FI_O-py3.12/lib/python3.12/site-packages/anyio/_core/_subprocesses.py", line 184, in open_process
    return await get_async_backend().open_process(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xingyaow/Library/Caches/pypoetry/virtualenvs/openhands-ai-wqN7FI_O-py3.12/lib/python3.12/site-packages/anyio/_backends/_asyncio.py", line 2552, in open_process
    process = await asyncio.create_subprocess_exec(
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/oh/lib/python3.12/asyncio/subprocess.py", line 224, in create_subprocess_exec
    transport, protocol = await loop.subprocess_exec(
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/oh/lib/python3.12/asyncio/base_events.py", line 1743, in subprocess_exec
    transport = await self._make_subprocess_transport(
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/oh/lib/python3.12/asyncio/unix_events.py", line 211, in _make_subprocess_transport
    transp = _UnixSubprocessTransport(self, protocol, args, shell,
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/oh/lib/python3.12/asyncio/base_subprocess.py", line 36, in __init__
    self._start(args=args, shell=shell, stdin=stdin, stdout=stdout,
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/oh/lib/python3.12/asyncio/unix_events.py", line 820, in _start
    self._proc = subprocess.Popen(
                 ^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/oh/lib/python3.12/subprocess.py", line 1026, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/oh/lib/python3.12/subprocess.py", line 1955, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'npx'
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/Users/xingyaow/Projects/All-Hands-AI/OpenHands/openhands/core/cli.py", line 208, in <module>
    loop.run_until_complete(main(loop))
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/oh/lib/python3.12/asyncio/base_events.py", line 674, in run_until_complete
    self.run_forever()
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/oh/lib/python3.12/asyncio/base_events.py", line 641, in run_forever
    self._run_once()
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/oh/lib/python3.12/asyncio/base_events.py", line 1986, in _run_once
    handle._run()
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/oh/lib/python3.12/asyncio/events.py", line 88, in _run
    self._context.run(self._callback, *self._args)
  File "/Users/xingyaow/Projects/All-Hands-AI/OpenHands/openhands/core/cli.py", line 113, in main
    mcp_tools = await fetch_mcp_tools_from_config(config.mcp)
  File "/Users/xingyaow/Projects/All-Hands-AI/OpenHands/openhands/mcp/utils.py", line 109, in fetch_mcp_tools_from_config
    mcp_clients = await create_mcp_clients(
  File "/Users/xingyaow/Projects/All-Hands-AI/OpenHands/openhands/mcp/utils.py", line 73, in create_mcp_clients
    await client.connect_stdio(
  File "/Users/xingyaow/Projects/All-Hands-AI/OpenHands/openhands/mcp/client.py", line 119, in connect_stdio
    logger.error(f'Error connecting to {command}: {str(e)}')
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/oh/lib/python3.12/logging/__init__.py", line 1568, in error
    self._log(ERROR, msg, args, **kwargs)
  File "/opt/homebrew/Caskroom/mambaforge/base/envs/oh/lib/python3.12/logging/__init__.py", line 1684, in _log
    self.handle(record)

17:57:28 - openhands:INFO: utils.py:79 - Connected to MCP server via stdio with command npx
17:57:28 - openhands:INFO: utils.py:68 - Initializing MCP tool [brave-search] for [docker] with stdio connection...
17:57:28 - openhands:ERROR: client.py:119 - Error connecting to docker: [Errno 2] No such file or directory: 'docker'
Traceback (most recent call last):
  ...
FileNotFoundError: [Errno 2] No such file or directory: 'docker'
(stack identical to the npx failure above)

I'm starting to think we could simplify the implementation a lot if we:

  • stop supporting stdio-based servers
  • use tools like https://github.com/supercorp-ai/supergateway to convert any stdio server to an SSE server
    But even then, I imagine we'll need to wait until the server is fully initialized before we can get the tool list, right?
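One cheap mitigation for the failures in the log above, whether or not stdio support is kept, is to check that the configured command exists before spawning a subprocess. A sketch using `shutil.which`; the function name and error message are made up for illustration:

```python
# Sketch: fail fast on a missing stdio command instead of surfacing a deep
# FileNotFoundError from inside anyio's subprocess machinery.
import shutil

def check_stdio_command(command: str) -> None:
    """Raise a clear, actionable error before any MCP client connect is attempted."""
    if shutil.which(command) is None:
        raise RuntimeError(
            f"MCP stdio command '{command}' not found on PATH; "
            f"install it or expose the server over SSE instead."
        )

# Example: a missing binary is reported up front, before any client connect.
try:
    check_stdio_command('npx')
except RuntimeError as err:
    print(err)
```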

@xingyaoww
Copy link
Collaborator

xingyaoww commented Apr 6, 2025

I think the main pain point is coming from: We need to know what the existing tools are BEFORE we initialize the runtime.

And if we want to support "mcp-stdio", it means we need to initialize the client first, which probably means we need to have stuff like "docker"/"npx" installed locally for the backend to use.

But lots of users are using OpenHands via docker run openhands - this means we would need to bundle "docker"/"npx" inside every OpenHands app docker image we build, which doesn't seem realistic to me. Even if the user has npx/docker configured outside docker, they'll have to use the Development workflow if they want the backend to use those stdio servers directly. This seems to require a lot of documentation and makes things more complicated than they should be, IMO.

I just tried https://github.com/supercorp-ai/supergateway locally and it seems to me that it is not super hard to use.

Maybe we can:

  1. Only support SSE-based MCP
  2. Document that users can run stdio MCP servers via tools like https://github.com/supercorp-ai/supergateway
  3. This also allows us to simplify the config a lot
  4. It also means a better user experience: a user can just docker run OpenHands, point it at an SSE server endpoint on the host machine like http://host.docker.internal:8000/sse, and everything will work out of the box

We are just making the assumption that the MCP SSE server should be ONLINE BEFORE the agent starts, which I don't think is too bad?
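Under that assumption, the backend could verify reachability at startup with a plain TCP probe before fetching the tool list. A minimal sketch; the helper names are made up:

```python
# Sketch: check an SSE MCP endpoint is online before the agent starts.
import socket
from urllib.parse import urlparse

def sse_endpoint(url: str) -> tuple[str, int]:
    """Extract (host, port) from an SSE URL like http://localhost:8000/sse."""
    parsed = urlparse(url)
    port = parsed.port or (443 if parsed.scheme == 'https' else 80)
    return parsed.hostname, port

def is_online(url: str, timeout: float = 2.0) -> bool:
    """True if a TCP connection to the server can be opened within the timeout."""
    host, port = sse_endpoint(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

print(sse_endpoint('http://localhost:8000/sse'))  # ('localhost', 8000)
```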

@xingyaoww
Copy link
Collaborator

@ducphamle2 LMK what you think! I can also help with these changes by sending in PRs to your branch if that helps 🙏

@ducphamle2
Copy link
Contributor Author

ducphamle2 commented Apr 7, 2025

@xingyaoww I agree with removing the stdio server! Tbh, I have never been a fan of stdio-based MCPs. I always use SSE MCPs because they are plug-and-play, and the MCP client doesn't have to care about the MCP server's runtime environment. I included stdio because I thought you would prefer it, since Claude Desktop / Cursor support both standards.

The MCP team is cooking a new upgrade for SSE, Streamable HTTP, so people will mostly use that instead of stdio IMO.
Lemme take a look at your PR. Thank you so much for helping!

@ducphamle2 ducphamle2 force-pushed the feat/pr-mcp-upstream branch from 930388b to ca7bd14 Compare April 7, 2025 01:47
@ducphamle2 ducphamle2 force-pushed the feat/pr-mcp-upstream branch from 9d006ca to 385f197 Compare April 8, 2025 21:51
@AtAFork

AtAFork commented Apr 9, 2025

Quick thought/suggestion (although I don't know how difficult implementation of all this is):

A lot of the most popular MCP servers are stdio, and I think a lot of folks won't use OpenHands if they have to jump through another hoop to get them working that way. Is there a way to integrate the tool below? I've been using it, and it's a brilliant way to organize MCP servers. MetaMCP itself is an MCP server, so I think you would just need to handle the case where it is stdio.

https://github.com/metatool-ai/metatool-app

@xingyaoww
Collaborator

Hey @AtAFork, I do plan to add something like mcp-hub (https://github.com/ravitemer/mcp-hub) that handles both stdio and SSE and lives inside the runtime container -- this allows us to support both stdio and SSE while the backend only assumes support for SSE.
This will probably happen in a future PR -- stay tuned!

@xingyaoww
Collaborator

And thanks for recommending metamcp! https://github.com/metatool-ai/mcp-server-metamcp?tab=readme-ov-file#using-as-an-sse-server I did a quick look here, and it seems they also support some sort of "SSE server" - but that depends on the MetaMCP "app", which is probably overkill for the OpenHands use case for now.

But with MCP+SSE implemented in this PR, users can just start their own MetaMCP server with SSE and then directly plug in that SSE link -- so there's actually no need to support stdio for MetaMCP.

@ducphamle2
Contributor Author

Wow, thanks for sharing metamcp and mcp-hub! I have been waiting for the official MCP registry from the MCP team, but it is taking them a while.

Collaborator

@xingyaoww xingyaoww left a comment


These comments should be non-blocking! Otherwise this code LGTM!

I can approve this PR after I get this to work locally

@xingyaoww
Collaborator

@ducphamle2 Ok this is good to go!!

Just need this final PR into this branch, oraichain#17, which:

  • Merges from main
  • Removes a stale test
  • Refactors the client connection in action_execution_client

I just tested locally, and things seem to be working!

[mcp]
mcp_servers = ["http://localhost:8000/sse"]

@ducphamle2
Contributor Author

@xingyaoww merged!

Collaborator

@xingyaoww xingyaoww left a comment


Super excited for this - Thanks a lot for your contribution @ducphamle2!

I imagine there are a few improvements we could make going forward:

  • simplify the MCP config: we don't need a separate SSE config anymore since we only support SSE now
  • Improve the FE UI for MCP action/execution, as well as showing the current list of tools the agent has access to
  • Support STDIO by adding a default MCP server inside runtime

@ducphamle2
Contributor Author

Thanks @xingyaoww, glad I could help!

On our main branch, we have improved the MCP UI significantly. If I have time, I will create another PR for that. If not, I trust that the OpenHands community can easily improve the UI!

@xingyaoww xingyaoww enabled auto-merge (squash) April 9, 2025 18:23
@xingyaoww xingyaoww disabled auto-merge April 9, 2025 19:52
@xingyaoww xingyaoww enabled auto-merge (squash) April 9, 2025 19:52
@xingyaoww xingyaoww merged commit 35d49f6 into All-Hands-AI:main Apr 10, 2025
15 of 18 checks passed
@ryanhoangt ryanhoangt mentioned this pull request Apr 10, 2025
shabbir-shakudo pushed a commit to devsentient/OpenHands that referenced this pull request Jul 15, 2025