[RFC] Remove the opentitansession proxy #20726

Closed
jwnrt opened this issue Dec 22, 2023 · 2 comments

@jwnrt
Contributor

jwnrt commented Dec 22, 2023

Description

This RFC proposes removing the opentitansession proxy and support for the proxy as a transport from opentitanlib.

This proxy allows a local opentitantool to connect to another opentitantool session that's already running on a remote machine. The remote opentitantool connects to the proxy session as it would with other kinds of "transport" (FPGAs over USB, verilator over sockets, etc). This is useful when resources like FPGAs are connected to some remote machine and not a developer's local machine.

Issues with the proxy

This proxy is useful in theory, but in practice I don't think it should be implemented at the application level for these reasons:

  1. The proxy code adds significant complexity to opentitanlib.
    • All code for interacting with an OpenTitan platform must also add support for the proxy.
    • Many types, errors, and control flows must be made serializable and deserializable so they can be sent over a network.
  2. The proxy cannot be used with Bazel.
    • Almost all uses of opentitantool will be Bazel executing tests.
    • The remote side of the proxy must already be running before a client connects; however, Bazel needs to start opentitantool with settings specific to the test being run.
    • Some commands such as uploading bitstreams are unsupported.
    • The --interface flag cannot be set to the proxy in OpenTitan's configuration file, as Bazel will overwrite the flag with cw310, hyper310, etc. depending on the test.
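
To make the first point concrete, here is a minimal sketch of what proxying a single trait method tends to involve. All names (`Uart`, `Request`, `Response`, `ProxyUart`, `FakeUart`) and the wire format are hypothetical and are not taken from opentitanlib; the real proxy protocol may look quite different. The point is only that one method needs a request message, a response message that can carry the error, encoding and decoding for both, a session-side handler, and a client-side forwarding implementation.

```rust
// Hypothetical sketch: none of these names are taken from opentitanlib.

/// Stand-in for an IO trait that tools are written against.
pub trait Uart {
    fn write(&mut self, data: &[u8]) -> Result<usize, String>;
}

/// Each proxied method needs a request message...
#[derive(Debug, PartialEq)]
pub enum Request {
    UartWrite(Vec<u8>),
}

/// ...and a response message that can also carry the error.
#[derive(Debug, PartialEq)]
pub enum Response {
    UartWrite(Result<usize, String>),
}

/// Made-up wire format: one tag byte followed by the payload.
fn frame(tag: u8, payload: &[u8]) -> Vec<u8> {
    let mut out = vec![tag];
    out.extend_from_slice(payload);
    out
}

pub fn encode_request(req: &Request) -> Vec<u8> {
    match req {
        Request::UartWrite(data) => frame(0, data),
    }
}

pub fn decode_request(bytes: &[u8]) -> Result<Request, String> {
    match bytes.split_first() {
        Some((&0, payload)) => Ok(Request::UartWrite(payload.to_vec())),
        Some((tag, _)) => Err(format!("unknown request tag {}", tag)),
        None => Err("empty request".to_string()),
    }
}

pub fn encode_response(resp: &Response) -> Vec<u8> {
    match resp {
        Response::UartWrite(Ok(n)) => frame(0, &(*n as u64).to_le_bytes()),
        Response::UartWrite(Err(msg)) => frame(1, msg.as_bytes()),
    }
}

pub fn decode_response(bytes: &[u8]) -> Result<Response, String> {
    match bytes.split_first() {
        Some((&0, payload)) if payload.len() == 8 => {
            let mut raw = [0u8; 8];
            raw.copy_from_slice(payload);
            Ok(Response::UartWrite(Ok(u64::from_le_bytes(raw) as usize)))
        }
        Some((&1, payload)) => {
            let msg = String::from_utf8_lossy(payload).into_owned();
            Ok(Response::UartWrite(Err(msg)))
        }
        _ => Err("malformed response".to_string()),
    }
}

/// Session side: decode, call the real device, encode the result.
pub fn handle(device: &mut dyn Uart, wire: &[u8]) -> Vec<u8> {
    let resp = match decode_request(wire) {
        Ok(Request::UartWrite(data)) => Response::UartWrite(device.write(&data)),
        Err(e) => Response::UartWrite(Err(e)),
    };
    encode_response(&resp)
}

/// Client side: implements the same trait by forwarding over `link`,
/// which stands in for the network connection.
pub struct ProxyUart<F: FnMut(&[u8]) -> Vec<u8>> {
    pub link: F,
}

impl<F: FnMut(&[u8]) -> Vec<u8>> Uart for ProxyUart<F> {
    fn write(&mut self, data: &[u8]) -> Result<usize, String> {
        let wire = encode_request(&Request::UartWrite(data.to_vec()));
        let reply = (self.link)(wire.as_slice());
        match decode_response(&reply)? {
            Response::UartWrite(result) => result,
        }
    }
}

/// In-memory device standing in for real hardware.
#[derive(Default)]
pub struct FakeUart {
    pub sent: Vec<u8>,
}

impl Uart for FakeUart {
    fn write(&mut self, data: &[u8]) -> Result<usize, String> {
        self.sent.extend_from_slice(data);
        Ok(data.len())
    }
}
```

In this sketch the "network" is just a closure that calls `handle` directly, so a `ProxyUart` can be exercised against a `FakeUart` in a single process. Adding a second trait method would mean touching every one of these pieces again.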

Alternatives

I think a more useful proxy would use existing technologies to run at either a higher or lower level than opentitansession:

  1. FPGA devices can be proxied to a local machine using the usbip program distributed with linux-tools.
  2. Bazel supports a remote execution protocol to allow tests to run on other machines.
    • Note that this is not trivial to set up.
    • We have had success using a remote Bazel cache, but not remote execution.
    • The issues we have had with Bazel's remote execution (and the Bazel buildfarm server) are not easy to debug.
    • Here is prior work to enable remote FPGA execution: Remote fpgas #16949

Questions

  • Does anybody use the opentitansession proxy and find it useful?
  • Would you like proxying to continue to be supported?
  • Do you have a reason for needing the proxy to be supported through opentitansession?
@jesultra
Contributor

jesultra commented Jan 3, 2024

The Google Ti50 team uses opentitansession heavily for the automated testing of our TPM firmware running either on OpenTitan emulated on CW310, on legacy Titan chips on development shields mating on top of HyperDebug, or running as a Linux sub-process of opentitansession, the last one through the special --interface=host_emulation.

We do not use Bazel in our test automation, but have a precompiled opentitansession running as a service for each of our testbeds, started with parameters that match the particular setup. The machines to which HyperDebug is connected do not have a rich execution environment and do not support our Go-language test scripts. We run the test scripts on another pool of machines, and the Go scripts invoke a precompiled opentitantool to connect to one of the services. We do it this way to stay consistent with ChromeOS test lab practices.

  • The "host emulation" feature (running application code not on the actual OpenTitan architecture, but compiled for Linux) is intricately linked with opentitansession and cannot work without it. We would prefer not to lose this feature, which has taken considerable effort to achieve and is highly useful for catching some of our firmware bugs in pre-submit checks, without needing any hardware.
  • Some of our test scripts use multiple threads, all of which invoke opentitantool to perform interleaved operations. I have a sense that doing so would result in errors trying to claim the USB "console" interfaces of HyperDebug if each opentitantool attempted to establish its own USB connection. Having a single long-running opentitansession manage the USB connection while processing requests received via TCP/IP from multiple opentitantool invocations solves that issue.
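
The second point describes a single-owner pattern, which can be sketched roughly as follows. This is an illustration only: `Console`, `Job`, `spawn_session`, and `run_clients` are hypothetical names, in-process channels stand in for the TCP/IP connections, and a plain struct stands in for the claimed USB interface. One session thread owns the resource, and any number of clients send it requests that are handled one at a time.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical exclusive resource, standing in for a claimed USB interface.
pub struct Console {
    pub log: Vec<String>,
}

/// A request plus a channel on which the session sends its reply.
pub struct Job {
    pub line: String,
    pub reply: mpsc::Sender<usize>,
}

/// Spawns the session thread, which is the only owner of `Console`.
pub fn spawn_session() -> (mpsc::Sender<Job>, thread::JoinHandle<Console>) {
    let (tx, rx) = mpsc::channel::<Job>();
    let handle = thread::spawn(move || {
        let mut console = Console { log: Vec::new() };
        // Requests are handled one at a time, in arrival order.
        for job in rx {
            console.log.push(job.line);
            // Ignore clients that went away before reading the reply.
            let _ = job.reply.send(console.log.len());
        }
        console
    });
    (tx, handle)
}

/// Runs `clients` threads that each send one request to the session,
/// then shuts the session down and returns the resource it owned.
pub fn run_clients(clients: usize) -> Console {
    let (tx, session) = spawn_session();
    let workers: Vec<_> = (0..clients)
        .map(|id| {
            let tx = tx.clone();
            thread::spawn(move || {
                let (reply_tx, reply_rx) = mpsc::channel();
                let job = Job { line: format!("client {}", id), reply: reply_tx };
                tx.send(job).unwrap();
                reply_rx.recv().unwrap()
            })
        })
        .collect();
    for worker in workers {
        worker.join().unwrap();
    }
    // Dropping the last sender ends the session's receive loop.
    drop(tx);
    session.join().unwrap()
}
```

Because only the session thread ever touches `Console`, the clients cannot conflict over it, regardless of how their requests interleave.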

I realize that it is a burden that pretty much every io trait method must also be declared in the proxy protocol, and that boilerplate code must be added to issue and handle proxy requests. But the advantages to the Ti50 team described above are significant.

I would like to learn more about what features have been hindered by having to add support for the proxy. It is my understanding that it is not very often that new io trait methods are added.

New higher-level features added at the TransportWrapper layer or above (e.g. reading and decoding the SFDP descriptor by issuing several low-level SPI reads) do not necessarily need proxy support. They need it only if performance concerns create a desire to proxy the high-level operation as a whole to the session service, rather than each individual low-level operation.
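
As a rough illustration of that layering, the sketch below builds a high-level helper purely from calls to a low-level trait. The trait `SpiFlash`, its `read_sfdp` method, `SfdpHeader`, and `FakeFlash` are hypothetical and are not the opentitanlib API; the header layout assumed here is the "SFDP" signature followed by the minor revision, major revision, and zero-based parameter-header count. Any implementation of the trait, whether local or forwarding over a proxy, would work with the helper unchanged, at the cost of one round trip per low-level read.

```rust
/// Hypothetical low-level trait; not the opentitanlib API.
pub trait SpiFlash {
    /// Reads `buf.len()` bytes of SFDP data starting at `addr`.
    fn read_sfdp(&mut self, addr: u32, buf: &mut [u8]) -> Result<(), String>;
}

#[derive(Debug, PartialEq)]
pub struct SfdpHeader {
    pub minor: u8,
    pub major: u8,
    /// Number of parameter headers, zero-based as stored in the header.
    pub nph: u8,
}

/// High-level helper built only from trait calls, so it needs no
/// message of its own in a proxy protocol.
pub fn read_sfdp_header<S: SpiFlash + ?Sized>(spi: &mut S) -> Result<SfdpHeader, String> {
    // First low-level read: the four-byte signature.
    let mut sig = [0u8; 4];
    spi.read_sfdp(0, &mut sig)?;
    if &sig != b"SFDP" {
        return Err("missing SFDP signature".to_string());
    }
    // Second low-level read: revision and parameter-header count.
    let mut rest = [0u8; 4];
    spi.read_sfdp(4, &mut rest)?;
    Ok(SfdpHeader { minor: rest[0], major: rest[1], nph: rest[2] })
}

/// In-memory flash that counts how many low-level reads were issued.
pub struct FakeFlash {
    pub sfdp: Vec<u8>,
    pub reads: usize,
}

impl SpiFlash for FakeFlash {
    fn read_sfdp(&mut self, addr: u32, buf: &mut [u8]) -> Result<(), String> {
        let start = addr as usize;
        let end = start + buf.len();
        let src = self
            .sfdp
            .get(start..end)
            .ok_or_else(|| "read out of range".to_string())?;
        buf.copy_from_slice(src);
        self.reads += 1;
        Ok(())
    }
}
```

The `reads` counter makes the trade-off visible: the helper issues two low-level reads, so over a proxy it would cost two round trips unless the whole operation were proxied as a single request.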

@jwnrt
Contributor Author

jwnrt commented Jan 4, 2024

Thanks a lot for these details, they're very useful.

> I would like to learn more about what features have been hindered by having to add support for the proxy. It is my understanding that it is not very often that new io trait methods are added.

I'm not sure about new features, but my motivation for opening this was this issue which I tried to fix by refactoring some of our IO handling (mainly the UART).

I got the feeling that most of the complexity of the UART would go away if we only supported TTYs; however, the proxy, "ultradebug", and the Ti50 emulator all have their own IO systems. I haven't tested (or cannot test) with these platforms, so first of all I wanted to check that we definitely need to continue supporting them.

@jwnrt closed this as not planned on May 28, 2024