Coprocess Protocol Proposal
This document sketches a protocol to allow coprocesses to substitute for normal "batch" processes in shell scripts. A coprocess can be thought of as a single-threaded server that reads and writes from pipes.
The goal is to make shell scripts faster. It can also make interactive completion faster, since completion scripts often invoke (multiple) external tools.
Many language runtimes start up slowly, especially when there are many libraries or a JIT involved: Python, Ruby, R, Julia, the JVM (including Clojure), etc.
Startup times seem to be getting worse in general. Python 3 is faster than Python 2 in nearly all dimensions except startup time.
Let's call the protocol FCLI for now. There's a rough analogy to FastCGI and CGI. CGI starts one process per request, while FastCGI handles multiple requests in a process. (I think FastCGI is threaded unlike FCLI, but let's ignore that for now.)
Suppose we have a Python command line tool that copies files to a cloud file system. It works like this:
cloudcopy foo.jpg //remote/myhome/mydir/
(This could also be an R tool that does a linear regression, but let's use the above example to be concrete. The idea is that a lot of the work is "startup time", not actually doing work.)
It could be converted to an FCLI coprocess by wrapping `main()` in a `while True` loop.
A shell would invoke such a process with these environment variables:
- `FCLI_VERSION` -- the process should try to become a coprocess. Some scripts may ignore this! That is OK; the shell/client should handle it.
- `FCLI_REQUEST_FIFO` -- read requests from this file system path (a named pipe)
- `FCLI_RESPONSE_FIFO` -- write responses to this file system path (a named pipe)
The requests and responses will look like this. Note the actual encoding will likely not be JSON, but I'm writing in JSON syntax for convenience.
{ "argv": ["cloudcopy", "bar.jpg", "//remote/myhome/mydir"],
  "env": {"PYTHONPATH": "."}   # optional env to override the actual env. May be ignored by some processes.
}
->
{ "status": 0 }   # 0 on success, 1 on failure
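In code, framing a request and parsing a response (again using JSON purely for illustration; the helper names are mine) could be as simple as:

```python
import json

def encode_request(argv, env=None):
    # One request per line.  The env field is optional and may be
    # ignored by some processes.
    msg = {'argv': argv}
    if env is not None:
        msg['env'] = env
    return json.dumps(msg) + '\n'

def decode_response(line):
    # 0 on success, 1 on failure.
    return json.loads(line)['status']
```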
If you wanted to copy 1,000 files, you could start a pool of 20 or so coprocesses and drive them from an event loop.
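A sketch of that pool, abstracted over a hypothetical worker object with blocking `send()`/`recv()` methods (a real event loop would multiplex the response FIFOs with `select()` rather than blocking on one worker at a time):

```python
import collections

def drive_pool(workers, tasks):
    # workers: objects with send(argv) and a blocking recv() -> status.
    # tasks: a list of argv lists, e.g. 1,000 cloudcopy invocations.
    tasks = collections.deque(tasks)
    statuses = []
    idle = list(workers)
    busy = collections.deque()
    while tasks or busy:
        # Hand out work to every idle coprocess.
        while tasks and idle:
            w = idle.pop()
            w.send(tasks.popleft())
            busy.append(w)
        # Collect the oldest in-flight response, freeing that worker.
        w = busy.popleft()
        statuses.append(w.recv())
        idle.append(w)
    return statuses
```

With 20 workers, the startup cost is paid 20 times instead of 1,000 times.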
The request types might be:

- `argv` -- run a new command and print a response to the fifo. Use stdin/stdout/stderr as normal.
- `flush` -- flush stdout and stderr. I think this will make it easier to delimit responses from adjacent commands.
- `echo` -- for testing protocol conformance?
- `version` -- maybe?
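A dispatch loop for these request types might look like the sketch below. The proposal only names the types; the `command` field, the message shapes, and the version string are assumptions of mine:

```python
import sys

def main(argv):
    # Hypothetical stand-in for the tool's real entry point.
    return 0

def handle(msg):
    # Dispatch on a hypothetical 'command' field; 'argv' is the
    # default so plain requests keep working.
    cmd = msg.get('command', 'argv')
    if cmd == 'argv':
        # Run like a normal invocation; report the exit status.
        return {'status': main(msg['argv'])}
    elif cmd == 'flush':
        # Delimit output from adjacent commands.
        sys.stdout.flush()
        sys.stderr.flush()
        return {'status': 0}
    elif cmd == 'echo':
        # Conformance check: reflect the payload back.
        return {'echo': msg.get('payload')}
    elif cmd == 'version':
        return {'version': '0.1'}  # placeholder version string
    else:
        return {'status': 1}  # unknown command
```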
Why coprocesses rather than a threaded server? Because it will be easier for existing command line tools to implement this protocol. Many tools are written with global variables, or they are written in languages that don't thread freely anyway (Python, R, etc.).
Shellac Protocol Proposal -- this protocol for shell-independent command completion can build on top of the coprocess protocol.