Description
In most of Flow-API, APIs are non-blocking. There are various key async APIs, but that's not blocking. There are however a handful of potentially-blocking APIs. Ideally these should behave in some reasonable way on signal, so that at least it would be possible to interrupt a program on SIGTERM while it is blocking inside such an API. In POSIX generally blocking I/O functions will emit EINTR in this situation. We should do something analogous; or failing that at least something predictable and consistent; or failing that present mitigations (but in that case not resolve this ticket).
Here are the existing blocking APIs as of this writing.
- Concept: Persistent_mq_handle::[timed_]send(), [timed_]receive(), [timed_]wait_sendable(), [timed_]wait_receivable(). Impls:
Posix_mq_handle::*()
- Internally this epoll_wait()s. Signal => EINTR; so we will also emit Error_code around EINTR.
- Assessment: Pretty good, in that it is interruptible with minimum fuss.
Bipc_mq_handle::*()
- Internally this (via some Boost code, long story)
pthread_cond_[timed]wait()
s. How this will act on signal is unclear; as of this writing I have not checked or do not remember also what the Boost code does in wrapping the pthread_ call. But the call itself is documented to yield no EINTR ever in someman
pages, and EINTR only from pthread_cond_timedwait() (not the one without timed_ though) in some otherman
pages. - Assessment: Unclear, but not good. It will almost certainly not work in some cases and will just keep waiting.
- Internally this (via some Boost code, long story)
- struc::Channel::sync_request() (+ future likely similar API(s) including sync_expect_msg())
- Internally this uses
promise::get_future().wait[_for]()
which almost certainly internally in turn usespthread_cond_[timed]wait()
. - Assessment: Probably similar to Bipc_mq_handle situation but would need to be verified.
- Internally this uses
- Nota bene: sync_connect() (in socket-stream and client-session) are today formally non-blocking, because it's non-networked IPC; but there are plans to make it networked-IPC and those guys in that case will become potentially-blocking. At that point we could leverage built-in EINTR probably; details TBD. Just keep it in mind.
Ideally all three -- and any future blocking impls -- should get interrupted if-and-only-if a built-in blocking call such as ::read() or ::poll() would be interrupted, on signal. How they're interrupted is less important: whether it's EINTR or another Error_code, as long as it's documented (that said perhaps it's best to standardize around an error::Code of ours we'd add for this purpose).
I'll forego explaining how to impl this. The Posix_mq_handle guy is already there; but the other two are a different story. (For Bipc_mq_handle one could somehow leverage the already-existing Persistent_mq_handle::interrupt_*()
APIs. But, then another thread would be necessary. Food for thought at any rate.) (For Channel, not sure, but we should be able to swing something; maybe an intermediary thread; signal => satisfy the promise.)
Lastly, but very much not leastly (sorry), users should be aware of the following mitigating facts, until this ticket is resolved:
- Persistent_mq_handle:
- This concept is not really primarily meant for direct use. It can be -- it is nicer than using the POSIX MQ or bipc::message_queue APIs directly, and it adds features including interruptibility to the latter -- but that's a bonus really; these exist to implement Blob_stream_mq_sender/receiver higher-level APIs, and those work just fine (there is no signal-interrupt issue): they (internally) avoid blocking calls except .wait_able(), and the latter is .interrupt_()ed if needed (namely from dtor).
- If you do use it:
interrupt_*()
is available to you. So if it is necessary to interrupt a wait or blocking-send/receive, then you can use a separate thread to run it in and invokeinterrupt_*()
from a signal handler. This isn't no-fuss, but it does work if needed at least.- With Posix_mq_handle, you do get the nice normal EINTR behavior already.
timed_*()
are also available: You could use those with a short, but not-too-short, timeout -- in which case a signal can affect them but only after a bit of lag; could be fine. This is the usual approach... not perfect but doable and pretty simple.
- struc::Channel::sync_request():
- This guy is only blocking if locally there is no
expect_msg[s]()
pending on the opposing side (for future speculated sync_expect_msg() API: if there is no already-received-and-queued-inside in-message of the proper type, or one isn't forthcoming immediately). - But in many -- most -- cases protocols are set up in such a way as to in fact make sync_request() non-blocking: It is expected the opposing side has set-up their
expect_msg[s]()
by the time we issue the sync_request(). That's just good for responsive IPC; and it's nice not to need an async call like async_request(), where a non-blocking one would do much more simply.- For example we use sync_request() internally in other parts of Flow-IPC this way routinely.
- This guy is only blocking if locally there is no