Description
While working on the io_uring event loop (see #15634), and following the comments by @yxhuvud regarding O_NONBLOCK, I investigated the non-blocking behavior of the OS and the event loop implementations, especially the blocking argument to File and Socket and its relation to the polling (epoll, kqueue, libevent) and async (IOCP, io_uring) event loops.
UNIX (poll, epoll, kqueue)
The problem on UNIX is that some file descriptors support O_NONBLOCK while others don't, and it all depends on the file type. We even stat the file descriptor to determine whether we should set O_NONBLOCK or not.
Regular disk files
By default we don't set O_NONBLOCK on the underlying fd when creating File objects because disk files are always blocking (by POSIX definition). Since a disk file fd is always ready, trying to read or write, even with O_NONBLOCK, will never fail with EAGAIN.
On Linux, trying to register a disk file fd with epoll fails with EPERM (this is an error) while poll simply reports it as always ready; on FreeBSD trying to poll or kevent a disk file fd reports it as ready (aka it's pointless to ask).
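The regular-file behavior described above can be checked with a small probe. This is a sketch in Python (not Crystal) only because its os and select modules map directly onto the syscalls; probe_regular_file is a made-up helper name, and the EPERM result is specific to Linux epoll.

```python
import errno
import os
import select
import tempfile


def probe_regular_file():
    """Return (data, epoll_errno) for a regular file opened with O_NONBLOCK.

    epoll_errno stays None when the fd could be registered or when epoll is
    not available on this platform.
    """
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"hello")
        tmp.flush()
        fd = os.open(tmp.name, os.O_RDONLY | os.O_NONBLOCK)
        try:
            # O_NONBLOCK changes nothing here: the read completes right away.
            data = os.read(fd, 16)
            epoll_errno = None
            if hasattr(select, "epoll"):  # Linux only
                ep = select.epoll()
                try:
                    ep.register(fd, select.EPOLLIN)
                except OSError as exc:
                    epoll_errno = exc.errno
                finally:
                    ep.close()
            return data, epoll_errno
        finally:
            os.close(fd)


if __name__ == "__main__":
    data, epoll_errno = probe_regular_file()
    print(data, errno.errorcode.get(epoll_errno))
```

The read never raises EAGAIN, so the fd would never be handed to the event loop in the first place.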
NOTE: there seems to be nothing wrong with always setting O_NONBLOCK on a regular file fd. Maybe there used to be an issue with libevent? Nope, the fd would never enter libevent (read and write would have to fail with EAGAIN).
FIFO and character devices
Trying to open a fifo can block the current thread unless the O_NONBLOCK flag has been passed, in which case opening with O_RDONLY will "succeed" (the fd is created) even without a writer, while trying to open for write will fail with ENXIO until a reader has opened the fifo in read mode.
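We can open("fifo", O_RDONLY|O_NONBLOCK); read then fails with EAGAIN once a writer is connected but nothing is buffered (with no writer at all it returns 0, i.e. EOF), and poll, epoll or kevent eventually report when a writer has also opened the fifo and there is something to read.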
We can open("fifo", O_WRONLY|O_NONBLOCK) but it fails with ENXIO, so we don't have a fd that we could poll, epoll or kevent on 😞
We can open("fifo", O_RDWR|O_NONBLOCK) and it always succeeds because there is both a reader and a writer: a fd is created, a read or write that would block fails with EAGAIN, and we can poll for both read and write.
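The three open modes can be compared side by side. Again a Python sketch of the raw syscalls rather than Crystal code; probe_fifo is a hypothetical helper, it needs a UNIX system with mkfifo, and opening a fifo with O_RDWR is left undefined by POSIX even though Linux accepts it.

```python
import errno
import os
import tempfile


def probe_fifo():
    """Open a fresh fifo in each mode with O_NONBLOCK and record what happens."""
    results = {}
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "fifo")
        os.mkfifo(path)

        # Write-only without any reader: open itself fails, so there is no fd.
        try:
            fd = os.open(path, os.O_WRONLY | os.O_NONBLOCK)
            os.close(fd)
            results["wronly"] = None
        except OSError as exc:
            results["wronly"] = exc.errno

        # Read-only without any writer: open succeeds and returns a fd.
        rfd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
        results["rdonly_opened"] = True
        os.close(rfd)

        # Read-write: we are our own reader and writer, so open succeeds and
        # reading the empty fifo reports "would block".
        rwfd = os.open(path, os.O_RDWR | os.O_NONBLOCK)
        try:
            os.read(rwfd, 16)
            results["rdwr_read"] = None
        except OSError as exc:
            results["rdwr_read"] = exc.errno
        finally:
            os.close(rwfd)
    return results


if __name__ == "__main__":
    for mode, outcome in probe_fifo().items():
        print(mode, errno.errorcode.get(outcome, outcome))
```

Only the write-only case leaves us without a fd to hand to the event loop, which is the case the rest of this section is about.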
NOTE: shall we always open as read under-the-hood? 🙈
An issue is that we'd write to the fd buffer until it fills up, at which point write fails with EAGAIN (and we can poll); or the data fits into the buffer and we move on to something else without knowing whether an actual reader will ever show up, or we close the fd and lose the message forever.
Alternatively, we can try to open with O_NONBLOCK, then on ENXIO start a thread that will open without O_NONBLOCK; we'd merely suspend the current fiber until the thread terminates (or call pthread_cancel on timeout). We'd only block a fiber, at the expense of a thread. The other causes of ENXIO are invalid cases (the file is a UNIX socket, or no such device), so we'd start a thread just to get the same error, but those sound exceptional.
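A rough sketch of that fallback, in Python for illustration only: open_fifo_for_write is a made-up name, a queue stands in for suspending and resuming the fiber, and since Python has no pthread_cancel the helper thread is simply left behind on timeout.

```python
import errno
import os
import queue
import threading


def open_fifo_for_write(path, timeout=None):
    """Open `path` for writing without blocking the caller indefinitely."""
    try:
        # Fast path: a reader is already there (or this isn't a fifo at all).
        return os.open(path, os.O_WRONLY | os.O_NONBLOCK)
    except OSError as exc:
        if exc.errno != errno.ENXIO:
            raise

    result = queue.Queue(maxsize=1)

    def blocking_open():
        # Runs in the helper thread and blocks until a reader opens the fifo.
        try:
            result.put(os.open(path, os.O_WRONLY))
        except OSError as exc:
            result.put(exc)

    threading.Thread(target=blocking_open, daemon=True).start()

    try:
        # Stand-in for "suspend the fiber until the thread terminates".
        outcome = result.get(timeout=timeout)
    except queue.Empty:
        # The helper stays blocked in open(); a real implementation would
        # cancel it (and close the fd if the open succeeds later on).
        raise TimeoutError("no reader opened " + path)

    if isinstance(outcome, OSError):
        raise outcome  # e.g. ENXIO again for a UNIX socket or missing device
    os.set_blocking(outcome, False)  # restore the non-blocking mode we wanted
    return outcome
```

The returned fd is switched back to non-blocking so later writes can still go through a polling event loop.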
Sockets
By default we always set O_NONBLOCK on sockets.
I don't think the blocking args make much sense. I'm even wondering whether the Socket#blocking[=] methods are very useful. Why would you change a socket to blocking? You can't use it anymore with the polling event loops.
If you have real use-cases (not hypothetical) please report 'em 🙇
LINUX (io_uring)
The rings don't distinguish between regular disk files, fifos, character devices or sockets. Operations are always run async. Newer kernels behave the same whether O_NONBLOCK is set or not, while older kernels may fail with EAGAIN when O_NONBLOCK has been set on the fd, so we should avoid setting O_NONBLOCK on file descriptors and sockets to avoid issues with older kernels.
We can use IORING_OP_OPENAT to always open the file async and never block the current thread, be they regular disk files, fifos or character devices.
The blocking args are irrelevant. Even if we need to poll, we don't need O_NONBLOCK to poll or epoll the fd (or IORING_OP_POLL), we only need it for read and write to not block.
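That readiness polling does not depend on O_NONBLOCK is easy to check with a plain pipe. A Python sketch using poll(2) through select.poll (the io_uring opcodes have no counterpart in Python's standard library, so they are not shown); poll_blocking_pipe is a hypothetical helper.

```python
import os
import select


def poll_blocking_pipe():
    """Poll a pipe whose read end is still in blocking mode."""
    r, w = os.pipe()  # both ends are created without O_NONBLOCK
    try:
        poller = select.poll()
        poller.register(r, select.POLLIN)
        before = poller.poll(0)  # nothing written yet
        os.write(w, b"x")
        after = poller.poll(0)   # one byte is waiting
        return os.get_blocking(r), before, after
    finally:
        os.close(r)
        os.close(w)


if __name__ == "__main__":
    print(poll_blocking_pipe())
```

The fd stays in blocking mode throughout; the flag only matters for what read and write do when the fd is not ready.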
The #blocking[=] methods would only bring some potential bugs with older kernels by setting O_NONBLOCK.
Windows (IOCP)
We can only set the OVERLAPPED flag when creating the file handle. We can't change to blocking or non-blocking afterwards because we can't enable or disable OVERLAPPED.
But File always sets blocking: true, in which case we don't set OVERLAPPED for files, and there are special checks to call ReadFile and WriteFile directly (no calls to the event loop). We thus never use the event loop on Windows to read/write files by default, only when we specify blocking: false. That's unexpected.
@HertzDevil any technical reason for blocking read/write files on Windows? fewer context switches?
PROPOSAL(s)
- Move open to the event loop implementations, and have the event loop set the non-blocking flag (or not, for example io_uring).
- Change the blocking arg of File to false by default, or maybe nil to mean "meh, we handle it".
- Consider deprecating the blocking args?
Nothing should go wrong when always opening with O_NONBLOCK for the polling event loops, whatever the actual file type (regular disk file, fifo, socket, ...). io_uring shouldn't set O_NONBLOCK, and IOCP would like File to be async by default, too.
Pull Requests
- Add io_uring event loop #15634
- Fix async append to file in IOCP #15681
- Fix: reopen async File passed to Process.exec and .run (win32) #15703
- Add Crystal::EventLoop::FileDescriptor#open #15750
- Ask system to decide non-blocking IO::FileDescriptor (win32) #15753
- Crystal::EventLoop::FileDescriptor#open now sets the non/blocking flag #15754
- Open special files in another thread #15768
- Refactor Socket blocking mode #15804
- Refactor IO.pipe blocking mode #15823