With io_uring you have to submit a new read request whenever the previous read request has completed. This is unnatural in a lot of cases because you usually just want to keep reading from a TCP connection. With epoll you register a file descriptor with the kernel's epoll object once and then get notified whenever new data is available to read. (What is "natural" is subjective, of course.)
There is, of course, the problem with epoll that you have to make repeated "read" syscalls to get to the actual data, and in this regard io_uring is clearly better. So my statement mainly relates to the abstract semantics of the API. However, I could also see situations in which repeated read requests could pose a performance problem in io_uring, for example, for servers with a lot of connections (say, 20k) that all do a lot of very short reads (say, 4 bytes).
Am I missing something here? Can io_uring be used in a mode where a single submission queue entry (SQE) can result in multiple completion queue entries (CQEs)?
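For reference, the epoll-style flow described above (register once, then issue a read for every readiness notification) can be sketched roughly as follows. This is only an illustration in Python: it assumes the standard selectors module, which is epoll-backed on Linux, and the function name reactor_read_all and the 4-byte read size are made up for the example.

```python
import selectors
import socket


def reactor_read_all(sock, expected_bytes):
    """Register `sock` once, then read each time it is reported readable."""
    sel = selectors.DefaultSelector()          # uses epoll where available
    sel.register(sock, selectors.EVENT_READ)   # one-time registration
    chunks = []
    received = 0
    try:
        while received < expected_bytes:
            events = sel.select(timeout=1.0)
            if not events:                     # nothing arrived in time, give up
                break
            for key, _mask in events:
                # The notification carries no data: a separate read call
                # is still needed to fetch it (4 bytes at a time here).
                data = key.fileobj.recv(4)
                if not data:                   # peer closed the connection
                    return b"".join(chunks)
                chunks.append(data)
                received += len(data)
    finally:
        sel.unregister(sock)
        sel.close()
    return b"".join(chunks)


if __name__ == "__main__":
    left, right = socket.socketpair()
    left.sendall(b"ping" * 3)
    print(reactor_read_all(right, 12))
    left.close()
    right.close()
```

Note that the socket is registered only once, but every short read still costs one readiness wake-up plus one recv call.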
Basic Explanation
In terms of the API, the proactor pattern (io_uring, IOCP, ioring) is superior to the reactor pattern (epoll, kqueue, etc.) because it mimics the natural program control flow: you "call" some asynchronous function (by scheduling it for execution) and then wait for the result by reading the completion queue or by waiting on the "completion port".
In blocking mode, typical code simply makes the call and has the result in hand as soon as the call returns.
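A minimal sketch of that blocking flow, written in Python rather than the original pseudocode (the helper name blocking_read and the temporary file are assumptions made for the example):

```python
import os
import tempfile


def blocking_read(path, size):
    """Open a file, read up to `size` bytes, and return them."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # The calling thread is parked here until the kernel has the data.
        data = os.read(fd, size)
    finally:
        os.close(fd)
    # The result is available immediately after the call returns.
    return data


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"hello world")
    print(blocking_read(tmp.name, 5))
    os.unlink(tmp.name)
```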
The non-blocking mode in the proactor pattern is similar, except that we can issue multiple calls at once and then collect their results as they complete.
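The submit-then-reap shape can be illustrated with a rough analogy in Python. This is not the io_uring API: a thread pool stands in for the kernel, pool.submit plays the role of queuing a submission, and as_completed plays the role of reading the completion queue. The names read_file and submit_and_reap are hypothetical.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor, as_completed


def read_file(path, size):
    """The operation being scheduled: an ordinary blocking read."""
    with open(path, "rb") as f:
        return f.read(size)


def submit_and_reap(paths, size):
    """Schedule all reads up front, then collect results as they finish."""
    results = {}
    with ThreadPoolExecutor(max_workers=4) as pool:
        # "Submission": queue every request without waiting for any of them.
        pending = {pool.submit(read_file, p, size): p for p in paths}
        # "Completion": results arrive in completion order, not submit order.
        for future in as_completed(pending):
            results[pending[future]] = future.result()
    return results


if __name__ == "__main__":
    paths = []
    for text in (b"first", b"second"):
        with tempfile.NamedTemporaryFile(delete=False) as tmp:
            tmp.write(text)
        paths.append(tmp.name)
    print(submit_and_reap(paths, 16))
    for p in paths:
        os.unlink(p)
```

The caller's code still reads like a sequence of calls followed by results; the difference from the blocking version is only that several requests are in flight at the same time.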
This model not only reduces the mental burden on the programmer but also makes it possible to share the workload between multiple CPU cores under the hood by using kernel threads. Such scaling is especially beneficial for file I/O, because there is no default way of making truly asynchronous read or write calls without blocking a thread.
Previous Linux attempts at async file I/O, such as POSIX AIO, were very limited and rather ugly, so io_uring is an evolutionary step in the right direction.
However, the proactor pattern obviously has some downsides, such as the need to keep a buffer in RAM for each ongoing read/recv call. This is negligible at first, but once you have to handle many connections, you'll need a lot of memory that is not actively used and is just waiting for completion. io_uring tries to partially solve this problem by offering buffer pooling facilities, but that's still nowhere close to what you can do with a single-threaded epoll event loop.
Repeated scheduling problem
As for your problem of repeated scheduling, io_uring actually offers a "multishot" mode for some of its calls, such as accept, poll, and recv, in which a single submission keeps generating completions. AFAIK, timeouts also support this mode, which in effect turns them into timers. But the main problem is that io_uring is still under development, so some of those features are available only in the newest Linux kernels (6.0+).
Summary
So the answer is: io_uring is the better API, which comes at a price but handles multi-threading, file I/O, and other things out of the box. epoll, on the other hand, provides more granular control over buffering and function calls, but once you need to deal with files (or multiple threads), you're on your own.
epoll can still be relevant for low-memory devices, but on modern systems it would be more beneficial to plan for io_uring support, because it's probably going to replace select, poll, and epoll in the future.
However, since io_uring is still under development, it has been a recurring source of dangerous vulnerabilities, so some companies like Google are putting it on hold. This fact is also worth considering when choosing between the two.