Descriptorless files for io_uring


The lowly file descriptor is one of the fundamental objects in Linux systems.

Interestingly, though, the io_uring subsystem looks as if it is moving toward its own number space separate from file descriptors.

  • The kernel needs to perform some setup for every io_uring operation: it takes a reference to the file behind the file descriptor and “locks down” the memory for the buffer, meaning it pins the buffer’s pages so they cannot be swapped out or moved while the I/O is in flight.
  • The io_uring_register syscall allows userspace to explicitly request that this setup be performed once, up front, for a list of buffers (opcode: IORING_REGISTER_BUFFERS) and/or files (opcode: IORING_REGISTER_FILES).
  • Once files have been registered with IORING_REGISTER_FILES, operations refer to these “registered” (or fixed) files by their index in the list of files passed to io_uring_register (with the IOSQE_FIXED_FILE flag set on the request), not by their fd.
  • If you’re using io_uring to create an fd in the first place, there’s an unnecessary user-space conversion step:
    • io_uring opens the file (or accepts the connection) and returns the new fd in the completion queue
    • userspace calls io_uring_register to register that fd, which gives it a fixed-file index
    • userspace then enqueues operations on this file using the fixed-file index
  • There’s a new patch series out that allows io_uring to create and register a file in one step, so that the new file is identified by its fixed-file index rather than by an fd number.
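
The difference between naming a file by fd and by fixed-file index shows up directly in the submission queue entry. The sketch below assumes the structure and flag definitions from <linux/io_uring.h> (kernel headers new enough to define IORING_OP_READ). It only fills in entries; it does not set up a ring or submit anything. The helper names prep_read_fd and prep_read_fixed are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <linux/io_uring.h>

/* Hypothetical helper: describe a read from an ordinary file descriptor. */
static void prep_read_fd(struct io_uring_sqe *sqe, int fd,
                         void *buf, unsigned len, uint64_t offset)
{
    memset(sqe, 0, sizeof(*sqe));
    sqe->opcode = IORING_OP_READ;
    sqe->fd = fd;                          /* a normal fd number */
    sqe->addr = (uint64_t)(uintptr_t)buf;  /* destination buffer */
    sqe->len = len;
    sqe->off = offset;                     /* file offset to read from */
}

/* Hypothetical helper: the same read, but the target is named by its
 * index in the table set up with IORING_REGISTER_FILES. */
static void prep_read_fixed(struct io_uring_sqe *sqe, unsigned index,
                            void *buf, unsigned len, uint64_t offset)
{
    assert(index <= (unsigned)INT32_MAX);  /* index is stored in the signed fd field */
    prep_read_fd(sqe, (int)index, buf, len, offset);
    sqe->flags |= IOSQE_FIXED_FILE;        /* "fd" is a table index, not an fd */
}
```

The only differences between the two entries are the meaning of the fd field and the IOSQE_FIXED_FILE flag; the kernel can skip the per-operation file lookup and reference for the second form.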

The most likely use case for this feature is network servers; a busy server can create (with accept()) and use huge numbers of file descriptors in a short period of time. While io_uring operations, being asynchronous, can generally be executed in any order, it is possible to chain operations so that one does not begin before the previous one has successfully completed. Using this capability, a network server could queue a series of operations to accept the next incoming connection (storing it in the fixed-file table), write out the standard greeting, and initiate a read for the first data from the remote peer. User space would only need to become involved once that data has arrived and is ready to be processed.
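
A rough sketch of what such a chain could look like at the submission-entry level follows. It makes several assumptions that are not stated above: that the application picks the fixed-file slot itself, that the kernel headers are recent enough for struct io_uring_sqe to have a file_index field, and that the field holds the slot number plus one (zero meaning "return an ordinary fd"), which is how I understand the interface that was eventually merged. The helper build_accept_chain is hypothetical, and the code only fills in the three entries; it does not create a ring or submit them.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <linux/io_uring.h>

/* Hypothetical helper: fill three linked SQEs that accept a connection
 * into fixed-file slot `slot`, send a greeting, then start a receive. */
static void build_accept_chain(struct io_uring_sqe sqe[3], int listen_fd,
                               uint32_t slot, const char *greeting,
                               void *rxbuf, uint32_t rxlen)
{
    assert(slot <= (uint32_t)INT32_MAX);   /* slot is later stored in the signed fd field */
    memset(sqe, 0, 3 * sizeof(sqe[0]));

    /* 1. Accept: the new socket goes into the fixed-file table, not
     *    into the process's fd table. Assumed encoding: slot + 1. */
    sqe[0].opcode = IORING_OP_ACCEPT;
    sqe[0].fd = listen_fd;                 /* listening socket, an ordinary fd here */
    sqe[0].file_index = slot + 1;
    sqe[0].flags = IOSQE_IO_LINK;          /* next entry waits for this one */
    sqe[0].user_data = 1;

    /* 2. Send the greeting on the socket now sitting in `slot`. */
    sqe[1].opcode = IORING_OP_SEND;
    sqe[1].fd = (int32_t)slot;
    sqe[1].addr = (uint64_t)(uintptr_t)greeting;
    sqe[1].len = (uint32_t)strlen(greeting);
    sqe[1].flags = IOSQE_FIXED_FILE | IOSQE_IO_LINK;
    sqe[1].user_data = 2;

    /* 3. Start reading the peer's first data; last link, so no IO_LINK. */
    sqe[2].opcode = IORING_OP_RECV;
    sqe[2].fd = (int32_t)slot;
    sqe[2].addr = (uint64_t)(uintptr_t)rxbuf;
    sqe[2].len = rxlen;
    sqe[2].flags = IOSQE_FIXED_FILE;
    sqe[2].user_data = 3;
}
```

The chain only works because the send and receive entries can name the socket before it exists: the slot number is known at submission time, whereas an fd number returned by accept() would not be. In a real ring the three entries would need to be placed consecutively in the submission queue, and a failure in one link cancels the ones after it.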
