You set up two ring buffers, a submission queue and a completion queue, and place entries on the former, each carrying an opcode that represents a system call. You then wait for entries to arrive on the latter; these are structs holding the results of those calls. This is effectively a way to run arbitrary system calls (at least the ones that have opcodes implemented) asynchronously, without a separate switch to kernel mode for each one!
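To make the two-queue flow concrete, here is a minimal in-process sketch in Python. It is a toy model, not the real kernel interface: the names `ToyRing`, `Sqe`, `Cqe`, `kernel_side`, and the string opcodes are all hypothetical, and the "kernel" is just a method that drains the submission queue and posts one completion per request.

```python
import errno
import os
from collections import deque
from dataclasses import dataclass


@dataclass
class Sqe:
    """Submission entry: an opcode, its arguments, and a caller-chosen tag."""
    opcode: str
    args: tuple
    user_data: int


@dataclass
class Cqe:
    """Completion entry: the tag from the request and the call's result."""
    user_data: int
    res: int  # return value, or a negative errno on failure


class ToyRing:
    """Hypothetical stand-in for a submission/completion queue pair."""

    def __init__(self, entries: int):
        self.entries = entries
        self.sq = deque()  # submission queue
        self.cq = deque()  # completion queue

    def submit(self, opcode: str, args: tuple, user_data: int) -> None:
        if len(self.sq) >= self.entries:
            raise BufferError("submission queue is full")
        self.sq.append(Sqe(opcode, args, user_data))

    def kernel_side(self) -> None:
        """Pretend to be the kernel: run each request, post its result."""
        handlers = {"NOP": lambda: 0, "WRITE": os.write}
        while self.sq:
            sqe = self.sq.popleft()
            handler = handlers.get(sqe.opcode)
            if handler is None:
                res = -errno.EINVAL  # no such opcode implemented
            else:
                try:
                    res = handler(*sqe.args)
                except OSError as exc:
                    res = -exc.errno
            self.cq.append(Cqe(sqe.user_data, res))

    def reap(self) -> list:
        """Collect every completion currently on the completion queue."""
        done = list(self.cq)
        self.cq.clear()
        return done


def demo():
    read_fd, write_fd = os.pipe()
    ring = ToyRing(entries=8)
    ring.submit("WRITE", (write_fd, b"hello"), user_data=1)
    ring.submit("NOP", (), user_data=2)
    ring.submit("BOGUS", (), user_data=3)  # opcode with no handler
    ring.kernel_side()
    results = {cqe.user_data: cqe.res for cqe in ring.reap()}
    data = os.read(read_fd, 16)
    os.close(read_fd)
    os.close(write_fd)
    return results, data


if __name__ == "__main__":
    print(demo())
```

The `user_data` tag is what lets the caller match a completion to the request that produced it, since results need not come back in any particular order in a real asynchronous system.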
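Completions can either be delivered to you (you block until some arrive) or polled for. There is also a fully polled mode, in which the kernel maintains a polling thread that picks up your submissions and posts the completions for you. That thread lasts “forever”, or rather until it has sat idle for one full second, after which it goes to sleep until woken again.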
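The idle timeout of the polled mode can be sketched with a simulated clock. This is again a toy model under stated assumptions: `ToyPoller`, `tick`, and `wake` are hypothetical names, one tick stands in for one polling iteration rather than real time, and the explicit wake-up step is an assumption of the model, not a description of the actual kernel mechanism.

```python
from collections import deque


class ToyPoller:
    """Models a polling thread that goes to sleep after too many idle polls."""

    def __init__(self, idle_ticks: int):
        self.idle_ticks = idle_ticks  # idle polls allowed before sleeping
        self.sq = deque()             # submission queue
        self.cq = deque()             # completion queue
        self.idle = 0
        self.asleep = False

    def submit(self, user_data: int) -> bool:
        """Queue a request; return True if the poller must be woken first."""
        self.sq.append(user_data)
        return self.asleep

    def wake(self) -> None:
        self.asleep = False
        self.idle = 0

    def tick(self) -> None:
        """One polling iteration: drain submissions or count an idle poll."""
        if self.asleep:
            return
        if self.sq:
            while self.sq:
                self.cq.append(self.sq.popleft())
            self.idle = 0
        else:
            self.idle += 1
            if self.idle >= self.idle_ticks:
                self.asleep = True


def demo():
    poller = ToyPoller(idle_ticks=3)
    log = []
    log.append(poller.submit(1))  # poller is awake, no wake-up needed
    poller.tick()                 # request 1 completes
    for _ in range(3):
        poller.tick()             # three idle polls put the poller to sleep
    log.append(poller.submit(2))  # poller is asleep, wake-up needed
    poller.tick()                 # no effect while asleep
    log.append(list(poller.cq))
    poller.wake()
    poller.tick()                 # request 2 completes
    log.append(list(poller.cq))
    return log


if __name__ == "__main__":
    print(demo())
```

The point of the model is the trade-off: while the poller is awake, submitting costs nothing beyond a queue write, but a sleeping poller ignores new work until something wakes it.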