Linux DMA-BUF Subsystem Set for Major Efficiency Boost: User-Space Read/Write Operations on the Horizon
Breaking News – A groundbreaking initiative to extend the Linux kernel's dma-buf subsystem for direct user-space read and write operations was unveiled today at the 2026 Linux Storage, Filesystem, Memory Management, and BPF Summit. The proposal, led by Pavel Begunkov with assistance from Kanchan Joshi, could dramatically reduce overhead for device-to-device I/O and unlock new performance levels for storage and GPU workloads.
“This would be a game-changer for high-performance I/O, eliminating the need for kernel-mediated copies in many scenarios,” said Begunkov during the joint session. “We are looking at adding direct I/O support that bypasses traditional system call bottlenecks.”
The dma-buf subsystem currently enables drivers to share memory buffers efficiently for device-to-device transfers, but user-space access remains limited to mmap-like interfaces. The new effort aims to expose full read and write operations directly from user space, leveraging the buffer-sharing framework.
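To make the current state concrete, here is a minimal sketch of the mmap-style access pattern the article describes. A real dma-buf fd comes from a driver (for example a DRM or dma-heap allocation), which is not available in a generic environment, so a `memfd` stands in for the buffer fd here; the `memfd_create` name and the sync comments are the only assumptions beyond standard Linux APIs.

```c
/* Sketch of today's mmap-style access to a shared buffer fd.
 * A memfd stands in for a real dma-buf fd so the sketch is runnable. */
#define _GNU_SOURCE
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int mmap_access_demo(void)
{
    size_t len = 4096;
    int fd = memfd_create("dmabuf-standin", 0); /* stand-in for a dma-buf fd */
    if (fd < 0 || ftruncate(fd, (off_t)len) < 0)
        return -1;

    /* With a real dma-buf fd, CPU access would be bracketed by
     * DMA_BUF_IOCTL_SYNC (DMA_BUF_SYNC_START / DMA_BUF_SYNC_END)
     * to keep caches coherent with device DMA. */
    char *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    memcpy(map, "hello", 6);                  /* CPU writes into the buffer */
    int ok = (memcmp(map, "hello", 6) == 0);  /* CPU reads it back */

    munmap(map, len);
    close(fd);
    return ok ? 0 : -1;
}
```

Note how every byte the application touches goes through the mapping; there is no way today to hand the kernel an offset/length pair and have it move data for you, which is the gap the proposal targets.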
Background
Dma-bufs have been a core part of the Linux kernel for years, primarily used by graphics, video, and networking drivers to move data between devices without routing it through CPU copies. However, the subsystem has never natively supported file-like read/write semantics from user space, forcing applications to rely on complex ioctl-based workflows or extra copies.
At the summit, Begunkov and Joshi outlined a design that extends the dma-buf file descriptor to accept standard POSIX read/write syscalls. This would allow user-space applications to treat shared memory buffers as regular files, simplifying programming models for storage stacks and co-processors. “Think of it as unifying the buffer management path,” Joshi explained during the Q&A. “Drivers can now expose regions for both device DMA and user-space I/O without rewriting their core logic.”
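A hedged sketch of what consuming that design might look like from an application: the buffer fd is driven with plain `pread()`/`pwrite()`, exactly as if it were a regular file. Since the proposed kernel support is not merged, a `memfd` again stands in for the dma-buf fd; the semantics shown (direct reads and writes at an offset) are the proposal's, not an existing dma-buf capability.

```c
/* Sketch of the proposed file-like semantics: pread()/pwrite() directly
 * against a shared buffer fd.  A memfd stands in for the dma-buf fd,
 * since the proposed kernel support is not yet merged. */
#define _GNU_SOURCE
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int rw_semantics_demo(void)
{
    int buf_fd = memfd_create("dmabuf-standin", 0);
    if (buf_fd < 0 || ftruncate(buf_fd, 4096) < 0)
        return -1;

    /* Under the proposal, this write would land in the shared buffer
     * without an intermediate kernel-mediated copy. */
    char out[] = "payload";
    if (pwrite(buf_fd, out, sizeof(out), 0) != (ssize_t)sizeof(out)) {
        close(buf_fd);
        return -1;
    }

    char in[sizeof(out)] = {0};
    if (pread(buf_fd, in, sizeof(in), 0) != (ssize_t)sizeof(in)) {
        close(buf_fd);
        return -1;
    }

    int ok = (memcmp(in, out, sizeof(out)) == 0);
    close(buf_fd);
    return ok ? 0 : -1;
}
```

The appeal for driver authors, per the session, is that the same buffer region serves both device DMA and these user-space I/O paths without rewriting core driver logic.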
The proposal builds on recent work in the io_uring subsystem, which already provides asynchronous I/O capabilities. By integrating with io_uring, dma-buf read/write operations could achieve zero-copy transfers in many cases, reducing latency for high-throughput workloads.
What This Means
If merged, the changes would directly impact storage drivers (NVMe, CXL-attached memory) and GPU and media frameworks such as Vulkan and VA-API. Developers could bypass VFS layers for inter-device data movement, saving CPU cycles and memory bandwidth. “This is especially relevant for disaggregated memory and smart NIC scenarios,” said a kernel developer familiar with the discussion, speaking on condition of anonymity.
Early prototypes demonstrate up to 40% reduction in I/O latency for datacenter workloads that stream data between storage and accelerators. However, challenges remain: memory integrity, cache coherency, and security boundaries must be carefully managed. The session concluded with a call for more community testing on ARM and x86 platforms.
Review the full session notes for technical details. The upstreaming process is expected to begin during the 6.13 kernel cycle, with an estimated target of Linux 6.16 for stabilization. “We welcome patches and review now,” Begunkov urged. “The sooner we land this, the sooner the ecosystem can experiment.”