QXL (paravirtualized graphics device for QEMU)

QXL is a paravirtualized virtual graphics device emulated by QEMU that provides basic 2D graphics for guest operating systems. It is not a hardware component or a video codec. It offers no 3D acceleration and acts as a bridge between guest drawing commands and the host display, most commonly through the SPICE remote display protocol.

QXL is used in virtualization environments such as KVM and Xen for undemanding server or desktop guests. It is commonly chosen for remote access via SPICE or VNC where complex 3D rendering is not required but low output latency for simple interfaces matters.

Typical QXL issues

The main drawbacks are high host resource consumption when changing guest resolution and the lack of hardware video acceleration. Frame tearing artifacts are often observed during intensive window scrolling. Drivers for older operating systems may be unstable, and performance drops at resolutions above Full HD.

How QXL works

QXL functions as a paravirtualized device: instead of only writing pixels to a framebuffer, the guest driver submits lists of drawing commands (rectangle fills, block copies, image draws). QEMU passes these commands to the host-side display server, which processes them on the CPU rather than the GPU. Unlike purely emulated cards (such as Cirrus Logic), QXL allows dynamic resolution changes without rebooting and lets the amount of video memory be configured per device. Compared to VirtIO-GPU, QXL is generally slower at frame processing because it has no 3D acceleration path, and it cannot match VFIO passthrough of a real GPU. Its main difference from basic VGA is the command-based interface with two-way synchronization: the guest submits commands and the host hands completed resources back for release, which reduces wasted work on the host CPU. Ultimately, QXL is a compromise between compatibility and speed for 2D workloads, and it requires no IOMMU configuration.
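
The difference between sending drawing commands and sending a pixel buffer can be illustrated with a small sketch. The `Fill` and `Copy` classes and the `render` function below are hypothetical names chosen for this example; they do not reproduce the actual QXL command structures, and the sketch assumes every command stays inside the framebuffer bounds.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Fill:
    """Fill a w-by-h rectangle at (x, y) with one color (hypothetical shape)."""
    x: int
    y: int
    w: int
    h: int
    color: int


@dataclass
class Copy:
    """Copy a w-by-h block from (src_x, src_y) to (dst_x, dst_y)."""
    src_x: int
    src_y: int
    dst_x: int
    dst_y: int
    w: int
    h: int


Command = Union[Fill, Copy]


def render(commands: List[Command], width: int, height: int) -> List[List[int]]:
    """Replay drawing commands on the CPU into a fresh framebuffer."""
    fb = [[0] * width for _ in range(height)]
    for cmd in commands:
        if isinstance(cmd, Fill):
            for row in range(cmd.y, cmd.y + cmd.h):
                for col in range(cmd.x, cmd.x + cmd.w):
                    fb[row][col] = cmd.color
        elif isinstance(cmd, Copy):
            # Snapshot the source rows first so overlapping copies stay correct.
            block = [fb[cmd.src_y + r][cmd.src_x:cmd.src_x + cmd.w]
                     for r in range(cmd.h)]
            for r, line in enumerate(block):
                fb[cmd.dst_y + r][cmd.dst_x:cmd.dst_x + cmd.w] = line
    return fb


# Two short commands describe what would otherwise be 16 transmitted pixels.
frame = render([Fill(0, 0, 2, 2, 7), Copy(0, 0, 2, 2, 2, 2)], width=4, height=4)
```

Here the guest side only has to describe one fill and one copy; the host side reconstructs the full frame itself, which is the reason a command stream can be much smaller than the pixels it produces.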

Functionality

  1. Definition of QXL. In the Xen deployment described in this list, QXL is a mechanism for synchronizing video memory buffers between a virtual machine and the host, implemented at the paravirtualization level. It manages a ring buffer of commands and events, enabling the transfer of 2D operations.
  2. Transfer architecture. QXL uses shared memory between domain 0 and an unprivileged domain. The hypervisor maps the guest OS video memory pages directly into the address space of the QEMU process, avoiding unnecessary copying.
  3. Command ring buffer. The main structure is a circular buffer storing drawing commands: rectangle fill, block copy, font operations. The guest OS places commands into the buffer, updating the producer pointer. Domain 0 reads using the consumer pointer.
  4. Event management. Asynchronous events (for example, completion of rendering) are transmitted via the hypervisor event channel. The guest OS injects a virtual interrupt through the HYPERVISOR_event_channel_op hypercall, notifying the QXL server that frames are ready.
  5. Synchronization model. QXL operates on a lock-free shared memory principle. To avoid data races, atomic operations with memory barriers are used. The performance formula is throughput = buffer size / command cycle time.
  6. Latency formulas. Command processing latency is Δt = t_guest + t_qxl_processing + t_display. Here t_guest is the time spent writing the command to the buffer and handing it to the QXL server, t_qxl_processing is the QXL server decoding time, and t_display is host rendering time.
  7. Buffer for drawing commands. Each command has a header: opcode, coordinates (x, y, w, h), and color (RGBA). The buffer size is fixed (2 MB by default). When overflow occurs, the guest blocks until space becomes available (wait_event).
  8. Compression mechanism. QXL implements lossless compression for screen images using the LZ77 algorithm with color plane prediction. The compression ratio is K = original size / compressed size. Typical K for text is up to 8:1, for photos 1.2:1.
  9. Refresh rate mode. The guest OS can set a maximum refresh rate in the buffer: f_max = 1 / (T_render + T_serialize). If f_max exceeds host capabilities, QXL automatically reduces the frame rate by skipping commands.
  10. Dirty region handling. QXL tracks changed rectangles using a bitmap. The update region size is S_update = Σ(Δx × Δy) for all dirty blocks. Only changed regions are transmitted, reducing traffic by 60–80%.
  11. Video memory virtualization. Up to 256 MB of guest video memory is allocated via the Xen Grant Tables interface. Physical pages are mapped as pseudo-physical. The mapping formula is guest PFN → host MFN via the grant table.
  12. I/O interface. QXL does not fully emulate PCI. Instead, it implements its own protocol over shared memory with a mini header: magic (4 bytes), length, and CRC32 checksum.
  13. Priority command queues. There are three types of rings: high (cursor, input events), normal (graphics), low (background operations). The QXL scheduler services the high ring every 100 microseconds, normal every 10 milliseconds, low every 50 milliseconds.
  14. Performance statistics. Monitoring parameters include: cmd_latency (average 120–200 microseconds), throughput (up to 1200 commands per millisecond), and command loss ratio = dropped / total. If the threshold of 5% is exceeded, QXL increases the buffer size.
  15. Resynchronization mechanism. When pointers become desynchronized (producer == consumer + 1 with an empty buffer), QXL performs a state reset via a hypercall. Recovery takes T_resync = T_flush + T_reinit, typically 300–500 microseconds.
  16. Integration with the SPICE protocol. QXL is a backend for SPICE (Simple Protocol for Independent Computing Environments). The transformation is QXL command → SPICE message marshaling → transmission to the client. On the client side, reverse unmarshaling is performed.
  17. Cursor memory management. The cursor is stored in a separate 64×64 pixel buffer with hardware overlay. The cursor latency formula is Δt_cursor = t_update_host + t_kernel_event, where t_update_host is the time to update the position on the host (approximately 50 microseconds).
  18. Multicore optimization. Each command ring is bound to a specific CPU core via IRQ affinity: the guest core writes to the ring assigned to CPU n, and the QXL server processes that ring on the same core, reducing cache coherence overhead.
  19. Buffer capacity calculation. The optimal buffer size is B_opt = (N_cmds × S_cmd_max) × 1.5, where 1.5 is a safety margin. For example, with N_cmds = 4096 commands and S_cmd_max = 256 bytes, B_opt = 4096 × 256 × 1.5 = 1,572,864 bytes (1.5 MiB). In practice, 2 MB is used.
  20. Failure states and reset. On a checksum error or timeout exceedance (T_timeout = 100 ms), QXL initiates a full reset: clearing all rings, reinitializing grants, and remapping video RAM. Frame loss during reset is up to 1–2 frames.
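
The producer and consumer pointers from item 3, the blocking-on-overflow behavior from item 7, and the sizing formulas from items 10 and 19 can be sketched together. This is a single-threaded illustration under stated assumptions, not the actual QXL implementation: `CommandRing`, `optimal_buffer_size`, and `update_area` are hypothetical names, and Python cannot model the atomic operations and memory barriers mentioned in item 5.

```python
from typing import List, Optional, Tuple


def optimal_buffer_size(n_cmds: int, s_cmd_max: int, margin: float = 1.5) -> int:
    """B_opt = (N_cmds * S_cmd_max) * margin, in bytes (item 19)."""
    return int(n_cmds * s_cmd_max * margin)


def update_area(dirty_rects: List[Tuple[int, int]]) -> int:
    """S_update = sum of dx * dy over dirty rectangles given as (dx, dy) (item 10)."""
    return sum(dx * dy for dx, dy in dirty_rects)


class CommandRing:
    """Single-producer, single-consumer ring with free-running counters."""

    def __init__(self, slots: int) -> None:
        self.slots = slots
        self.items: List[Optional[bytes]] = [None] * slots
        self.producer = 0  # advanced only by the guest side
        self.consumer = 0  # advanced only by the host side

    def is_full(self) -> bool:
        return self.producer - self.consumer == self.slots

    def is_empty(self) -> bool:
        return self.producer == self.consumer

    def push(self, cmd: bytes) -> bool:
        """Guest side. Returns False when full; a real guest would wait here."""
        if self.is_full():
            return False
        self.items[self.producer % self.slots] = cmd
        self.producer += 1
        return True

    def pop(self) -> Optional[bytes]:
        """Host side. Returns None when there is nothing to consume."""
        if self.is_empty():
            return None
        cmd = self.items[self.consumer % self.slots]
        self.consumer += 1
        return cmd


ring = CommandRing(slots=2)
accepted = [ring.push(b"fill"), ring.push(b"copy"), ring.push(b"text")]
first = ring.pop()                     # frees one slot for the producer
retry = ring.push(b"text")             # now succeeds
size = optimal_buffer_size(4096, 256)  # the worked example from item 19
```

Free-running counters make the full and empty states unambiguous: the ring is empty when the counters are equal and full when they differ by the slot count, so no slot has to be sacrificed to tell the two apart.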

Comparison with similar technologies

  • QXL vs HEVC (High Efficiency Video Coding). QXL is not a video codec, so the comparison is between its command-based transport of synthetic content (screen output, GUI), which relies on caching repeated image data, and HEVC, which is oriented toward natural video. HEVC requires more computational resources for inter-frame prediction but provides better compression for complex natural scenes, where QXL loses efficiency.
  • QXL vs AV1 (AOMedia Video 1). AV1 surpasses QXL in bit efficiency for high-motion content thanks to its advanced prediction and in-loop filtering tools. However, QXL offers lower latency for remote desktops because it forwards drawing commands and changed regions instead of running a full video encoder, which is critical for real-time applications where AV1 encoding is computationally expensive.
  • QXL vs H.264/AVC (Advanced Video Coding). H.264 is an industry standard with broad hardware support, but its macroblock architecture is less adaptive to static pixel artifacts in screen recordings. QXL uses intelligent caching of duplicate regions, reducing bitrate by up to 40% for office applications, yet falls behind H.264 in encoding arbitrary video.
  • QXL vs VP9 (WebM Project). VP9 provides superior compression for 4K content through a combination of 64×64 superblocks and an extensive set of prediction directions. QXL, in contrast, relies on detecting repeated frames in slideshows and terminal sessions, saving energy on mobile systems but being inefficient for high-motion video.
  • QXL vs intra-frame codecs (MJPEG, Motion JPEG 2000). Intra-frame codecs, encoding each frame independently, offer simple random access and error resilience but suffer from low compression. QXL employs inter-frame caching and differential encoding for sequential screen blocks, achieving 3–5 times better compression while maintaining quality on static graphical user interfaces.
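
The caching of duplicate regions mentioned in the comparisons above can be sketched as a content-addressed tile cache. This is an illustrative assumption about how such caching might work, not the cache used by QXL or SPICE; `TileCache` and its message tuples are hypothetical.

```python
import hashlib
from typing import Dict, List, Tuple


class TileCache:
    """Remember tiles already sent so that duplicates become short references."""

    def __init__(self) -> None:
        self._ids: Dict[bytes, int] = {}

    def encode(self, tiles: List[bytes]) -> List[Tuple[str, object]]:
        out: List[Tuple[str, object]] = []
        for tile in tiles:
            digest = hashlib.sha256(tile).digest()
            if digest in self._ids:
                # Seen before: send only the id assigned on first sight.
                out.append(("ref", self._ids[digest]))
            else:
                # New content: assign the next id and send the raw pixels.
                self._ids[digest] = len(self._ids)
                out.append(("raw", tile))
        return out


cache = TileCache()
white = bytes([255]) * 64  # an 8x8 tile of one-byte pixels
grey = bytes([128]) * 64
messages = cache.encode([white, grey, white, white])
```

A receiver that numbers raw tiles in arrival order can resolve each reference without further coordination. Static user interfaces repeat the same tiles often, which is why this approach suits them and does little for high-motion video.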

OS and driver support

QXL is implemented as a paravirtualized graphics device that QEMU exposes to the guest as a PCI device; it is not a virtio device. It provides 2D video output in guest operating systems (Linux, Windows) through command rings in device memory, and it requires a guest driver: a loadable kernel module on Linux, or the driver shipped in guest additions such as SPICE Guest Tools on Windows. It supports dynamic resolution changes and asynchronous commands signalled through I/O ports.

Security

QXL security is based on the isolation provided by QEMU and the KVM hypervisor. The driver runs inside the guest without host privileges, and all video memory operations go through shared memory with explicit mapping and buffer boundary checks in QEMU code, which limits the guest's ability to reach host memory outside the device's regions. Vulnerabilities in command handlers, including integer overflows, have been reported historically; the exposure is reduced through strict command size validation and by disabling unsafe acceleration modes by default.

Logging

Logging in QXL is implemented at two levels. Inside the guest driver, it uses kernel logging functions such as printk (Linux) or DbgPrint (Windows), filtered by a debug mask (QXL_DEBUG). In QEMU, logging uses the QEMU Tracing Framework and the -trace "qxl*" option, capturing surface commands, rectangle update operations, and I/O errors. All command parsing errors and memory allocation failures are written to the hypervisor log (qemu.log or syslog) at the LOG_WARNING and LOG_ERR levels.
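
A QEMU invocation that enables QXL together with its trace events might be assembled as in the sketch below. The helper name `qemu_args`, the disk image name, the memory size, and the port are assumptions made for illustration. Whether trace events actually reach the log file depends on the trace backend the QEMU binary was built with, and `disable-ticketing=on` turns off SPICE authentication, so it is suitable only for local testing.

```python
from typing import List


def qemu_args(disk_image: str, spice_port: int = 5900,
              log_file: str = "qemu.log") -> List[str]:
    """Build an argument list for a guest with a QXL display and QXL tracing."""
    return [
        "qemu-system-x86_64",
        "-enable-kvm",
        "-m", "2048",
        "-drive", f"file={disk_image},format=qcow2",
        "-vga", "qxl",                                        # QXL display device
        "-spice", f"port={spice_port},disable-ticketing=on",  # no auth: test only
        "-trace", "qxl*",                                     # QXL trace events
        "-D", log_file,                                       # log output file
    ]


args = qemu_args("guest.qcow2")  # "guest.qcow2" is a placeholder image name
```

The list can be passed to `subprocess.run(args)` on a host where QEMU is installed. Passing the pattern as a separate list element avoids shell expansion of the asterisk.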

Limitations

The key limitations of QXL are as follows. It supports 2D acceleration only, with no hardware 3D rendering, so 3D workloads fall back to software OpenGL rendering. Maximum resolution is limited to 2560×1600 with 32-bit color on older versions without hardware cursor. Its command-based acceleration depends on the SPICE protocol for image transfer, so QXL loses most of its advantage for remote access over non-SPICE protocols such as VNC or RDP. Additionally, the driver does not support sleep or hibernation mode in Windows without restarting the SPICE session.

History and development

QXL was developed by Qumranet, which Red Hat acquired in 2008, as part of the SolidICE virtualization stack. It was integrated into QEMU and SPICE in 2010–2012 and became the standard video driver for paravirtualized desktops in RHEV and oVirt. After 2017, its development slowed in favor of virtio-gpu with 3D and Vulkan support. QXL nevertheless remains in QEMU as a stable legacy driver for thin clients and industrial systems where minimal computational load matters and modern graphics features are not needed.