VMXNET3 (Paravirtualized network adapter with hardware offloading)

VMXNET3 is a third-generation virtual network adapter developed by VMware. In short, it acts as a high-performance bridge between the virtual machine and the physical network, allowing the guest operating system to directly and efficiently interact with the hypervisor without emulating real hardware.

This adapter is used in corporate virtualization environments on the vSphere platform and VMware Workstation. Its main purpose is to service workloads with intensive network traffic that require minimal latency and high throughput. Such scenarios include database servers, VoIP applications, streaming video systems, web farms, and financial transaction platforms where standard emulated E1000 adapters become a bottleneck.

The most common problem is a missing or outdated vmxnet3 driver in the guest system, which on Windows is normally delivered with VMware Tools. VMXNET3 has no legacy emulation mode to fall back on, so without a suitable driver the guest cannot use the adapter, and with an outdated driver its offloads may be unavailable or unreliable. Connection breaks also occur when jumbo frames are enabled in the guest while the virtual and physical switches in the path do not support them end-to-end. In high-load systems, receive rings can overflow and drop packets when too few processor resources are allocated for network stack processing.

How VMXNET3 works

The operating principle of VMXNET3 is based on a paravirtualized architecture, where the guest operating system is aware of the hypervisor’s presence and interacts with it through an optimized software interface instead of hardware register emulation. The adapter organizes data transfer via ring buffers consisting of descriptor structures for transmit commands, receive commands, and completion events. These buffers are placed in shared memory between the virtual machine and the hypervisor, which eliminates costly data copying.

When sending a packet, the guest OS driver builds descriptors containing pointers to the (possibly fragmented) data buffers and places them in the transmit queue. It then writes to a doorbell register of the virtual device; this write traps to the hypervisor and informs the virtual machine monitor that accumulated descriptors are waiting to be processed. Incoming traffic is delivered through the receive queues, while the Receive Side Scaling mechanism spreads flows across queues whose interrupts are handled by different virtual processors, allowing the load to scale on multiprocessor configurations.
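The transmit path described above can be sketched as a toy descriptor ring. This is a simplified model, not the real vmxnet3 layout: the class names, the fields, and the `doorbell_threshold` parameter are all illustrative assumptions, and the "doorbell" is just a counter standing in for the notification to the hypervisor.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TxDescriptor:
    buffer_addr: int      # guest address of the data buffer (illustrative)
    length: int           # number of bytes in that buffer
    end_of_packet: bool   # True on the last fragment of a packet


class TxRing:
    """Toy single-producer / single-consumer descriptor ring."""

    def __init__(self, size: int, doorbell_threshold: int) -> None:
        self.slots: List[Optional[TxDescriptor]] = [None] * size
        self.size = size
        self.next_to_fill = 0       # producer index, advanced by the guest
        self.next_to_complete = 0   # consumer index, advanced by the host
        self.in_flight = 0          # descriptors posted but not completed
        self.deferred = 0           # descriptors posted since last doorbell
        self.doorbell_threshold = doorbell_threshold
        self.doorbells = 0          # number of notifications sent

    def post(self, desc: TxDescriptor) -> bool:
        """Guest side: queue one descriptor; False means the ring is full."""
        if self.in_flight == self.size:
            return False
        self.slots[self.next_to_fill] = desc
        self.next_to_fill = (self.next_to_fill + 1) % self.size
        self.in_flight += 1
        self.deferred += 1
        if self.deferred >= self.doorbell_threshold:
            self.ring_doorbell()
        return True

    def ring_doorbell(self) -> None:
        """Guest side: one notification covers all deferred descriptors."""
        if self.deferred:
            self.doorbells += 1
            self.deferred = 0

    def complete(self) -> List[TxDescriptor]:
        """Host side: consume every descriptor posted so far, in order."""
        done = []
        while self.in_flight:
            done.append(self.slots[self.next_to_complete])
            self.slots[self.next_to_complete] = None
            self.next_to_complete = (self.next_to_complete + 1) % self.size
            self.in_flight -= 1
        return done


if __name__ == "__main__":
    ring = TxRing(size=8, doorbell_threshold=4)
    for i in range(4):
        ring.post(TxDescriptor(0x1000 * i, 1500, True))
    print("doorbells:", ring.doorbells, "completed:", len(ring.complete()))
```

Four posted packets cost one notification here instead of four, which is the point of deferring the doorbell. A real driver would also flush deferred descriptors when the queue goes idle so that a lone packet is not delayed.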

The key acceleration is achieved thanks to hardware offloading functions implemented at the hypervisor level. TCP and UDP checksum calculation is delegated to the virtual switch, and TCP segmentation offload allows the guest OS to send data blocks up to 64 kilobytes in size, which the hypervisor splits into standard segments, offloading the virtual machine’s central processor. Additionally, the use of a paravirtualized interrupt mechanism reduces the load on the hypervisor scheduler by batch processing events instead of generating an interrupt for each individual packet.
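The work that segmentation and checksum offload take away from the guest can be illustrated with a small sketch. It is a simplification under stated assumptions: the function names are hypothetical, the MSS of 1460 bytes assumes a 1500-byte MTU with plain IPv4 and TCP headers, and the checksum is the generic 16-bit one's-complement sum computed over the payload only, without the TCP pseudo-header that a real stack includes.

```python
from typing import List


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole 16-bit word
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                       # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def segment_payload(payload: bytes, mss: int) -> List[bytes]:
    """Split one large send into MSS-sized pieces, as TSO does for the guest."""
    if mss <= 0:
        raise ValueError("mss must be positive")
    return [payload[i:i + mss] for i in range(0, len(payload), mss)]


if __name__ == "__main__":
    big_send = bytes(64 * 1024)              # one 64 KB block from the guest
    segments = segment_payload(big_send, mss=1460)
    print(len(segments), "segments, last one", len(segments[-1]), "bytes")
    print(hex(internet_checksum(bytes.fromhex("0001f203f4f5f6f7"))))
```

With offload enabled, the guest hands over the single 64 KB block and the loop above effectively runs outside the virtual machine, once per segment, instead of on its vCPUs.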

VMXNET3 functionality

  1. Transmit and Receive Queue Mechanism. The core of the architecture is a ring buffer divided into separate queues: Tx (transmit) and Rx (receive). Each queue consists of a ring of descriptors pointing to buffers in the guest memory. Such segmentation allows asynchronous packet processing, eliminating mutual locks between send and receive data threads.
  2. Transmit Packet Aggregation. To reduce the load on the hypervisor, a batch send mechanism is used. The guest OS driver accumulates several outgoing packets in the ring buffer before notifying the hypervisor with a single doorbell write. This reduces the number of transitions between guest and hypervisor and increases throughput under bursty traffic conditions.
  3. Large Receive Offload Technique. VMXNET3 supports LRO, aggregating several incoming TCP segments belonging to the same flow into one large packet in the hypervisor memory before passing it to the guest system. This radically reduces CPU costs for header processing and decreases the number of data copy operations in the guest protocol stack.
  4. Hardware Computation Offloading. The offloading functionality includes TCP Segmentation Offload and checksum calculation. VMXNET3 lets the guest delegate the splitting of large data blocks into TCP segments and the calculation of IP/TCP/UDP checksums to the virtual device backend on the hypervisor side, which can in turn hand the work to the physical NIC. This saves virtual processor resources.
  5. Interrupt Management. The adapter balances latency against CPU load through interrupt moderation. Under high traffic intensity, interrupts are coalesced and the guest driver processes packets in polling batches instead of taking one interrupt per packet. This prevents receive livelock, a state in which the processor is occupied solely with handling an avalanche of network interrupts.
  6. Receive Side Scaling Queue Support. VMXNET3 supports RSS with up to eight hardware receive queues. Hashing of incoming packets allows their flows to be distributed across different virtual processor cores. In multiprocessor configurations, this scales the network processing load, avoiding bottlenecks on a single core.
  7. Dynamic Queue Activation. Unlike static configurations, the VMXNET3 driver is capable of adaptively changing the number of active receive and transmit queues. The mechanism activates in response to virtual machine configuration changes, optimizing memory consumption during idle times and ensuring maximum parallelism under load.
  8. Advanced Filtering Capabilities. The adapter implements hardware filtering based on MAC addresses and wake-up patterns. Multicast and broadcast filtering mode allows discarding unnecessary frames at the hypervisor switch port level, preventing them from entering the guest OS stack and waking the virtual machine unnecessarily.
  9. VLAN Trunk Support. VMXNET3 supports IEEE 802.1Q tagging. The driver can process tagged frames and insert or strip VLAN headers. This allows the virtual machine to participate in trunk connections, servicing traffic from multiple virtual local area networks without intermediate routing.
  10. Single Root I/O Virtualization Technology. VMXNET3 is not itself an SR-IOV virtual function, but it can take part in a hybrid pass-through model: the paravirtualized interface handles control traffic, while the main data flow is passed directly to the physical device. This model aims to preserve migration capability while delivering near-native performance.
  11. Jumbo Frame Mechanism. VMXNET3 supports Jumbo Frames with an MTU size up to 9000 bytes. Increasing the payload in a single frame reduces the ratio of service headers to data. This is particularly important for storage environments and high-performance computing, where maximum throughput is required when copying large blocks.
  12. Data Integrity and Hashing. In configurations supporting NVGRE and VXLAN, the adapter computes hashes over encapsulated traffic. VMXNET3 is capable of looking inside encapsulated headers to distribute load correctly across RSS queues, which is relevant in software-defined networks and overlay tunnel environments.
  13. Power Management. The driver supports the Wake-on-LAN specification for bringing a virtual machine out of a suspended state. The network interface inspects incoming traffic for a magic packet or configured patterns and raises a power event upon detection, without involving the guest's virtual processors.
  14. Memory Registration Semantics. Instead of copying data on every transfer, VMXNET3 uses memory region registration. Guest buffers are mapped into the hypervisor address space, so the hypervisor can move data between those buffers and the physical network adapter without intermediate copies.
  15. Device Pass-through Compatibility. The adapter is designed for seamless replacement. When migrating from an emulated device or direct PCI passthrough to VMXNET3, the MAC address and port state are preserved. The guest system perceives the change as a simple driver replacement and does not need its network profiles reconfigured.
  16. Frame Ordering Guarantees. VMXNET3 supports strict transmission ordering within a single flow. The adapter guarantees that frames leave in exactly the order in which their descriptors were placed in the Tx queue, which matters for protocols sensitive to packet reordering.
  17. Hypervisor-Level Security. The NetQueue function within VMXNET3 interacts with the distributed virtual switch for memory pool isolation. Each queue is processed in a separate hypervisor kernel module context, which prevents data leakage between different virtual machines through shared buffer structures.
  18. Guest Tagging Processing. VMXNET3 supports Guest VLAN Tagging, allowing the operating system inside the virtual machine to manage tags itself. The port group must be configured to pass tagged frames through (VLAN ID 4095 on a standard vSwitch, or a VLAN trunk range on a distributed switch), and the virtual switch must preserve the CoS priorities set by the application in the guest OS.
  19. Diagnostics and Monitoring. The driver exports detailed statistics through guest tools and the VMware API. Counters are available for packets dropped due to ring buffer shortage, checksum errors, and overflow events. These metrics allow precise diagnosis of packet loss at the virtual hardware level without analyzing traffic inside the guest.
  20. Seamless State Migration. During the vMotion process, VMXNET3 preserves the state of all ring buffers and queue pointers. After migration to the target host, the adapter continues processing from the last acknowledged transaction. Because dirty memory pages are copied incrementally beforehand, the interruption in network connectivity is brief, typically well under a second.
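The Receive Side Scaling item in the list above relies on hashing each flow to a queue. The sketch below shows one common way to do that, a Toeplitz hash over the IPv4/TCP 4-tuple followed by an indirection table lookup. It is an illustration only: the key is a widely published sample key rather than anything VMXNET3-specific, the input ordering (source address, destination address, source port, destination port) follows the usual RSS convention, and the function names and the eight-queue table are assumptions.

```python
import socket
import struct
from typing import Sequence

# Sample 40-byte key; a real driver would use whatever key it was given.
SAMPLE_KEY = bytes.fromhex(
    "6d5a56da255b0ec24167253d43a38fb0"
    "d0ca2bcbae7b30b477cb2da38030f20c"
    "6a42b73bbeac01fa"
)


def toeplitz_hash(key: bytes, data: bytes) -> int:
    """XOR together the 32-bit key window at every set bit of the input."""
    key_bits = len(key) * 8
    if len(data) * 8 > key_bits - 32:
        raise ValueError("input too long for this key")
    key_int = int.from_bytes(key, "big")
    result = 0
    for i, byte in enumerate(data):
        for bit in range(8):
            if byte & (0x80 >> bit):
                shift = key_bits - 32 - (i * 8 + bit)
                result ^= (key_int >> shift) & 0xFFFFFFFF
    return result


def flow_input(src_ip: str, dst_ip: str, src_port: int, dst_port: int) -> bytes:
    """Build the 12-byte hash input for an IPv4/TCP flow."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!HH", src_port, dst_port))


def select_queue(hash_value: int, indirection_table: Sequence[int]) -> int:
    """Map a hash to a receive queue through the indirection table."""
    return indirection_table[hash_value % len(indirection_table)]


if __name__ == "__main__":
    table = [i % 8 for i in range(128)]      # 128 entries spread over 8 queues
    flow = flow_input("10.0.0.1", "10.0.0.2", 40000, 443)
    h = toeplitz_hash(SAMPLE_KEY, flow)
    print(f"hash=0x{h:08x} queue={select_queue(h, table)}")
```

Because the hash depends only on the flow tuple, every packet of one connection lands on the same queue, which keeps per-flow ordering intact while different flows spread across vCPUs.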

Comparisons

  • VMXNET3 vs E1000E. VMXNET3 is a paravirtualized network adapter that needs a dedicated vmxnet3 driver (delivered with VMware Tools on Windows), while E1000E emulates real Intel 82574 hardware and works with the Intel driver already bundled with most guest operating systems. VMXNET3 provides significantly higher throughput and lower CPU load by minimizing emulation, whereas E1000E sacrifices performance for maximum compatibility with operating systems that have no additional components installed.
  • VMXNET3 vs SR-IOV. Unlike passing SR-IOV virtual functions directly bypassing the hypervisor, VMXNET3 operates through the vSwitch software switch, retaining full support for vMotion and snapshots. VMXNET3 guarantees the operation of NSX features, micro-segmentation, and the distributed firewall, which are unavailable when using SR-IOV passthrough access, but it is inferior to the latter in maximum speed and latency determinism under peak loads.
  • VMXNET3 vs VMXNET2. The third generation of adapters clearly surpasses its predecessor thanks to RSS with multi-queue traffic processing, MSI/MSI-X interrupts, and IPv6 checksum and segmentation offloads, which VMXNET2 lacked. Combined with TCP Segmentation Offload, Large Receive Offload, and jumbo frame support, and with NetQueue on the host side, this makes VMXNET3 the standard choice for environments running at 10GbE and above.
  • VMXNET3 vs PVRDMA. VMXNET3 services the standard TCP/IP stack through the vmxnet3 guest driver, whereas PVRDMA implements remote direct memory access via RDMA for ultra-low latency. Although PVRDMA provides microsecond latency comparable to physical InfiniBand, VMXNET3 remains a universal interface for any traffic, not requiring specific hardware support or complex application code rework for the OFED interface.
  • VMXNET3 vs vmxnet. The original vmxnet adapter (also reachable through the Flexible adapter type) was VMware’s first step into paravirtualization but had serious limitations in the number of queues, lacked jumbo frame support, and had unstable LRO operation. VMXNET3 is completely redesigned architecturally: it presents a modern device model with multiple queues and advanced interrupt coalescing mechanisms, eliminating performance degradation when balancing traffic between guest system cores.

OS and driver support

VMXNET3 is a paravirtualized device, so the guest OS needs a dedicated vmxnet3 driver rather than a driver for emulated physical hardware. The driver exchanges packet descriptors with the hypervisor through ring buffers in memory shared between host and guest, which keeps latency low.

On Linux, the driver has been part of the kernel since version 2.6.32. On Windows, it is normally installed with VMware Tools rather than shipped on the installation media, and recent Windows Server releases can also receive it through Windows Update. For FreeBSD, Solaris, and macOS guests, support comes from drivers bundled with the OS, from open-vm-tools, or from separate driver packages, depending on the release.

During initialization, the driver and the device negotiate the device revision and the available offloads, such as TSO, LRO, RSS, and VLAN filtering, through device registers and a shared configuration area.

Security

Isolation in VMXNET3 rests on the hypervisor rather than on the guest driver. Checksum offload and TCP segmentation offload are carried out per virtual port on the hypervisor side, so neighboring virtual machines have no opportunity to interfere with packet integrity while a packet is being processed. Guest memory is kept separate by the hypervisor's memory virtualization, and where physical devices perform DMA, an IOMMU (Intel VT-d or AMD-Vi) confines each device to its assigned memory domain to prevent DMA attacks between VMs.

Multi-queue processing (RSS in the guest, NetQueue on the host) balances traffic across several queues, which makes it harder to mount a DoS attack by saturating a single processor core of the guest system. VLAN tag filtering is performed on the hypervisor side before a packet is placed into the guest ring buffer, so frames from unauthorized broadcast domains are dropped without any load on the virtual machine CPU.
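The VLAN filtering step can be illustrated with a short sketch that reads the 802.1Q tag from a raw Ethernet frame and decides whether to admit it. The tag layout (TPID 0x8100 after the two MAC addresses, then a 16-bit TCI holding a 3-bit priority and a 12-bit VLAN ID) comes from the standard; the function names and the policy of dropping untagged frames are assumptions made for the example.

```python
import struct
from typing import Optional, Set, Tuple

TPID_8021Q = 0x8100


def build_tagged_frame(dst: bytes, src: bytes, vlan_id: int, priority: int,
                       ethertype: int, payload: bytes) -> bytes:
    """Assemble an Ethernet frame carrying one 802.1Q tag."""
    tci = ((priority & 0x7) << 13) | (vlan_id & 0x0FFF)
    return (dst + src + struct.pack("!HH", TPID_8021Q, tci)
            + struct.pack("!H", ethertype) + payload)


def parse_vlan(frame: bytes) -> Optional[Tuple[int, int]]:
    """Return (vlan_id, priority) for a tagged frame, or None if untagged."""
    if len(frame) < 18:
        return None
    tpid, tci = struct.unpack("!HH", frame[12:16])
    if tpid != TPID_8021Q:
        return None
    return tci & 0x0FFF, tci >> 13


def admit(frame: bytes, allowed_vlans: Set[int]) -> bool:
    """Toy trunk policy: pass only frames tagged with an allowed VLAN ID."""
    tag = parse_vlan(frame)
    return tag is not None and tag[0] in allowed_vlans


if __name__ == "__main__":
    mac_a, mac_b = bytes.fromhex("005056000001"), bytes.fromhex("005056000002")
    frame = build_tagged_frame(mac_b, mac_a, 100, 5, 0x0800, b"payload")
    print(parse_vlan(frame), admit(frame, {100, 200}), admit(frame, {300}))
```

Running the check before the frame is copied toward the guest is what keeps unwanted broadcast domains from costing the virtual machine any CPU time.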

Logging

VMXNET3 events are recorded at three levels. In the guest OS, the driver exposes statistics on packets dropped because of Rx ring overflow, checksum errors, and link state changes through ethtool counters on Linux and Performance Monitor on Windows. On the hypervisor, adapter activation, DMA mapping failures, and vMotion migrations are logged in the vmkernel.log files, where the relevant entries carry NetPort and Vmxnet3 tags. On the vSphere Distributed Switch, NetFlow v5/v9 export provides flow telemetry including source and destination IP addresses, ports, and the session termination reason.

The verbosity of the guest driver can be raised without rebooting the VM by increasing the debug level of the kernel module dynamically. Host-side NIC diagnostics are available through the esxcli network namespace.
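Guest-side counters are easiest to work with once they are parsed. The sketch below scans text in the general `name: value` shape that `ethtool -S` prints and reports non-zero counters that look like drops or errors. The sample text is illustrative rather than captured from a real system, and the counter names in it are modeled on the Linux driver but should be treated as assumptions, since they vary between driver versions.

```python
from typing import Dict, Tuple

Counters = Dict[Tuple[str, str], int]

# Illustrative sample only; real output depends on the driver version.
SAMPLE = """\
NIC statistics:
     Tx Queue#: 0
       pkts tx err: 0
       drv dropped tx total: 0
     Rx Queue#: 0
       pkts rx OOB: 12
       pkts rx err: 0
       drv dropped rx total: 3
       rx buf alloc fail: 0
"""

KEYWORDS = ("drop", "err", "oob", "fail")


def parse_counters(text: str) -> Counters:
    """Turn 'name: value' lines into {(queue, name): value}."""
    counters: Counters = {}
    queue = "global"
    for line in text.splitlines():
        name, sep, value = line.strip().rpartition(":")
        value = value.strip()
        if not sep or not value.isdigit():
            continue                          # header or non-numeric line
        if name.endswith("Queue#"):
            queue = f"{name.split()[0]}{value}"   # e.g. "Rx0"
            continue
        counters[(queue, name)] = int(value)
    return counters


def problem_counters(counters: Counters) -> Counters:
    """Keep non-zero counters whose name suggests loss or errors."""
    return {key: value for key, value in counters.items()
            if value > 0 and any(word in key[1].lower() for word in KEYWORDS)}


if __name__ == "__main__":
    for (queue, name), value in problem_counters(parse_counters(SAMPLE)).items():
        print(f"{queue}: {name} = {value}")
```

A steadily growing out-of-buffer or dropped counter on one Rx queue is the usual sign that the ring is too small or that the vCPU serving that queue is saturated.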

Limitations

The main limits of VMXNET3 follow from the architectural boundaries of paravirtualization. The throughput of a single adapter is limited to roughly 40 Gbps in practice, bounded by the maximum ring size of 4096 descriptors and by the interrupt coalescing interval, which is adjusted adaptively between 0 and 100 microseconds. VMXNET3 is a software device and is not itself an SR-IOV or PCI passthrough interface, so direct access to InfiniBand or RDMA hardware is not possible through it and requires a different adapter type. A virtual machine can have at most 10 virtual network adapters because of the PCI slot limit in the virtual chipset.

Jumbo Frames of up to 9000 bytes work only when the larger MTU is configured consistently on the vSwitch, the port group, the guest driver, and the physical network path. LRO is sometimes disabled in such setups, but this is a troubleshooting measure rather than a requirement.
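Whether the jumbo frame configuration effort pays off can be estimated with simple arithmetic. The sketch below compares wire efficiency and frame counts for MTU 1500 and MTU 9000. It assumes plain IPv4 and TCP headers of 20 bytes each with no options, and standard Ethernet framing overhead (14-byte header, 4-byte FCS, 8-byte preamble, 12-byte inter-frame gap); the function names are hypothetical.

```python
ETHERNET_OVERHEAD = 14 + 4 + 8 + 12   # header + FCS + preamble + inter-frame gap
IP_TCP_HEADERS = 20 + 20              # IPv4 + TCP, no options


def wire_efficiency(mtu: int) -> float:
    """Fraction of bytes on the wire that carry TCP payload."""
    return (mtu - IP_TCP_HEADERS) / (mtu + ETHERNET_OVERHEAD)


def frames_needed(total_bytes: int, mtu: int) -> int:
    """Number of full-size frames required to carry total_bytes of payload."""
    payload_per_frame = mtu - IP_TCP_HEADERS
    return -(-total_bytes // payload_per_frame)   # ceiling division


if __name__ == "__main__":
    for mtu in (1500, 9000):
        print(f"MTU {mtu}: efficiency {wire_efficiency(mtu):.4f}, "
              f"{frames_needed(1024 * 1024, mtu)} frames per MiB")
```

The efficiency gain is only a few percentage points; the larger benefit is the roughly sixfold drop in frames, and therefore in per-packet processing, for the same amount of data.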

History and development

The VMXNET3 network adapter appeared in VMware vSphere 4.0 in 2009 as an evolution of VMXNET2. It replaced its predecessor’s single queue with a multi-queue RX/TX architecture using MSI-X interrupts, which allowed performance to scale in proportion to the number of vCPUs.

vSphere 5.5 introduced the hardware flow distribution mechanism NetQueue for distributed switches and IPv6 TSO/checksum offload support. The vSphere 6.0 release brought adaptive interrupt coalescing and Large Packet Aggregation. Starting from vSphere 6.7, full integration with Virtual Hardware version 14 was implemented, adding support for Guest VLAN Tagging and multicast filtering on the adapter side with the ability to process up to 128 unique MAC addresses for network virtual functions.