OVS (Programmable multi-layer virtual switch)

OVS (Open vSwitch) is a software network switch that runs inside hypervisors and operating systems. It creates virtual networks, isolates traffic between virtual machines or containers, and connects them to the physical network. Unlike a simple Linux bridge, OVS supports complex rules, tunneling, and integration with SDN controllers via the OpenFlow protocol.

Open vSwitch is used in virtualization platforms such as XenServer and KVM, and serves as the datapath for several Kubernetes networking plugins, such as OVN-Kubernetes and Antrea. It is a key element in software-defined networks, allowing traffic flows to be managed programmatically. Cloud providers use OVS to create isolated virtual private clouds and to share the same physical hardware among thousands of tenants.

The main operational challenge is high CPU load when packets are processed in userspace without hardware acceleration. Tunneling is another common source of performance problems: VXLAN or GRE encapsulation adds outer headers, and if the underlay MTU is not raised to compensate, packets are fragmented or dropped. Misconfigured OpenFlow rules, or an OVSDB schema version mismatch after an upgrade, can cause partial or complete loss of network connectivity in production environments.

OVS operating principle

OVS separates a slow control path from a fast data path. The userspace daemon ovs-vswitchd makes packet forwarding decisions based on rules received from an SDN controller via OpenFlow or configured manually. When a packet arrives for which the kernel flow cache has no matching entry, the kernel module passes it to userspace. There, the daemon analyzes the headers, matches them against the OpenFlow tables, and determines the action: forward to a specific port, modify the header, or drop the packet. The result is then installed as a flow entry in the cache of the openvswitch.ko module; in current versions this is a wildcarded megaflow rather than a strict exact-match rule. Subsequent packets that match the entry are handled directly in the kernel by a hash lookup, avoiding the costly round trip to userspace.

The OVSDB database stores bridge and port configuration, so changes can be made without restarting the switch. To reach the physical network, a physical NIC is attached to an OVS bridge as a port, and patch ports link bridges to each other. For line-rate processing, OVS can alternatively run its datapath in userspace on top of DPDK, a kernel-bypass packet-processing framework.
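The split between the slow path and the cached fast path can be illustrated with a small simulation. This is a minimal sketch, not OVS code: the names Rule and ToySwitch, the dictionary-based packets, and the choice to key the cache on the header fields the slow path consulted are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    """Hypothetical OpenFlow-like rule: match on a subset of header fields."""
    match: dict
    action: str
    priority: int = 0


@dataclass
class ToySwitch:
    rules: list
    # Stands in for the kernel flow cache:
    # {consulted field names -> {field values -> action}}
    cache: dict = field(default_factory=dict)
    upcalls: int = 0  # number of packets that needed the slow path

    def slow_path(self, pkt):
        """Userspace lookup: highest-priority matching rule wins, default is drop."""
        consulted = set()
        action = "drop"
        for rule in sorted(self.rules, key=lambda r: -r.priority):
            consulted.update(rule.match)  # remember which fields were examined
            if all(pkt.get(k) == v for k, v in rule.match.items()):
                action = rule.action
                break
        return action, tuple(sorted(consulted))

    def receive(self, pkt):
        # Fast path: one hash lookup per cached field combination.
        for fields, table in self.cache.items():
            key = tuple(pkt.get(f) for f in fields)
            if key in table:
                return table[key]
        # Cache miss: ask the slow path and install the result.
        self.upcalls += 1
        action, fields = self.slow_path(pkt)
        key = tuple(pkt.get(f) for f in fields)
        self.cache.setdefault(fields, {})[key] = action
        return action
```

In this sketch, two packets that differ only in a field no rule looked at (for example the TCP port) share one cache entry, so only the first of them causes an upcall.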

OVS functionality

  1. Packet pipeline and caching layers. OVS implements a multi-level flow caching architecture. The first level resides in kernel space (fast path), where packets matching megaflow cache entries are processed. On a cache miss, the packet is passed to userspace (slow path), where the ovs-vswitchd daemon computes the full set of actions through a sequential pipeline of OpenFlow tables.
  2. Exact match and megaflows. Instead of caching one entry per exact combination of header fields (a microflow), OVS uses a megaflow mechanism. Megaflows are wildcarded entries that aggregate many microflows with common behavior. The mask is generated dynamically from the header bits the classifier actually read during slow path processing, which drastically reduces the number of unique entries in the kernel cache.
  3. Tuple space classification. For OpenFlow table lookups in userspace, a tuple space search classifier is used. Installed flows are grouped into hash tables (tuples), one per distinct combination of masks. For an incoming packet, the classifier applies each tuple's mask to the headers and performs one hash lookup per tuple, keeping the highest-priority hit. This approach provides fast lookups for partially masked matches and avoids a full linear scan of all rules.
  4. Fast path kernel. The kernel module openvswitch.ko implements a simplified processing engine. It does not interpret the full OpenFlow instruction set but instead executes a precompiled action block associated with a megaflow. Actions include setting header fields, modifying VLANs, and critically, multi-stage packet forwarding without returning to userspace for each hop.
  5. Recirculation and re-injection engine. Some logic depends on state that becomes available only after an earlier action has run, for example connection state returned by ct(). For these cases OVS uses recirculation: after the first set of actions, the packet is re-injected at the start of the datapath lookup with a new recirculation ID (recirc_id) that distinguishes the second pass from the first. This allows later decisions to depend on the result of a previous modification, such as routing after DNAT.
  6. Composite actions and deferred execution. Actions in OVS are not simply a list of instructions. The system supports group tables and action buckets that implement complex scenarios. During translation, ovs-vswitchd flattens them into a compact list of datapath actions that the kernel executes. OpenFlow action sets add deferred execution: actions written in early tables are applied only at the end of the pipeline, so an output port chosen early can still be overridden by a later table.
  7. Netlink framework for upcalls. Communication between the slow and fast paths occurs via the Netlink protocol family (Generic Netlink). When the kernel module encounters a missing megaflow, it serializes a flow key and a truncated copy of the packet into a Netlink message. Netlink attributes describe the packet buffer state in detail, which is critical for correct handling of fragmented traffic in userspace.
  8. Fragment handling and reassembly. Only the first IP fragment carries the transport layer (L4) header, so fragments cannot be classified on ports like ordinary packets. By default OVS marks fragments in the flow key and does not match on their L4 fields. When packets are sent through connection tracking, the kernel reassembles the fragments first, holding out-of-order fragments in a reassembly buffer. Buffer management uses timeouts and memory limits to prevent resource exhaustion attacks; after reassembly, the complete packet is sent for classification.
  9. Tunneling and metadata. OVS abstracts tunnel protocols (VXLAN, GRE, Geneve) via virtual ports. When an encapsulated packet is received, the outer header information (source and destination IP addresses) and the tunnel ID are converted into metadata attached to the inner packet. This metadata is available throughout the OpenFlow pipeline as match fields such as tun_id, tun_src, and tun_dst, allowing policies based on overlay network topology to be applied.
  10. Geneve variable options handling. Unlike VXLAN, whose header has a fixed layout, Geneve carries variable-length TLV (Type-Length-Value) options, and OVS supports them directly. The datapath parser extracts the options from the Geneve header, and a TLV mapping table binds them to tunnel metadata fields (tun_metadata). This allows matching and acting not only on the base VNI but also on arbitrary option byte strings, without ad hoc packet parsing in userspace.
  11. Stateful connection tracking via Conntrack. Integration with Netfilter conntrack allows OVS to act as a stateful firewall. The ct() action passes the packet to the Linux connection tracker, which determines the connection state (for example new, established, or related). The state flags (ct_state), labels (ct_label), and marks (ct_mark) then become match fields for subsequent tables, so forwarding decisions can depend on connection status.
  12. Transactional OVSDB database. Bridge, port, and tunnel configuration is managed not via configuration files but through the OVSDB database (RFC 7047 protocol). The ovsdb-server daemon stores the data schema and services JSON-RPC requests. Clients (such as ovs-vsctl) and SDN controllers perform atomic transactions, guaranteeing configuration integrity under concurrent access from multiple management agents.
  13. DPDK Poll Mode Drivers (PMD). To achieve line rate on x86 servers, OVS-DPDK uses Poll Mode Drivers. In this model, dedicated userspace threads constantly poll network card queues, bypassing the kernel network stack and its interrupt handling. PMD threads are pinned to isolated physical cores, and packets are processed in batches, minimizing latency and CPU cache misses during high-speed forwarding.
  14. Deterministic trace simulation. The ovs-appctl ofproto/trace tool allows simulating the processing of a hypothetical packet. This utility runs a user-defined header through the full userspace OpenFlow table pipeline, showing the step-by-step decision-making process, matched entries, final actions, and drop points. This capability exists without sending real traffic, which is critical for debugging.
  15. Hardware offload. OVS supports partially or fully offloading flow processing to compatible NICs (e.g., ConnectX) through the kernel's TC Flower interface. In this mode, megaflow entries are translated into Linux Traffic Control rules and programmed directly into the network adapter chip. This frees the host CPU from processing long-lived, high-volume flows.
  16. Connection tracking offload. Beyond basic flow offload, OVS can delegate connection tracking work to supporting NIC hardware. Once a connection has been committed with ct(commit), the driver can program the chip to apply the connection's marks, labels, and NAT rewrites to subsequent packets of that flow without CPU involvement. This lets stateful service chains run at line rate.
  17. Load balancing via select groups. OVS group tables implement complex logical primitives. The select group type uses symmetric L3/L4 hashing for load balancing, so both directions of a connection map to the same bucket. Unlike simple ECMP, buckets can carry weights, so traffic is distributed proportionally among next hops. Failover on loss of a next hop is the job of a different group type, fast failover.
  18. Trunk ports and traffic isolation. VLAN handling goes beyond adding or stripping a single tag: the push_vlan and pop_vlan actions can work with both service (outer) and customer (inner) tags. At the port level, OVS isolates traffic through the trunks setting, which admits only an allowlist of VLAN IDs on a trunk port. This limits VLAN hopping attacks such as double tagging at the virtual switch, without external filtering.
  19. OpenFlow Bundle management. For atomic application of a group of flow changes, OVS implements Bundle operations (OpenFlow 1.4). Commands inside a bundle accumulate in a draft, not affecting active traffic, until the transaction is committed with the BUNDLE_COMMIT command. This prevents network race conditions where some old configuration rules conflict with newly applied ones, creating micro-loops.
  20. Policy-based mirroring. Port mirroring in OVS goes beyond simple port copying. Mirrors can select traffic by source port, destination port, or VLAN, and OpenFlow rules can pick packets on arbitrary match criteria and send a copy to an additional output. For example, only traffic with a specific DSCP value that arrived on a particular port can be replicated and sent into a GRE tunnel, enabling flexible out-of-band analysis systems.
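The tuple space search from item 3 can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the OVS classifier: the class name TupleSpaceClassifier is hypothetical, and masks here work on whole fields, whereas the real classifier masks individual bits (for example IP prefixes).

```python
from collections import defaultdict


class TupleSpaceClassifier:
    """Toy tuple space search: one hash table per distinct set of matched fields."""

    def __init__(self):
        # mask (sorted tuple of field names) -> {field values -> (priority, action)}
        self.tuples = defaultdict(dict)

    def add_flow(self, match, priority, action):
        """Place a flow into the tuple that corresponds to its set of fields."""
        mask = tuple(sorted(match))
        key = tuple(match[f] for f in mask)
        current = self.tuples[mask].get(key)
        if current is None or priority > current[0]:
            self.tuples[mask][key] = (priority, action)

    def lookup(self, pkt):
        """One hash lookup per tuple; the highest-priority hit wins."""
        best = None
        for mask, table in self.tuples.items():
            key = tuple(pkt.get(f) for f in mask)  # apply the tuple's mask
            hit = table.get(key)
            if hit is not None and (best is None or hit[0] > best[0]):
                best = hit
        return best[1] if best else None
```

The cost of a lookup grows with the number of distinct masks, not with the number of installed flows, which is why this structure avoids a linear scan when many flows share a few mask combinations.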

Comparisons

  • OVS vs Linux Bridge. The key difference lies in traffic processing architecture. The Linux Bridge uses the standard kernel network stack for simple L2 forwarding based on MAC addresses, providing minimal overhead and high performance for simple topologies. Open vSwitch, in contrast, uses its own kernel datapath driven by a userspace daemon (ovs-vswitchd) and flow tables, enabling programmable packet processing at the cost of somewhat higher resource consumption.
  • OVS vs SR-IOV. The comparison essentially comes down to the flexibility versus performance dilemma and the separation of responsibilities. SR-IOV (Single Root I/O Virtualization) passes virtual functions of a physical adapter directly to virtual machines, almost completely excluding the hypervisor from the data path, which guarantees near-line-rate throughput and minimal latency. The price is the complexity of live VM migration and the inability to apply advanced network policies at the hypervisor level that are easily available in OVS.
  • OVS vs OVN (Open Virtual Network). Open Virtual Network can be seen as an evolutionary add-on that transforms OVS management architecture from local to distributed. While plain OVS deployments often rely on per-host agents and external services such as the Neutron L3 and DHCP agents for routing and address assignment, OVN offers native support for logical routers, distributed DHCP, and metadata processing on each compute node, with configuration distributed via the OVSDB protocol. This gives OVN an advantage in control plane fault tolerance and scalability without replacing the datapath.
  • OVS vs proprietary distributed switches (using Cisco Nexus 1000V as an example). Open vSwitch differs ideologically in its open architecture and lack of strict vendor ecosystem lock-in. Products like Cisco Nexus 1000V historically offered deep integration with network equipment and a CLI environment familiar to administrators, but required a paid license and their own controller (VSM). OVS, however, was designed from the start as an open standard for interacting with third-party SDN controllers via OpenFlow, making it a more versatile building block.
  • OVS vs Tungsten Fabric (OpenContrail). The difference between OVS and Tungsten Fabric lies in their networking philosophy and functional scope: OVS provides the fundamental switching layer on the host, whereas Tungsten Fabric is a comprehensive SDN platform that builds overlay networks out of the box using MPLS-over-GRE/UDP or VXLAN encapsulation with a BGP-based control plane. Unlike OVS, which requires external mechanisms for advanced routing, Tungsten Fabric deploys a distributed vRouter on each node with rich service chaining and a built-in analytics system, simplifying federated deployments.

OS and driver support

Open vSwitch was originally developed for Linux and integrates deeply with the kernel network stack, using the openvswitch.ko module for high-speed kernel-space packet processing. It also runs on Windows, where a Hyper-V virtual switch extension serves as the datapath alongside the userspace daemons. To achieve near-wire performance, OVS offers a userspace datapath (datapath type netdev) with DPDK-backed ports of interface type dpdk, which processes packets while bypassing the kernel. It also supports AF_XDP, a socket type built on XDP and eBPF that is faster than standard AF_PACKET because it lets a userspace driver exchange packets with the physical network adapter almost directly, bypassing many kernel subsystems.
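As an illustration of how these port types are configured, the helper below assembles ovs-vsctl command lines for a userspace bridge with either a DPDK or an AF_XDP port. It is a sketch under assumptions: the function name, the bridge and port names, and the PCI address are placeholders, and the command forms follow the commonly documented ovs-vsctl usage, which should be checked against the installed OVS version. The helper only builds argument lists and does not execute anything.

```python
def userspace_bridge_commands(bridge, port, port_type, pci_address=None):
    """Return ovs-vsctl argument lists for a netdev bridge with one port.

    port_type is "dpdk" or "afxdp"; pci_address is required for "dpdk".
    """
    if port_type not in ("dpdk", "afxdp"):
        raise ValueError(f"unsupported port type: {port_type}")

    # The userspace datapath is selected per bridge via datapath_type=netdev.
    commands = [[
        "ovs-vsctl", "add-br", bridge,
        "--", "set", "bridge", bridge, "datapath_type=netdev",
    ]]

    add_port = [
        "ovs-vsctl", "add-port", bridge, port,
        "--", "set", "Interface", port, f"type={port_type}",
    ]
    if port_type == "dpdk":
        if pci_address is None:
            raise ValueError("a DPDK port needs the PCI address of the NIC")
        # DPDK ports are bound to a device through options:dpdk-devargs.
        add_port.append(f"options:dpdk-devargs={pci_address}")
    commands.append(add_port)
    return commands
```

The resulting lists can be passed to subprocess.run one at a time on a host where OVS was built with the corresponding support.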

Security

Security in OVS rests on two mechanisms. Control traffic (connections to the controller and to the OVSDB database) is authenticated and encrypted with SSL/TLS; the ovs-pki utility creates a certificate authority for signing switch and controller keys. Tunnel traffic between nodes can be encrypted through IPsec integration, where hosts authenticate using X.509 certificates bound to chassis names. In the OVN context, Role-Based Access Control (RBAC) is additionally applied: it restricts hypervisor rights in the southbound database so that a compromised node can modify only its own records, which prevents destruction of the entire network configuration.

Logging

The logging system in OVS is modular and controlled at runtime via the ovs-appctl utility, which talks to the running ovs-vswitchd daemon through a Unix socket. The administrator can dynamically change the verbosity level for a specific module (e.g., vswitchd, netdev_afxdp, reconnect) and for a specific output destination (file, console, syslog). For example, ovs-appctl vlog/set vswitchd:file:dbg enables detailed debugging to the log file, and ovs-appctl vlog/reopen reopens the log file after rotation.
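The module:destination:level form of the vlog/set argument is easy to mistype, so a small wrapper can validate it before the command is run. This is a minimal sketch: the function name is hypothetical, and the lists of destinations and levels are assumed from the ovs-appctl manual and should be checked against the installed version. The function only builds the argument list.

```python
# Destinations and levels as assumed from the ovs-appctl documentation.
VLOG_DESTINATIONS = ("console", "syslog", "file")
VLOG_LEVELS = ("off", "emer", "err", "warn", "info", "dbg")


def vlog_set_command(module, destination, level):
    """Build the argument list for: ovs-appctl vlog/set module:destination:level."""
    if destination not in VLOG_DESTINATIONS:
        raise ValueError(f"unknown destination: {destination}")
    if level not in VLOG_LEVELS:
        raise ValueError(f"unknown level: {level}")
    return ["ovs-appctl", "vlog/set", f"{module}:{destination}:{level}"]
```

Called as vlog_set_command("vswitchd", "file", "dbg"), it reproduces the debugging command quoted in the text.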

Limitations

Despite high performance, OVS has scaling limits. When the number of OpenFlow flows on the br-int bridge grows to 70–130 thousand (typical for large OpenStack installations), the ovs-vswitchd daemon may hit CPU limits, causing 100% core load and packet loss; this often requires tuning the flow revalidation parameters max-revalidator and max-idle. There are also logical limits on resubmit processing: translation of a packet is aborted with a warning in the log when resubmit actions nest deeper than 64 levels or exceed 4096 in total. This can occur when processing a flood of broadcast or multicast requests in networks with a large number of virtual routers.
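The resubmit guard can be pictured as a counter that is checked on every hop between tables. The sketch below is a toy model and not the OVS translation code: the function name, the table representation, and the returned drop reason are hypothetical, and only the 4096 budget is taken from the text.

```python
def translate(tables, start_table=0, max_resubmits=4096):
    """Follow resubmit chains until a final action or until the budget runs out.

    tables maps a table id to an action tuple such as ("resubmit", 1),
    ("output", 5) or ("drop",). Returns (final_action, resubmits_used).
    """
    table, resubmits = start_table, 0
    while True:
        action = tables[table]
        if action[0] != "resubmit":
            return action, resubmits
        resubmits += 1
        if resubmits > max_resubmits:
            # A real switch would log a warning here and stop processing.
            return ("drop", "resubmit limit exceeded"), resubmits
        table = action[1]
```

A pipeline in which two tables resubmit to each other never reaches a final action, so without the counter the translation would loop forever; with it, the packet is dropped once the budget is spent.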

History and development

The project was created in 2008 at Nicira (later acquired by VMware) as a universal software switch for SDN and was originally intended to be independent of specific products, which earned it wide recognition and over 3000 academic citations. A key milestone was the transfer of the project under the Linux Foundation umbrella in 2016, which legally cemented a neutral governance status and eliminated perceptions of single-vendor dominance, opening the door for community expansion and integration with container environments. Further development is associated with the OVN project for network virtualization and adaptation to the container era, where the focus shifts from serving only virtual machines to dynamic interaction with orchestrators and high-level network policies.