PF (Physical Function) is a full-featured function of a PCIe device through which the host system manages it using a standard driver. The PF provides access to the device's global resources, such as its register and memory space and power management, and enables the creation of lightweight virtual copies of the device (virtual functions).
PF is used in server virtualization environments (KVM, VMware, Hyper-V) and in network cards supporting SR-IOV. For example, a physical network card presents one PF to the hypervisor, which then configures virtual functions (VFs) for guest operating systems. PF is also used in NVMe controllers and GPUs for direct access from virtual machines.
Typical problems
The main problem is the limited number of PFs per device (usually one or two), which makes each PF a single point of failure. Improper PF configuration can lead to interrupt conflicts or DMA leakage between virtual machines. Furthermore, not all operating systems ship stable drivers for driving a PF directly, and hardware errors at the PF level often require a complete reboot of the physical node.
How PF works
The physical function is implemented directly in the hardware of the PCIe device and corresponds to one logical interface within the PCI configuration space. After power-on, the host system performs bus enumeration and assigns unique identifiers (Bus:Device.Function) to the PF. The host driver gains exclusive access to the PF’s base address registers (BARs), through which it configures queue mappings, MSI-X interrupts, and memory isolation mechanisms. When SR-IOV is activated, the PF acts as a managing agent: the host writes the desired number of virtual functions into special PF registers, after which the device independently generates additional PCIe functions with hardware-enforced isolation. All DMA transactions from the created VFs pass through an address translation cache (ATC) in the root complex or the device itself, while the PF remains the only channel for global operations such as resetting the entire board, updating microcode, or reassigning bus bandwidth.
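The Routing-ID arithmetic behind VF creation can be sketched in a few lines of Python. The formula itself comes from the SR-IOV specification (VFn RID = PF RID + First VF Offset + (n − 1) × VF Stride); the offset and stride values below are purely illustrative, since real ones are read from the PF's SR-IOV capability registers:

```python
def bdf_to_rid(bus: int, dev: int, fn: int) -> int:
    """Pack Bus:Device.Function into a 16-bit PCIe Routing ID."""
    return (bus << 8) | (dev << 3) | fn

def rid_to_bdf(rid: int) -> str:
    """Format a 16-bit Routing ID as the familiar bb:dd.f string."""
    return f"{(rid >> 8) & 0xFF:02x}:{(rid >> 3) & 0x1F:02x}.{rid & 0x7}"

def vf_rid(pf_rid: int, first_vf_offset: int, vf_stride: int, n: int) -> int:
    """Routing ID of VF n (1-based), per the SR-IOV spec:
    VFn RID = PF RID + First VF Offset + (n - 1) * VF Stride (mod 2^16)."""
    return (pf_rid + first_vf_offset + (n - 1) * vf_stride) & 0xFFFF

# Example: PF at 03:00.0 with First VF Offset = 16, VF Stride = 2
# (illustrative register values, not taken from a real device)
pf = bdf_to_rid(0x03, 0x00, 0)
print(rid_to_bdf(vf_rid(pf, 16, 2, 1)))  # first VF  -> 03:02.0
print(rid_to_bdf(vf_rid(pf, 16, 2, 4)))  # fourth VF -> 03:02.6
```

Note that the VFs land at device/function numbers the PF never explicitly enumerates; the host derives them purely from this arithmetic, which is why VF BDFs can cross device-number boundaries.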
Physical Function functionality
- PF as hardware virtualization. Physical Function (PF) is a full-featured PCIe function supporting configuration space, I/O spaces, and memory space. The PF is discovered by the hypervisor as a normal device, providing direct access to physical resources without intermediate emulation.
- PF configuration registers. Each PF has a unique Device ID, Vendor ID, and Bus/Device/Function number (BDF). These registers are accessible via the PCIe Configuration Space mechanism, allowing the host driver to initialize and configure the PF using standard methods.
- Base addressing BAR in PF. The PF owns the full set of Base Address Registers (BARs) defining physical memory areas and I/O ports. Through the BARs, the PF provides access to its own internal registers, queue buffers, and control structures without intermediaries.
- PF interrupt management. PF supports MSI (Message Signaled Interrupts) and MSI-X with dedicated interrupt vectors. When a hardware event occurs, the PF initiates a memory write, generating an interrupt for the driver, which is critically important for error handling and operation completion events.
- SR-IOV support via PF. The PF is the root element for Single Root I/O Virtualization (SR-IOV), managing the creation and deletion of virtual functions (VFs). Through special SR-IOV registers within the PF, the hypervisor sets the number of VFs, their initial BDFs, and the allocated resources.
- Configuring VF spaces. The PF exports control registers that define the BAR ranges for each VF. For example, the PF specifies an offset in its memory where separate message queues for each VF are placed, ensuring hardware isolation of virtual machine traffic.
- Transaction filtering and routing. The PF contains PCIe transaction routing logic, redirecting read/write requests from VFs to the corresponding physical resources. PF filters check the Requester ID (Bus/Device/Function) and decide whether to allow access to the shared area.
- PF initialization and reset. During a cold or hot bus reset, the PF performs a full reset of its internal state, including configuration registers and SR-IOV control structures. After discovering the PF, the driver loads firmware, configures queues, and enables hardware transaction processing.
- Access control at the PF level. The PF implements Access Control Services (ACS) mechanisms, determining whether transactions from one PF can pass to another PF on the same switch port. This prevents unwanted bypass paths and ensures isolation between different physical functions in multi-tenant environments.
- Error management in PF. The PF accumulates error statuses in Advanced Error Reporting (AER) registers, logging malformed packets, completion timeouts, and parity errors. When a fatal error is detected, the PF generates a non-maskable interrupt and enters a recovery state.
- Power consumption and PF power management. The PF supports D0, D1, D2, and D3hot/cold states per the PCIe Power Management specification. The transition to the D3hot state occurs by writing to the PMCSR register, allowing the PF to disable internal clocking and reduce consumption.
- Thermal and current protection of PF. Temperature sensors and current monitors may be present within the PF. When thresholds are exceeded, the PF automatically reduces operating frequency, adjusts voltage, or issues a warning via control interface registers.
- Hardware queues in PF. The PF often integrates multiple queue pairs directly into its controller. Each queue is bound to a specific VF or allocated for the PF’s own control traffic, offloading the central processor from packet header processing.
- Direct memory access (DMA) management by PF. The PF configures address translation tables for DMA requests from VFs using an integrated IOMMU or its own remapping registers. This prevents virtual machines from accessing other host memory areas.
- Tagged transactions in PF. The PF supports request tagging via the Tag field in the PCIe header. Each transaction from the PF or its child VFs receives a unique tag, allowing the hardware to correctly match completing Completions with the initiating request.
- Timing parameters of PF. The PF configures the Completion Timeout and related timing parameters via extended capability registers. Incorrect configuration of these parameters leads to spurious timeouts and hard-to-diagnose failures under high load.
- Virtualization of nested functions via PF. Some PF implementations expose additional layers of function virtualization beyond plain SR-IOV VFs; this is used in graphics adapters and network cards built as multi-function devices.
- PF event logging. The PF maintains an internal ring buffer of events, logging resets, PCIe link errors, power state transitions, and VF configuration changes. Access to this buffer is provided through special memory-backed BARs, protected from unauthorized reading.
- Firmware update via PF. The PF provides mechanisms for updating internal firmware through built-in flash memory control registers. During the update, the PF temporarily blocks the creation of new VFs and places existing ones into a safe state.
- Hot-plug features for PF. The PF correctly handles hot removal and insertion events on the PCIe slot. When a change in card presence is detected, the PF releases all BARs, resets child VFs, and notifies the operating system driver of the need to reconfigure the bus.
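As a concrete illustration of how the configuration registers listed above (Vendor ID, Device ID, and the rest) are reached by BDF, here is the standard PCIe ECAM address calculation. The base address below is a made-up example; the real one comes from the ACPI MCFG table:

```python
ECAM_BASE = 0xE000_0000  # illustrative; the real base comes from the ACPI MCFG table

def ecam_address(bus: int, dev: int, fn: int, offset: int, base: int = ECAM_BASE) -> int:
    """Memory address of a config-space register under PCIe ECAM:
    base + bus * 1 MiB + device * 32 KiB + function * 4 KiB + register offset."""
    assert bus < 256 and dev < 32 and fn < 8 and offset < 4096
    return base + (bus << 20) + (dev << 15) + (fn << 12) + offset

# Vendor ID (config-space offset 0x00) of the PF at 03:00.0
print(hex(ecam_address(0x03, 0, 0, 0x00)))  # -> 0xe0300000
```

Each function, PF or VF, thus gets its own 4 KiB configuration page, which is what lets the host address a VF's registers without any PF mediation once the VF exists.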
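The power-state handling mentioned above can be made concrete with the PMCSR register: per the PCI Power Management specification, its PowerState field occupies bits 1:0 (00 = D0 ... 11 = D3hot), and the driver transitions states by rewriting that field. A minimal decode/encode sketch:

```python
# PMCSR PowerState field encoding per the PCI Power Management spec
POWER_STATES = {0b00: "D0", 0b01: "D1", 0b10: "D2", 0b11: "D3hot"}

def pmcsr_power_state(pmcsr: int) -> str:
    """Decode the PowerState field (bits 1:0) of a PMCSR register value."""
    return POWER_STATES[pmcsr & 0x3]

def pmcsr_set_power_state(pmcsr: int, state: int) -> int:
    """Return a PMCSR value requesting the given power state (0..3),
    preserving all other bits; the driver writes this back to the PF,
    e.g. state=0b11 to enter D3hot."""
    return (pmcsr & ~0x3) | (state & 0x3)

print(pmcsr_power_state(0x0003))                  # D3hot
print(hex(pmcsr_set_power_state(0x0008, 0b00)))   # request D0, other bits kept
```

D3cold is not reachable this way: it requires removing power from the slot, which is why it appears only as a platform-level state in the text above.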
Comparisons involving PF
- PF vs IO (Input/Output). The PF is the PCIe function entity that owns the device's resources (BARs, queues, interrupt vectors), whereas I/O denotes the actual data exchange between the device and the system. All configuration and management traffic must go through the PF, while ordinary I/O can also flow through its VFs.
- PF vs IRQ (Interrupt Request). The PF owns and programs the interrupt resources (MSI/MSI-X vector tables), whereas an IRQ is the signaling mechanism itself: an asynchronous notification the device raises as a memory write. The driver allocates vectors through the PF once; interrupts then fire independently as events occur.
- PF vs DMA (Direct Memory Access). The PF is the control point that configures DMA engines and address-translation tables, whereas DMA is the transfer mechanism that moves data without CPU involvement. Correct PF-side translation setup is what keeps DMA an offload rather than an isolation hazard.
- PF vs MMIO (Memory-Mapped I/O). MMIO is the access method for the PF's resources: its BARs map device registers into the host physical address space, and the driver reaches them with ordinary load/store instructions. The PF defines what is mapped; MMIO defines how it is accessed.
- PF vs VF (Virtual Function) in SR-IOV. PF is a full-privilege manager of the physical PCIe device, capable of changing configuration and allocating VFs, while VF is a lightweight guest without initialization rights. PF provides security through isolation but requires a privileged driver; VF gives virtual machines direct access with lower overhead.
OS and driver support
PF (Physical Function) is supported in all modern operating systems (Windows, Linux, FreeBSD) through standard PCIe device drivers. The PF driver implements full access to hardware resources (BAR space, MSI-X, power management, and reset), while VF drivers use lightweight data exchange paths via hardware queues or emulation of configuration space redirected by the PF.
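On Linux, the privileged PF driver path described above is exposed through the standard sysfs SR-IOV attributes (`sriov_totalvfs`, `sriov_numvfs`) under `/sys/bus/pci/devices/<BDF>/`. A minimal sketch, with the sysfs root parameterized so the function can be exercised against a fake tree instead of real hardware:

```python
from pathlib import Path

def enable_vfs(bdf: str, count: int, sysfs_root: str = "/sys/bus/pci/devices") -> int:
    """Ask the PF driver to create `count` VFs via the standard Linux
    sysfs SR-IOV interface. Returns the number of VFs requested.

    `sysfs_root` is a parameter only so the function can be tested
    against a directory that mimics sysfs; on a real host the default
    path is used and root privileges are required."""
    dev = Path(sysfs_root) / bdf
    total = int((dev / "sriov_totalvfs").read_text())
    if count > total:
        raise ValueError(f"device supports at most {total} VFs")
    # the kernel rejects changing a non-zero VF count directly,
    # so the count is reset to 0 before writing the new value
    (dev / "sriov_numvfs").write_text("0")
    (dev / "sriov_numvfs").write_text(str(count))
    return count
```

On a host with an SR-IOV-capable NIC this would be called as, e.g., `enable_vfs("0000:03:00.0", 4)`; the PF driver then creates the VFs, which appear as new PCI devices ready to be bound to `vfio-pci` or a VF driver.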
Security
PF provides hardware isolation between virtual functions (VFs) using transactional translation mechanisms (ATS, PASID) and DMA isolation via IOMMU/SMMU. The PF manages VF access to shared resources (queue rings, descriptors) by validating requests at the hardware level and prohibits direct VF access to critical power management or firmware control registers, thereby preventing data leaks and privilege escalation between guests.
Logging
The PF maintains an event log in the form of an internal ring buffer and reports PCIe errors through Advanced Error Reporting (AER) registers (malformed TLPs, buffer overflows, completion timeouts on requests from VFs); it also logs initialization, VF assignment, and link state changes. Logging data is accessible via system utilities (e.g., lspci -vvv, dmesg, or debugfs files), and for fatal errors the PF can raise an interrupt at the root port to trigger immediate error handling.
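As a sketch of consuming such logs, here is a small parser for AER lines of the kind dmesg typically prints. The exact wording varies between kernel versions, so the pattern is illustrative rather than authoritative:

```python
import re

# Pattern modeled on typical Linux AER dmesg lines, e.g.
#   pcieport 0000:00:1c.0: AER: Uncorrected (Fatal) error received
# The format is not a stable kernel ABI, so treat this as a sketch.
AER_RE = re.compile(
    r"(?P<bdf>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]).*AER:.*"
    r"(?P<sev>Corrected|Uncorrected \((?:Non-)?Fatal\))"
)

def parse_aer_line(line: str):
    """Extract (BDF, severity) from an AER log line, or None if absent."""
    m = AER_RE.search(line)
    return (m.group("bdf"), m.group("sev")) if m else None

sample = "pcieport 0000:00:1c.0: AER: Uncorrected (Fatal) error received"
print(parse_aer_line(sample))  # -> ('0000:00:1c.0', 'Uncorrected (Fatal)')
```

Tooling like this is how monitoring agents turn the PF's AER stream into alerts: corrected errors are counted as link-health statistics, while uncorrected fatal ones usually precede the node reboot mentioned under "Typical problems".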
Limitations
The PF limits the maximum number of VFs, set via driver load parameters (e.g., max_vfs) or firmware settings, because the device's DMA contexts, queues, and registers are finite. Additionally, a PF cannot be migrated between hosts during live migration without vendor-specific driver tooling (migration requires resetting and reinitializing all VFs), and many devices cannot keep SR-IOV VFs active while the PF is in the power-saving D3cold state, reducing flexibility in data centers.
History and development
The concept of PF (as part of SR-IOV) was standardized in the PCI-SIG Single Root I/O Virtualization and Sharing rev 1.0 specification (2007), replacing software device emulation. Early practical implementations appeared in Intel 82576 network cards (2008) and later in NVMe controllers and GPUs (NVIDIA vGPU, Intel GVT-g). Current development focuses on VF live migration and hardware-accelerated state exchange via P2P DMA managed by the PF driver without hypervisor involvement.