Hot-plug Memory (Dynamic addition of RAM without reboot)

Hot-plug memory is the ability to add RAM modules to a running server without shutting it down or restarting. The operating system detects the new memory and, once it has been brought online, starts using it. This avoids the planned downtime a memory upgrade would otherwise require and allows computing resources to be expanded while applications are running.

This technology is used in enterprise server systems and data centers where business continuity is critical. It is available on selected server platforms based on Intel Xeon and AMD EPYC processors and requires an operating system or hypervisor with memory hot-plug support, such as a Linux kernel built with CONFIG_MEMORY_HOTPLUG, Windows Server, or VMware ESXi. The mechanism is also used in fault-tolerant database clusters and cloud platforms that require flexible scaling without interrupting user sessions.

Typical issues

The main difficulty is firmware- or driver-level errors that leave the memory visible to the system but unusable. An incorrect module installation sequence can cause electrical noise on the bus and temporary loss of access to adjacent memory banks. Fragmentation of the physical address space is also common: the added gigabytes are detected but do not form a contiguous block with existing memory, which leads to inefficient placement of virtual machines or large kernel data structures.

How it works

The dynamic memory addition process relies on the cooperation of the hardware platform, UEFI firmware, and the operating system. Physical installation follows a special service procedure: before inserting the module into a predetermined slot on the live motherboard, the administrator sends an attention signal through the management interface, either manually or programmatically. In response to this event, the memory controller temporarily places the target bus into a low-power or electrical isolation mode to avoid current surges and damage to power circuits. Built-in voltage smoothing circuits gradually raise power on the new slot to nominal levels, after which signal lines are calibrated to compensate for the changed conductor topology.

Once electrical parameters stabilize, the UEFI firmware initiates module initialization: it reads SPD data from the presence detection chip, determines timings, frequency, and capacity, then updates the system memory address map. The firmware then notifies the operating system of the additional physical address space through an ACPI event. The operating system takes over from there: its hot-add event handler invokes the mechanism that expands the pool of online pages. The kernel initializes page structures for the new region at run time, adds them to the memory zone allocator, and makes them available to the virtual memory subsystem. All these operations are performed asynchronously, without stopping the task scheduler or interrupting user processes.
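
Whether the kernel brings newly added memory online by itself is a policy that Linux exposes through sysfs. The following is a minimal sketch, assuming a Linux system with memory hot-plug support; it reads the memory block size (`block_size_bytes`, a hexadecimal value) and the auto-online policy (`auto_online_blocks`). The function name `read_hotplug_policy` and the `root` parameter are illustrative and not part of any kernel interface.

```python
from pathlib import Path

# Default location of the Linux memory hot-plug sysfs interface.
MEMORY_SYSFS = Path("/sys/devices/system/memory")


def read_hotplug_policy(root=MEMORY_SYSFS):
    """Return the memory block size and auto-online policy found under `root`.

    `root` is a parameter so that the function can be pointed at a copy
    of the sysfs tree (for example in tests).
    """
    root = Path(root)
    # block_size_bytes holds a hexadecimal number without a "0x" prefix.
    block_size = int((root / "block_size_bytes").read_text().strip(), 16)
    # auto_online_blocks may be absent on older kernels.
    policy_file = root / "auto_online_blocks"
    policy = policy_file.read_text().strip() if policy_file.exists() else None
    return {"block_size_bytes": block_size, "auto_online": policy}
```

On a real system, `read_hotplug_policy()` can be called without arguments. A policy of `offline` means that added memory blocks stay offline until an administrator or a udev rule brings them online.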

Hot-plug memory functionality

  1. System management controller role. A dedicated microcontroller embedded on the motherboard scans slots for new modules. Upon detecting physical insertion, it reads SPD data, verifies voltage and timing parameters against the current configuration, and then applies power to the new line.
  2. Firmware-level initialization. A platform-specific ACPI method, such as a Device Specific Method (_DSM), activates the power-up procedure. Platform firmware performs memory training, calibrates data lines to compensate for delays, and determines optimal Vref parameters so that the module operates stably at the target effective frequency.
  3. Address space decoding. The system agent on the CPU recalculates the memory distribution map, assigning new physical addresses. This process requires atomic updates of Target Address Decoder registers to avoid access collisions. The resulting range is kept free of overlap with already reserved I/O spaces.
  4. UEFI system firmware. A DXE phase driver generates the SRAT and HMAT tables at boot. These tables describe memory proximity domains relative to processor sockets, making the OS aware of non-uniform memory access and the memory hierarchy. Ranges intended for later hot-add are flagged as hot-pluggable in the SRAT memory affinity entries, so their topology is known before any module is inserted.
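  5. Resource handoff to the operating system. Firmware starts the insertion by issuing an ACPI Notify on the memory device. The kernel's ACPI subsystem checks the device status (_STA), reads the address range from _CRS and the proximity domain from _PXM, and passes the range to the memory manager, which hot-adds a pglist_data structure if the range belongs to a new NUMA node. The OS reports the outcome back to firmware by evaluating the _OST method.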
  6. Zone structure initialization. For a new node, the memory manager allocates the node structure and initializes its zones. Hot-added ranges normally end up in the Normal or Movable zone, because the DMA and DMA32 zones cover only low physical addresses. This process involves populating struct zone arrays, zeroing page counters, and recalculating watermarks when the memory is brought online.
  7. Buddy allocator registration. When a block is brought online, its pages are released to the buddy allocator (in current Linux kernels through generic_online_page() and __free_pages_core()). Pages are coalesced into higher-order blocks to minimize fragmentation. Until that point, the page structures of the hot-added but still offline range remain marked with the PG_reserved flag so that they cannot be allocated.
  8. Kernel interface activation. The /sys/devices/system/memory interface exposes each memory block device together with attributes such as state and removable. Writing online (or online_kernel or online_movable) to the state file of a memory block brings the corresponding physical range online and assigns it to a zone of its node.
  9. SCI interrupt handling. The platform raises a system control interrupt (SCI) when closure of a slot latch is detected. The ACPI core runs the general-purpose event method for that event (named _Lxx or _Exx, with a platform-specific number), which notifies the memory device so that the ACPI memory hot-plug driver (acpi_memhotplug) scans for new child memory devices.
  10. Data integrity verification. Before the range is handed to the kernel, platform firmware may run pattern tests on the new module: signatures are written and read back to verify data line integrity and the absence of stuck-at faults. This lowers, but does not eliminate, the risk of exposing faulty memory to application payloads.
  11. Creation of memory block devices. For each range, a memory_block instance is created with a unique phys_index. The kernel memory subsystem registers the object in sysfs and links it to the memory notifier chain so that other subsystems are informed of online and offline operations.
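  12. x86_64 subsystem operation. On x86_64, arch_add_memory() in arch/x86/mm/init_64.c maps the new range into the kernel direct mapping. Large pages are used where alignment allows, by writing PMD or PUD entries directly. Removal follows the reverse path: the entries are cleared (including the _PAGE_PRESENT bit) and stale translations are discarded with a TLB shootdown across all CPUs.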
  13. SLUB allocator update. The slab allocator reacts to memory notifier events. In kernels that provide it, slab_mem_going_online_callback() runs on the MEM_GOING_ONLINE event and allocates a per-node kmem_cache_node structure for every cache, so that kernel objects can later be allocated from the new node, which helps to reduce interconnect traffic.
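  14. Virtualization integration. A hypervisor can expose hot-add either through emulated DIMM devices announced via ACPI or through the paravirtualized virtio-mem device. virtio-mem plugs and unplugs memory in blocks inside a device-managed region, while free page reporting (a virtio-balloon feature) lets the guest report unused pages back to the host, enabling resource redistribution among virtual machines.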
  15. Cache coherence. The home agent on the CPU die registers the new address space in the snoop filter directory. Probe filter transactions ensure that a CPU core access attempt to an unmapped line results in a clean fill request, preventing data races.
  16. OS visibility restriction. A region can be held back for later connection before any module is inserted. The memmap kernel parameter can reserve an exact address range, and firmware flags ranges intended for hot-add as hot-pluggable in the SRAT. The kernel records these ranges (for example, as possible NUMA nodes) but creates page structures only after a physical module presence signal arrives and the memory is actually added.
  17. Procedure fault tolerance. If calibration fails, firmware sets the FW_FAULT flag and does not expose the module as usable memory. If insertion fails on the OS side, the kernel reports this to firmware by evaluating the _OST method with an insertion failure status code. The damaged resource is isolated in a defective memory pool, and an alert is sent to the administrator via IPMI without impacting running processes.
  18. Interleaving management. If a new module breaks channel symmetry, the system temporarily switches the controller to non-interleaved mode. Firmware changes Map Out bits in memory controller registers, then the kernel migrates critical pages, balancing channel load without stopping the I/O subsystem.
  19. Capacity detection completion. The procedure ends by increasing the total memory size counter (totalram_pages). The task scheduler sees the updated node topology, and new processes can begin using the fresh resource subject to cpuset policies and interconnect proximity.
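
The sysfs interface from step 8 can be driven from a short script. The sketch below is illustrative and assumes the Linux layout described above, with one `memoryN` directory per memory block, each containing a `state` file. The helper names `list_memory_blocks` and `online_offline_blocks` are hypothetical. On a real system, writing to `state` requires root privileges and may fail with an OSError if the kernel rejects the request.

```python
import re
from pathlib import Path

MEMORY_SYSFS = "/sys/devices/system/memory"
# Values the kernel accepts when bringing a memory block online.
ONLINE_MODES = ("online", "online_kernel", "online_movable")


def list_memory_blocks(root=MEMORY_SYSFS):
    """Map memory block number -> current state string ("online", "offline", ...)."""
    blocks = {}
    for entry in Path(root).iterdir():
        match = re.fullmatch(r"memory(\d+)", entry.name)
        state_file = entry / "state"
        if match and state_file.is_file():
            blocks[int(match.group(1))] = state_file.read_text().strip()
    return dict(sorted(blocks.items()))


def online_offline_blocks(root=MEMORY_SYSFS, mode="online"):
    """Write `mode` to the state file of every offline block; return their numbers."""
    if mode not in ONLINE_MODES:
        raise ValueError(f"unsupported mode: {mode!r}")
    changed = []
    for index, state in list_memory_blocks(root).items():
        if state == "offline":
            (Path(root) / f"memory{index}" / "state").write_text(mode)
            changed.append(index)
    return changed
```

Choosing `online_movable` keeps the new blocks in the Movable zone, which generally makes a later hot-remove more likely to succeed, while `online_kernel` allows kernel allocations to be placed on them.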

Comparisons

  • Hot-plug Memory vs Memory Sparing. Hot-plug memory focuses on physically replacing or adding entire memory modules without stopping the server, ensuring hardware-level continuity. In contrast, memory sparing is a memory rank reservation mechanism that does not require physical intervention at failure time: data from a faulty module is automatically copied to a spare, with hardware replacement occurring later. This reduces urgency but cuts available RAM capacity.
  • Hot-plug Memory vs Memory Mirroring. Hot-plug memory focuses on servicing: modules can be removed and added in a running system, typically on top of a mirrored or RAID-style memory configuration. Memory mirroring works on the RAID1 principle: a full copy of the data is kept on a separate module, providing instant failover upon an uncorrectable error without data loss. Mirroring halves usable memory capacity, whereas hot-plug aims at scaling.
  • Hot-plug Memory vs Hot-plug RAID Memory. Standard hot-plug memory typically requires preconfigured mirroring (RAID1) for safe replacement and supports hot-add and hot-replace operations at the module level. HP’s hot-plug RAID memory technology evolves this approach: it uses a RAID5 scheme distributing data and parity across five modules, allowing a full module failure to be tolerated and enabling hot replacement or expansion with less loss of usable capacity.
  • Hot-plug Memory vs Hot-add. The hot-add function is part of the broader hot-plug memory concept, but a key difference exists. Hot-add targets strictly capacity expansion by inserting modules into empty slots and almost always requires OS support for new memory detection. Hot-plug in the broader sense includes hot-replace, where a faulty module is swapped for an identical one without OS involvement because data recovery happens in hardware.
  • Hot-plug Memory vs CXL Device Hotplug. Traditional server hot-plug memory operates with physical DIMM modules and cards within a closed hardware architecture using preset RAID or mirror modes. CXL device hotplug is a modern standard based on the PCIe bus, where memory is added at the peripheral device level. Pre-allocation of resources in platform firmware is critical here; otherwise the device will not be recognized, shifting the focus from physical replacement to software-hardware coordination.
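
The capacity trade-offs mentioned in these comparisons come down to simple arithmetic. The sketch below is a simplified model, not a vendor formula. It assumes equally sized modules, treats sparing as reserving whole modules (real sparing works at rank level), and models the RAID-style scheme as one module's worth of parity. The function name `usable_capacity_gb` and the scheme names are made up for this illustration.

```python
def usable_capacity_gb(module_gb, modules, scheme="none", spares=1):
    """Estimate usable capacity in GB for a simplified protection scheme.

    scheme: "none"      - all modules usable
            "sparing"   - `spares` modules held in reserve (simplification)
            "mirroring" - every module has a full copy, so half is usable
            "parity"    - one module's worth of capacity holds parity
    """
    if modules <= 0 or module_gb <= 0:
        raise ValueError("module_gb and modules must be positive")
    if scheme == "none":
        return module_gb * modules
    if scheme == "sparing":
        return module_gb * max(modules - spares, 0)
    if scheme == "mirroring":
        return module_gb * (modules // 2)
    if scheme == "parity":
        return module_gb * (modules - 1)
    raise ValueError(f"unknown scheme: {scheme!r}")
```

For example, with five 16 GB modules this model gives 64 GB usable under the parity scheme, while four mirrored 16 GB modules give 32 GB.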

OS and driver support

The OS kernel provides drivers with a dedicated function, add_memory_driver_managed(), for registering memory that is managed by a device driver (e.g., virtio-mem or the dax kmem driver). When it is called, the memory management subsystem does not add entries to the /sys/firmware/memmap firmware map. Instead, it creates a resource with the IORESOURCE_SYSRAM_DRIVER_MANAGED flag and a name of the form "System RAM ($DRIVER)". This lets the kernel and userspace tools such as kexec-tools identify such memory, so that it is not passed to a kexec'ed kernel as ordinary boot memory and is not used for fixed initial allocations.
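
Because driver-managed memory carries a distinctive resource name, it can be recognized in /proc/iomem. The following sketch parses text in the /proc/iomem format and extracts ranges named "System RAM (driver)". The function name `driver_managed_ranges` is hypothetical, and the code assumes the usual `start-end : name` line layout with hexadecimal addresses. Note that unprivileged reads of /proc/iomem typically show all addresses as zero.

```python
import re

# Matches lines such as "  100000000-17fffffff : System RAM (virtio_mem)".
_DRIVER_RAM = re.compile(
    r"^\s*([0-9a-fA-F]+)-([0-9a-fA-F]+) : System RAM \((.+)\)\s*$"
)


def driver_managed_ranges(iomem_text):
    """Return (driver, start, end, size_bytes) for each driver-managed RAM range."""
    ranges = []
    for line in iomem_text.splitlines():
        match = _DRIVER_RAM.match(line)
        if match:
            start = int(match.group(1), 16)
            end = int(match.group(2), 16)
            # /proc/iomem ranges are inclusive, hence the +1.
            ranges.append((match.group(3), start, end, end - start + 1))
    return ranges
```

A typical call would be `driver_managed_ranges(open("/proc/iomem").read())`. Plain "System RAM" entries, which describe boot-time or ordinary hot-added memory, are deliberately skipped.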

Security

To prevent attacks from potentially compromised devices during hot memory addition, strict isolation is enforced through the IOMMU and bounce buffers. For untrusted PCI devices, operating system drivers copy DMA data into isolated memory pages, verify it, and only then map those pages through the IOMMU, so that the physical device cannot reach data in adjacent kernel pages. IOMMU cache entries are invalidated immediately after the operation completes.

Logging

The hot memory addition process is logged in the kernel ring buffer by the memory_hotplug subsystem: upon detection of an ACPI event or a virtual request, the kernel outputs the range of added physical addresses, the NUMA node ID, and the status of the onlined sections. If validation errors or zone intersection conflicts occur, a warning or BUG_ON trace is generated with resource state details for subsequent analysis. One historical example is a BUG_ON in register_mem_sect_under_node on PowerPC LPAR systems, which has since been fixed.
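
When reviewing such logs, a simple keyword filter is often enough to isolate the relevant lines. The sketch below makes no assumption about the exact message format, which differs between kernel versions. The keyword list is an assumption chosen for illustration and should be adapted to the messages actually seen on the target system. The function name `filter_hotplug_lines` is hypothetical.

```python
# Illustrative keywords only; they are not an official list of kernel log prefixes.
HOTPLUG_KEYWORDS = ("memory_hotplug", "acpi_memhotplug", "virtio_mem", "hotplug")


def filter_hotplug_lines(log_text, keywords=HOTPLUG_KEYWORDS):
    """Return the log lines that mention any keyword (case-insensitive)."""
    lowered = tuple(keyword.lower() for keyword in keywords)
    return [
        line
        for line in log_text.splitlines()
        if any(keyword in line.lower() for keyword in lowered)
    ]
```

The input can be the captured output of `dmesg` or `journalctl -k`. Keeping the filter separate from log collection makes it easy to apply to archived logs as well.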

Limitations

The technology has architectural and granularity limitations. On x86_64, the minimum hot-add block size with DIMM emulation is 128 MB because of firmware and memory manager alignment requirements, while the paravirtualized virtio-mem mechanism can operate with 4 MB blocks. Both approaches are also constrained by the cost of the memmap, the array of page structures describing the new memory (up to 64 bytes per 4 KB page). Since the memmap normally has to be allocated from memory that is already online, free physical RAM is required at addition time, which imposes heuristic limits on how much memory can be hot-plugged at once.
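
The memmap cost can be estimated directly from the figures above. The sketch below is a back-of-the-envelope calculation that assumes 4 KB pages, 64 bytes of page structure per page, and 128 MB blocks, as stated in the text. The function names are illustrative.

```python
PAGE_SIZE = 4096            # bytes per page (4 KB, as in the text)
STRUCT_PAGE_BYTES = 64      # memmap bytes per page (upper bound from the text)
BLOCK_SIZE = 128 * 2**20    # 128 MB hot-add block on x86_64


def memmap_overhead_bytes(added_bytes, page_size=PAGE_SIZE,
                          struct_page_bytes=STRUCT_PAGE_BYTES):
    """Bytes of page structures needed to describe `added_bytes` of new RAM."""
    pages = added_bytes // page_size
    return pages * struct_page_bytes


def blocks_needed(added_bytes, block_size=BLOCK_SIZE):
    """Number of hot-add blocks required, rounding up to whole blocks."""
    return -(-added_bytes // block_size)  # ceiling division
```

Under these assumptions, one 128 MB block needs 2 MB of memmap (about 1.6% of the added size), and adding 1 TB needs 16 GB of memory that is already online.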

History and evolution

The subsystem has evolved in three stages. It began with virtio-balloon implementations (2008), which only simulated memory addition by inflating and deflating a balloon inside the guest OS and had no NUMA support. Hardware-emulated DIMM devices in QEMU followed (2015) and required ACPI support. The most recent stage is the high-granularity virtio-mem (2020), whose driver is integrated into the kernel via the add_memory_driver_managed function and coordinates safe on-the-fly memory distribution between the host and multiple virtual machines.