SMR (Overlapping track recording for increased density)

SMR (Shingled Magnetic Recording) is a method of recording on a hard disk where data tracks overlap each other like shingles on a roof. This allows more information to fit on the same platter, but at the cost of reduced speed during data rewrites.

The technology is in demand in scenarios with a predominance of sequential reading and rare rewriting. This includes archival storage and backup (cold data), cloud data centers, video surveillance systems, and media servers. SMR is also used in budget high-capacity consumer drives for storing movie and photo collections, where capacity is more important than response speed.

Typical problems

The main drawback is a sharp drop in performance during random writes. Because tracks overlap, changing one sector can require rewriting the entire shingled group (band). This causes unpredictable delays and low IOPS, and during long stalls the operating system may treat the drive as frozen. Sustained write speed also degrades under prolonged load because of write amplification. RAID arrays are a further problem, especially during rebuilds, when long internal pauses can exceed the controller's timeouts and cause the drive to be dropped from the array.

How SMR works

In classic Perpendicular Magnetic Recording (PMR), the write head creates clearly separated concentric tracks, and the read element is narrower than the write element so that it does not pick up interference from neighboring tracks. SMR departs from this by having the write head deliberately overwrite the edge of the adjacent track, leaving only a narrow strip of it intact (the final read width). This eliminates the need for wide guard spaces, so more tracks physically fit on the platter.

The price for densification is the loss of cheap random writes. Data is organized not into independent rings but into bands (zones) that are written sequentially, much like a tape. In Drive-Managed SMR mode (the most common for end users), the disk masquerades as a regular one: it keeps a cache in a conventionally recorded (non-shingled) area of the platter or in unallocated space. A file being written lands in the cache almost immediately, and in the background the controller, much like a garbage collector in an SSD, reorganizes the data into ordered shingled zones. If the cache overflows, write performance can drop by an order of magnitude or more. Host-Managed SMR also exists, where the host software (for example, a zone-aware file system such as F2FS or Btrfs in zoned mode) manages the bands itself, which eliminates blind internal reorganization by the disk and makes write behavior predictable. This is the key difference from PMR, where each track can be rewritten independently, and from SSD, where rewriting requires block erasure but does not require rewriting physically adjacent data through the rest of a band.
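The cache-overflow behavior described above can be illustrated with a toy latency model. This is only a sketch: the class name `DriveManagedModel` and every number in it (cache size, per-write cost, band-cleaning cost) are illustrative assumptions, not measurements of any real drive.

```python
from dataclasses import dataclass


@dataclass
class DriveManagedModel:
    """Toy latency model of a drive-managed SMR disk (all numbers are assumptions)."""
    cache_blocks: int = 1000          # capacity of the media cache, in blocks
    cache_write_ms: float = 0.1       # cost of landing one block in the cache
    band_clean_ms: float = 500.0      # cost of one read-modify-write of a band
    blocks_freed_per_clean: int = 10  # cache blocks released by one band clean
    cached: int = 0                   # blocks currently waiting in the cache

    def write_block(self) -> float:
        """Return the simulated latency of one random block write, in ms."""
        latency = 0.0
        if self.cached >= self.cache_blocks:
            # Cache is full: the write stalls until one band has been cleaned.
            latency += self.band_clean_ms
            self.cached -= self.blocks_freed_per_clean
        self.cached += 1
        return latency + self.cache_write_ms

    def idle(self, cleans: int) -> None:
        """Background cleaning performed while the host is idle."""
        self.cached = max(0, self.cached - cleans * self.blocks_freed_per_clean)


drive = DriveManagedModel()
burst = [drive.write_block() for _ in range(1000)]      # fits in the cache
sustained = [drive.write_block() for _ in range(1000)]  # cache is already full
avg_burst = sum(burst) / len(burst)
avg_sustained = sum(sustained) / len(sustained)
print(f"burst: {avg_burst:.2f} ms/write, sustained: {avg_sustained:.2f} ms/write")
```

With these assumed parameters the first thousand writes are absorbed by the cache, while the next thousand stall on a band clean once every ten writes, so the average latency rises by orders of magnitude even though the workload has not changed.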

SMR functionality

  1. The principle of shingled recording. Recording is performed not by isolated concentric tracks, but by stripes partially overlapping the previous ones, like roofing shingles. The write track width significantly exceeds the final read track width, allowing the use of a wider and more energy-efficient write element while maintaining density.
  2. Write element geometry. A key feature is the asymmetric pole tip of the magnetic head. One edge of the write pole is made sharp to form a clear track boundary, while the opposite edge is beveled or rounded, creating a zone of gradual field decay that will be overlapped by the next track.
  3. Read sub-track formation. The final read track width is determined by the narrow area unaffected by the overlap. The magnetic read track is formed as the difference between the write zones of two consecutive head passes, requiring precise positioning control.
  4. Inter-Track Interference (ITI) management. Since the write element physically touches the edge of the adjacent track, ITI suppression is a critical task. The digital signal processing system on the controller chip compensates for edge distortions arising from asymmetric field erasure.
  5. Two-dimensional digital signal correction. Unlike the one-dimensional equalizers of traditional drives, SMR read channels can use a two-dimensional equalizer. The signal is read not only from the target track but also from adjacent ones, after which the algorithm subtracts the known interference from the overlapping tracks to recover the original signal.
  6. Cache organization levels. Due to the impossibility of random rewriting of isolated sectors, a caching hierarchy is introduced. The media cache on the magnetic platter is a dedicated Conventional Magnetic Recording zone, accepting the data stream without delays for shingled band reconstruction.
  7. Zone structure of the media. The platter surface is logically divided into isolated zones (bands). Within a band, writing proceeds sequentially, filling the shingled structure from the outer edge to the inner or vice versa. Between bands, there are guard bands that eliminate overlap conflicts at group boundaries.
  8. Atomic band update mechanism. When a single sector is modified, the controller does not rewrite it in place. It reads the entire band into RAM or non-volatile cache, modifies the target block, and sequentially rewrites the entire band anew into a reserved area in the background.
  9. Garbage management and background operations. Once a band has been rewritten elsewhere, the old copy is marked as garbage. The controller initiates a background cleaning procedure, consolidating valid data from several partially filled zones into one clean band via sequential streaming writes.
  10. Host-Managed SMR architecture. HM-SMR devices delegate data placement logic to the host. The drive rejects random write commands into a sequential zone, requiring a strictly sequential stream. The host must therefore run zone-aware software, for example a file system with zoned support such as F2FS or Btrfs in zoned mode, or a translation layer such as dm-zoned.
  11. Drive-Managed SMR. DM-SMR completely hides the shingled nature from the host via firmware-level translation. The controller emulates a random-access block device by remapping LBA addresses to physical zones through a complex abstraction layer with indirection tables.
  12. Logical-spatial shaping implementation. The address translator uses a shaping map that links logical block ranges to physical quasi-sequential segments on the platter. Indirection hides the variable length of shingled stripes and the dynamic redistribution of spare areas.
  13. Dynamic switching between CMR and SMR. Some drives implement a hybrid mode. Part of the platter is formatted for classic CMR for intensive random write operations of metadata, while the main user data is placed in SMR zones with high packing density.
  14. Write channel power consumption. Due to the wider pole width, the current density in the inductive coil required to saturate the magnetic layer decreases. This reduces the peak power consumption of the write preamplifier, which is critically important for storage arrays with high disk placement density.
  15. Domain stability and thermomagnetic relaxation. The shingled method imposes restrictions on the choice of magnetic platter materials. Due to the reduced final read track width, requirements for coercivity and the energy barrier of thermal stabilization increase to resist the superparamagnetism effect.
  16. Error correction and erasure scheme. Since the update operation affects an entire band, the Low-Density Parity Check (LDPC) error correction code block must have high redundancy to protect against the accumulation of positioning errors. Overlapping may fragment soft metrics processed by the iterative decoder.
  17. Forced write interruption procedure (Write Splice). During an emergency power loss while modifying a band, the integrity of the tail part of the structure is compromised. The firmware uses a power-storing capacitor to complete the atomic operation or marks the band as requiring journal recovery.
  18. Write Amplification effect. A key parasitic effect of SMR. Changing four kilobytes of logical data can cause the physical reconstruction of a 256-megabyte band, which multiplies the actual volume of data written to the platter and the time the drive spends busy with internal work.
  19. Vibration decoupling requirements. In multi-disk configurations, rotational vibration reduces the positioning accuracy of the beam actuator, which is critical for capturing the narrow read track. SMR drives use a multi-sensor rotational acceleration feedback system to compensate for head displacement.
  20. Thermal gradient utilization. Energy-assisted SMR technologies use short-term local heating of the media by a laser (HAMR) or excitation by a microwave oscillator (MAMR). This temporarily lowers the effective coercivity of the spot at the moment of pulse application, allowing writing on media with ultra-high anisotropy without the risk of an incompletely written overlap with the adjacent track.
  21. Specifics of ATA command timings. Command queuing (NCQ) needs care with SMR, since reordered writes to a sequential zone would violate the write-pointer order. For hosts managing zones, the extended ZAC (Zoned-device ATA Commands) command set has been introduced, allowing directives to open, close, and reset zones, as well as query the write pointer position of a specific band.
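
The atomic band update and write amplification items in the list above can be sketched in a few lines. This is a simplified illustration, not drive firmware: the names `update_block` and `write_amplification` are hypothetical, the band is an in-memory byte string, and the 4 KiB and 256 MiB sizes are simply the figures quoted in the list.

```python
BLOCK_SIZE = 4096                  # 4 KiB logical block
BAND_SIZE = 256 * 1024 * 1024      # 256 MiB band, the figure quoted in the list


def write_amplification(logical_bytes: int, band_bytes: int) -> float:
    """Physical bytes written per logical byte when one update rewrites a full band."""
    return band_bytes / logical_bytes


def update_block(band: bytes, block_index: int, new_block: bytes,
                 block_size: int = BLOCK_SIZE) -> tuple[bytes, int]:
    """Read-modify-write of one block inside a band.

    Returns the new band image and the number of bytes physically written.
    The whole band is rewritten because each track overlaps its predecessor.
    """
    if len(new_block) != block_size:
        raise ValueError("new_block must be exactly one block long")
    start = block_index * block_size
    if start < 0 or start + block_size > len(band):
        raise IndexError("block_index is outside the band")
    buffer = bytearray(band)                      # 1. read the band into RAM
    buffer[start:start + block_size] = new_block  # 2. patch the target block
    new_band = bytes(buffer)                      # 3. stream it to a spare band
    return new_band, len(new_band)


# A small 16-block band keeps the demonstration cheap.
small_band = bytes(16 * BLOCK_SIZE)
patched, written = update_block(small_band, 3, b"\xff" * BLOCK_SIZE)
print(written // BLOCK_SIZE)                       # blocks physically written
print(write_amplification(BLOCK_SIZE, BAND_SIZE))  # worst-case factor
```

Updating one block of the 16-block band writes all 16 blocks, and the same arithmetic applied to a 4 KiB update in a 256 MiB band gives a worst-case factor of 65536. Real firmware softens this by batching many cached updates into one band rewrite.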

Comparisons

  • SMR vs CMR (Conventional Magnetic Recording). SMR uses partial track overlap to increase recording density, like shingles, whereas CMR writes data on isolated parallel tracks without intersection. This gives SMR a capacity gain of up to 25%, but leads to a significant speed drop during random rewrite operations due to the need to reorganize the entire data block.
  • SMR vs PMR (Perpendicular Magnetic Recording). Technologically, SMR is an overlay on top of the basic PMR method, which uses the vertical orientation of magnetic domains for bit stability. The difference lies specifically in the recording geometry: an SMR disk writes a new track by shadowing part of the previous one. Unlike pure PMR, this creates fundamental restrictions on the direct modification of random sectors without background cleaning of the entire zone.
  • SMR vs HAMR (Heat-Assisted Magnetic Recording). If SMR optimizes data layout through track overlap, HAMR solves the physical problem of grain stability by laser heating the media to facilitate magnetization reversal. SMR provides a capacity increase through recording architecture, whereas HAMR overcomes the superparamagnetic limit of the platter material, and these technologies can be combined in future drives for a multiplicative density effect.
  • SMR vs EAMR (Energy-Assisted Magnetic Recording). EAMR, including microwave-assisted MAMR, changes the physics of the recording process by applying additional energy at the write head, which lowers the effective coercivity of the media. Unlike EAMR, which requires new head designs and more complex control electronics, SMR is primarily an architectural change in how data is organized. SMR is cheaper to implement since it builds on established PMR head and media technology, but pays for this with a random write performance penalty.
  • SMR vs TDMR (Two-Dimensional Magnetic Recording). SMR fights for density through one-dimensional track overlap in width, while TDMR uses an array of read elements to process two-dimensional inter-symbol noise and adjacent track interference. TDMR improves read quality at ultra-small scales through complex digital signal processing, whereas SMR is a write method that primarily gains in capacity at the cost of complicating I/O operations.

OS and driver support

SMR support is implemented at several levels of the I/O stack. Host-managed SMR drives require a host that implements the zoned command sets: Zoned Block Commands (ZBC) for SCSI/SAS and Zoned-device ATA Commands (ZAC) for SATA. Zoned Namespaces (ZNS) is the analogous interface for NVMe SSDs. The operating system, through its zoned block device layer or through libraries like libzbc, directly manages sequential write zones and queries the drive for each zone's write pointer. Drive-managed SMR, on the contrary, masquerades as a regular random-access disk, relying on a CMR cache area managed by the firmware and on background garbage collection. It does not require modification of classic drivers but creates unpredictable delays due to internal shingled track cleaning activity.
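
On Linux, the kernel exposes the zoned model of each block device through the sysfs attribute `queue/zoned`, whose value is `none`, `host-aware`, or `host-managed`. The sketch below reads that attribute. The helper names and the `sysfs_root` parameter are assumptions added for illustration and testing, and `sda` is only a placeholder device name.

```python
from pathlib import Path


def zoned_model(device: str, sysfs_root: str = "/sys/block") -> str:
    """Return the zoned model Linux reports for a block device.

    Reads <sysfs_root>/<device>/queue/zoned, which holds "none",
    "host-aware" or "host-managed". Returns "unknown" if the attribute
    cannot be read (older kernel, missing device, or not a Linux system).
    """
    attr = Path(sysfs_root) / device / "queue" / "zoned"
    try:
        return attr.read_text().strip()
    except OSError:
        return "unknown"


def needs_zone_aware_host(model: str) -> bool:
    """Only host-managed drives refuse non-sequential writes outright."""
    return model == "host-managed"


model = zoned_model("sda")   # "sda" is a placeholder device name
print(f"sda: zoned model = {model}, zone-aware host required: "
      f"{needs_zone_aware_host(model)}")
```

A drive-managed SMR disk reports `none` here, exactly like a CMR disk, because its firmware hides the shingled layout. This check therefore identifies only drives that expose zones to the host.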

Security

From a security standpoint, the main risk SMR introduces concerns data integrity during background rewriting of adjacent zones. Reordering blocks during zone cleaning overwrites physical sectors, so a sudden power loss without power-fail (PFAIL) protection, such as a non-volatile cache, can break the atomicity of the read-modify-write operation for an entire group of tracks. In host-managed mode, integrity is the responsibility of the host: log-structured file systems (e.g., F2FS with segment cleaning) flush checksums and metadata to a separate reliable area before a zone is reset, so that data stays consistent even if the controller crashes.

Logging

Logging in an SMR environment shifts from a random-access journal to a tape-like sequential append model. On host-managed devices, a Write-Ahead Log (WAL) is written continuously into an open zone with a monotonically increasing write pointer, and each transaction is identified by its offset within the zone. Drive-managed disks emulate a ring buffer instead, redirecting logged blocks to the media cache on a CMR section. When the cache capacity is exhausted, however, the controller is forced to physically reorder the stream onto the SMR area, which the host sees as a sharp rise in write latency.
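
The append-only logging model can be sketched as a log that lives in a single sequential zone. The `ZoneLog` class is a hypothetical in-memory stand-in: the record framing (length plus CRC32) is an assumption chosen for the example, and a real zone is written in whole blocks rather than arbitrary byte counts.

```python
import struct
import zlib


class ZoneLog:
    """Append-only log kept in one sequential-write zone (in-memory stand-in)."""

    HEADER = struct.Struct("<II")   # payload length, CRC32 of the payload

    def __init__(self, zone_size: int) -> None:
        self.zone = bytearray(zone_size)
        self.write_pointer = 0      # next writable byte; it only ever grows

    def append(self, payload: bytes) -> int:
        """Append one record at the write pointer and return its offset."""
        record = self.HEADER.pack(len(payload), zlib.crc32(payload)) + payload
        if self.write_pointer + len(record) > len(self.zone):
            raise IOError("zone full: finish this zone and open the next one")
        offset = self.write_pointer
        self.zone[offset:offset + len(record)] = record
        self.write_pointer += len(record)
        return offset

    def replay(self):
        """Yield (offset, payload) for every intact record below the write pointer."""
        pos = 0
        while pos + self.HEADER.size <= self.write_pointer:
            length, crc = self.HEADER.unpack_from(self.zone, pos)
            start = pos + self.HEADER.size
            payload = bytes(self.zone[start:start + length])
            if start + length > self.write_pointer or zlib.crc32(payload) != crc:
                break               # torn tail record: stop the replay here
            yield pos, payload
            pos = start + length


log = ZoneLog(zone_size=4096)
first = log.append(b"BEGIN tx1")
second = log.append(b"COMMIT tx1")
print(first, second)
print([payload for _, payload in log.replay()])
```

Because records are only ever added at the write pointer, the offset returned by `append` doubles as a stable transaction identifier, and recovery is a single forward scan that stops at the first incomplete record.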

Limitations

The fundamental limitation of SMR stems from the geometry of partial track overlap: the wide write head damages data on adjacent tracks, so a sector within a filled zone cannot be modified without first reading and then rewriting the entire group of tracks (band). On drive-managed models, random rewrite throughput can therefore fall orders of magnitude below the nominal figure. Host-managed solutions remove this problem at the cost of a strict requirement: the application must write sequentially from the beginning of the zone to its end, and no block can be updated in place without an explicit RESET WRITE POINTER command for the zone.
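
The sequential-write rule of a host-managed zone can be modeled in a few lines. The `SequentialZone` class below is a hypothetical toy, not a driver: it tracks only the write pointer, measures everything in blocks, and raises an exception where a real drive would return an error for the command.

```python
class SequentialZone:
    """Toy model of one sequential-write-required zone (sizes in blocks)."""

    def __init__(self, start_lba: int, length: int) -> None:
        self.start_lba = start_lba
        self.length = length
        self.write_pointer = start_lba   # LBA of the next permitted write

    def write(self, lba: int, nblocks: int) -> None:
        """Accept a write only if it starts exactly at the write pointer."""
        if lba != self.write_pointer:
            raise ValueError(
                f"unaligned write: LBA {lba}, write pointer at {self.write_pointer}")
        if lba + nblocks > self.start_lba + self.length:
            raise ValueError("write crosses the end of the zone")
        self.write_pointer += nblocks

    def reset_write_pointer(self) -> None:
        """Discard the zone's contents and rewind to its start."""
        self.write_pointer = self.start_lba

    @property
    def is_full(self) -> bool:
        return self.write_pointer == self.start_lba + self.length


zone = SequentialZone(start_lba=0, length=65536)
zone.write(0, 8)            # accepted: starts at the write pointer
zone.write(8, 8)            # accepted: continues the sequential stream
try:
    zone.write(0, 8)        # in-place update of blocks already written
    rejected = False
except ValueError:
    rejected = True
zone.reset_write_pointer()  # the only way to make LBA 0 writable again
zone.write(0, 8)
print(rejected, zone.write_pointer)
```

The in-place update is refused, and LBA 0 becomes writable again only after the reset, which also discards everything else stored in the zone.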

History and development

SMR technology, conceptually tracing back to multi-track magneto-optical systems of the early 2000s, was commercialized by Seagate in 2013-2014 as a way to push areal density closer to the superparamagnetic limit without expensive HAMR/MAMR technologies. It evolved from completely hidden drive-managed architectures, which created chaotic latency in RAID arrays, to the zoned approach with explicit host management standardized by the T10 and T13 committees. This paved the way for modern ZNS SSDs and SMR HDDs in hyperscale data centers, where the distributed storage orchestrator (e.g., Ceph) manipulates zones directly, eliminating the layer of uncontrolled garbage collection.