RISC-V (Open modular instruction set architecture)

RISC-V is a free and open instruction set architecture (ISA): the standard that defines the machine instructions through which software controls a processor. Unlike proprietary counterparts, it can be implemented without royalty payments, and the chip design can be modified to suit one's own needs, which resembles the open-source principle in software.

The architecture is applied in embedded systems, microcontrollers for the Internet of Things, and sensors where energy efficiency is important. AI accelerators, specialized coprocessors for data centers, and automotive controllers are built on its basis. The technology also finds a place in academia for teaching processor design and in spacecraft due to the ability to create radiation-hardened solutions without external licensing restrictions.

The main difficulty is tied to ecosystem fragmentation: the freedom to modify the basic instruction set can lead to software incompatibility between different chips when extensions are managed incorrectly. Compiler performance and code density may still lag behind proprietary architectures deeply optimized over years. There is also a risk of hidden bugs in new custom extensions, requiring exceptionally thorough verification at all development stages.

How RISC-V works

The operating principle is built on a modular foundation: the mandatory base integer instruction set (RV32I or RV64I) contains only about 40 instructions, enough to serve as a compiler target for general-purpose code, while all additional capabilities including multiplication, floating-point numbers, atomic operations, and vector processing are relegated to standardized optional extensions. This is fundamentally different from the x86 architecture, where complex variable-length instructions are decoded into simple micro-operations inside the processor, hidden from the developer, which requires a large and comparatively power-hungry decoder. Unlike ARM, RISC-V is not tied to a specific vendor and does not require an architectural license fee, and because the ISA does not prescribe a microarchitecture, implementers are free to build anything from a simple in-order pipeline to experimental designs.

The architecture uses a small number of regular 32-bit instruction formats and a weak memory ordering model, in which fence instructions are applied explicitly for synchronization, leaving hardware free to reorder memory accesses in multiprocessor configurations. Privilege levels control access to system resources through control and status registers (CSRs), so the same core can switch between user application, operating system, and hypervisor modes, while the simplicity of the base logic keeps die area and static power consumption low compared to heavyweight CISC solutions.
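The modular structure described above is reflected in the ISA naming strings used by toolchains, such as rv64gc. The following is a minimal sketch of how such a string could be split into a base and its extensions. The function name parse_isa_string is hypothetical, and the sketch assumes multi-letter extensions are separated by underscores and carry no version suffixes, which is narrower than what real toolchains accept.

```python
import re

# "G" is the conventional shorthand for this general-purpose bundle.
G_EXPANSION = ["I", "M", "A", "F", "D", "Zicsr", "Zifencei"]


def parse_isa_string(isa: str) -> dict:
    """Split a string such as 'rv64imac_zicsr' into base width and extensions.

    Simplifying assumptions (not part of any specification): multi-letter
    extensions must be introduced by '_' and version suffixes are rejected.
    """
    match = re.fullmatch(r"rv(32|64|128)([a-z]+)((?:_[a-z][a-z0-9]*)*)", isa.lower())
    if match is None:
        raise ValueError(f"not a recognised ISA string: {isa!r}")
    xlen = int(match.group(1))
    single_letters, multi_letter = match.group(2), match.group(3)

    extensions = []
    for letter in single_letters:
        expanded = G_EXPANSION if letter == "g" else [letter.upper()]
        for ext in expanded:
            if ext not in extensions:
                extensions.append(ext)
    for name in filter(None, multi_letter.split("_")):
        ext = name.capitalize()
        if ext not in extensions:
            extensions.append(ext)

    # The first entry must be the base integer set: I, or the reduced E variant.
    if extensions[0] not in ("I", "E"):
        raise ValueError("the base integer set (I or E) must come first")
    return {
        "xlen": xlen,
        "base": f"RV{xlen}{extensions[0]}",
        "extensions": extensions[1:],
    }


if __name__ == "__main__":
    print(parse_isa_string("rv64gc"))
    print(parse_isa_string("rv32imac_zicsr_zifencei"))
```

Everything after the base entry is optional, which is exactly where the compatibility questions discussed earlier arise: two chips with the same base can still differ in the extension list.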

RISC-V functionality

  1. Register file architecture. The base specification defines 31 general-purpose registers x1 through x31 and a zero register x0 hardwired to the constant 0. The register width equals the architecture width (XLEN = 32, 64, or 128), so an address or a native integer always fits in a single register.
  2. Zero register encoding. Register x0 is not a physical storage cell. Any write to x0 is ignored, and a read always returns zero. This makes register copies (addi rd, rs, 0), comparisons with zero, and constant generation possible without dedicated opcodes.
  3. Program counter as an implicit register. The instruction pointer pc is not part of the register file and cannot be written by ordinary instructions. Control flow changes only through conditional branches and jal, which compute the target address relative to the current pc value, and through jalr, which computes it from a base register; auipc allows code to form pc-relative addresses.
  4. Base integer set RV32I. The RV32I set contains 40 unique instructions sufficient to support a full-fledged compiler. It includes register-register operations, immediate value operations, conditional branches, unconditional jumps, word, halfword, and byte loads and stores, as well as the ecall environment call and the ebreak breakpoint instruction.
  5. Immediate value encoding. To minimize data path multiplexing, the I-, S-, B-, U-, and J-type formats keep each immediate bit in as few instruction positions as possible, and the sign bit is always instruction bit 31. The hardware decoder rearranges the bits according to a fixed scheme and sign-extends the result to XLEN width.
  6. Fixed-length instruction formats. Base instructions are always 32 bits long; the compressed extension adds 16-bit instructions. The two least significant bits of the first 16-bit parcel determine the length: values 00, 01, and 10 mark 16-bit instructions, and 11 marks an instruction of 32 bits or longer, which accelerates pre-decoding in superscalar cores.
  7. Compressed instruction set RV32C. Extension C provides 16-bit encodings for frequently used operations. Each compressed instruction corresponds to a single 32-bit equivalent, so implementations typically expand it at the decoding stage, transparently to the rest of the pipeline, increasing code density without modifying execution units.
  8. Semantics of arithmetic operations. There are no condition flags, so conditions are evaluated by separate instructions. The slt, sltu, slti, and sltiu instructions perform signed and unsigned comparisons and write a Boolean result, while add/sub/addi ignore carry and overflow without generating exceptions.
  9. Unconditional jump mechanism. The jal instruction writes the return address (pc+4) to register rd and adds to pc a signed displacement encoded as a 20-bit immediate in multiples of 2 bytes, giving a range of ±1 MiB. The register form jalr adds a 12-bit immediate to the contents of rs1 and clears the least significant bit of the result to align the jump address.
  10. Conditional branch organization. Branching is implemented by comparing two registers: beq, bne, blt, bge, bltu, bgeu. The offset is encoded in B-type format as a 12-bit signed immediate in multiples of 2 bytes, giving a range of ±4 KiB. Branch prediction remains a microarchitectural prerogative, not specified at the ISA level.
  11. Memory addressing and alignment. Loads and stores operate on bytes, halfwords, and words through lb/lbu/lh/lhu/lw/sw/sh/sb. The specification recommends natural alignment but allows unaligned access depending on the implementation, which may raise an exception or handle the transaction in hardware.
  12. Weak memory ordering model. By default the architecture uses the weak RVWMO model, in which the FENCE instruction explicitly orders I/O operations and memory accesses, and FENCE.I (extension Zifencei) synchronizes the instruction stream with earlier stores. The optional Ztso extension instead defines a total store ordering shared memory model.
  13. Atomic operation set. Extension A adds lr.w/sc.w instructions for building non-blocking synchronization primitives, as well as atomic amoswap, amoadd, amoand, amoor, amoxor, amomin, amomax and the unsigned variants amominu and amomaxu. Each of these performs its read-modify-write as a single atomic operation.
  14. Multiplication and division in extension M. RV32M adds mul, mulh, mulhu, mulhsu, div, divu, rem, remu operations. Signed division rounds toward zero, and division by zero returns a defined result instead of raising an exception. The extension itself is optional; in its absence, compilers fall back to software library routines with the same semantics.
  15. Machine Mode privilege level. The mandatory M-mode provides access to the mstatus status register, the mcause trap cause register, the mtval trap value register (which holds, for example, a faulting address), and the mtvec trap vector base address register. Interrupts are dispatched in direct or vectored mode depending on the mtvec configuration.
  16. Virtual memory and Sv39/Sv48. The privileged specification defines multi-level page tables. Sv39 mode uses three-level translation of 39-bit virtual addresses to 56-bit physical addresses, and Sv48 adds a fourth level for 48-bit virtual addresses. The hardware TLB caches translations and is flushed with the sfence.vma instruction.
  17. Trap and interrupt handling. Upon taking a trap into M-mode, the hardware saves pc to mepc, writes the cause to mcause (with the top bit set for interrupts and clear for exceptions), and jumps to the address from mtvec. Saving the general-purpose registers is left to the software handler, which avoids a costly implicit save of all registers on every trap entry.
  18. Floating-point operations. Extensions F and D operate on single- and double-precision scalar values in the f0 through f31 register file. The fmadd, fmsub, fnmsub, and fnmadd instructions are fused: the product is kept at full precision and rounded only once, after the addition, conforming to the IEEE 754-2008 standard.
  19. Vector extension RVV 1.0. The architecture defines vector registers v0 through v31 whose length VLEN is chosen by the implementation and can reach 65536 bits. The vsetvl family of instructions sets the active vector length in the vl register and the element width and register grouping in vtype. Instructions operate with masking under v0 and support strided, indexed, and segmented addressing.
  20. Cryptographic instructions Zk. The Zknd, Zkne, Zknh, Zksed, and Zksh extensions introduce hardware acceleration for AES decryption, AES encryption, SHA-2 hashing, SM4, and SM3 respectively. The instructions perform substitutions, permutations, non-linear transformations, and key schedule steps in far fewer cycles than an equivalent sequence of base instructions.
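Several of the encoding rules in the list above (the hardwired zero register, instruction length detection, immediate reassembly, and the jalr alignment rule) can be illustrated with a short decoder. This is a minimal sketch and not a reference implementation: all function and class names are hypothetical, only 16-bit and 32-bit lengths are distinguished, and bit positions follow the standard 32-bit instruction formats.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def instruction_length(first_parcel: int) -> int:
    """Length in bytes, from the two lowest bits of the first 16-bit parcel.
    Assumption: encodings longer than 32 bits are not considered here."""
    return 4 if (first_parcel & 0b11) == 0b11 else 2


def imm_i(insn: int) -> int:
    return sign_extend(insn >> 20, 12)


def imm_s(insn: int) -> int:
    return sign_extend(((insn >> 25) << 5) | ((insn >> 7) & 0x1F), 12)


def imm_b(insn: int) -> int:
    imm = ((((insn >> 31) & 1) << 12) | (((insn >> 7) & 1) << 11)
           | (((insn >> 25) & 0x3F) << 5) | (((insn >> 8) & 0xF) << 1))
    return sign_extend(imm, 13)  # bit 0 is implicitly zero


def imm_u(insn: int) -> int:
    return sign_extend(insn & 0xFFFFF000, 32)


def imm_j(insn: int) -> int:
    imm = ((((insn >> 31) & 1) << 20) | (((insn >> 12) & 0xFF) << 12)
           | (((insn >> 20) & 1) << 11) | (((insn >> 21) & 0x3FF) << 1))
    return sign_extend(imm, 21)  # bit 0 is implicitly zero


def jalr_target(rs1_value: int, imm: int, xlen: int = 32) -> int:
    """jalr adds the immediate to rs1 and clears the least significant bit."""
    return ((rs1_value + imm) & ~1) & ((1 << xlen) - 1)


class RegisterFile:
    """32 integer registers; writes to x0 are discarded."""

    def __init__(self, xlen: int = 32):
        self.mask = (1 << xlen) - 1
        self._regs = [0] * 32

    def read(self, index: int) -> int:
        return self._regs[index]

    def write(self, index: int, value: int) -> None:
        if index != 0:
            self._regs[index] = value & self.mask


if __name__ == "__main__":
    print(imm_i(0xFFF00093))  # addi x1, x0, -1
    print(imm_b(0xFE000EE3))  # beq x0, x0, -4
    print(imm_j(0x008000EF))  # jal x1, 8
```

In every format the sign of the immediate comes from instruction bit 31, which is why each helper can hand the reassembled value to the same sign_extend routine.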

Comparisons

  • RISC-V vs ARM. RISC-V is an open modular architecture without royalty payments, allowing the instruction set to be freely extended, whereas ARM is a proprietary platform whose cores and architecture must be licensed and whose instruction set is controlled by a single company. This distinction changes the economics of development, lowering entry barriers and giving the designer full control over the microarchitecture without dependence on the terms of a license agreement.
  • RISC-V vs x86. The RISC-V architecture is built on the principles of RISC design with simple instruction decoding, unlike x86 with its complex CISC nature and hardware translation into micro-operations. Starting without a backward-compatibility legacy allowed RISC-V to avoid redundancy accumulated over decades, which favors compact cores and a more predictable pipeline without the need to support obsolete addressing modes.
  • RISC-V vs MIPS. Unlike the commercially unstable MIPS architecture, which went through a series of acquisitions, RISC-V is governed by the global non-profit organization RISC-V International. Although both systems are based on the load-store concept, RISC-V eliminated such MIPS shortcomings as branch delay slots and general-purpose registers reserved for interrupt handlers, replacing them with a modern compact design oriented towards modularity and compilation efficiency.
  • RISC-V vs OpenRISC. OpenRISC was an earlier attempt to create a free processor core, released under the LGPL license, whereas RISC-V offered a more elaborate instruction set specification backed by a formal model. The key advantage of RISC-V was industry support, which enabled the creation of mature toolchains and compilers, while OpenRISC remained largely an academic and hobbyist project without wide adoption in commercial mass products.
  • RISC-V vs SPARC. The SPARC architecture, also once released for open use, employed a complex register file with overlapping windows, which created overhead during context switching, whereas RISC-V uses a flat register file that simplifies microarchitecture design. RISC-V revised the approach to open hardware, making it accessible not only for server solutions but also for microcontrollers with aggressive power saving.

RISC-V and OS

The RISC-V architecture provides several privilege levels (M, S, U), where Machine Mode manages hardware resources, Supervisor Mode executes the OS kernel, and User Mode isolates applications. The kernel requests machine-level services such as timers and inter-processor interrupts through the SBI (Supervisor Binary Interface) and manages virtual memory directly through the satp register, on top of which it can offer standard POSIX interfaces. The Linux kernel with RISC-V support typically uses the PLIC interrupt controller and the CLINT timer, and peripheral device drivers discover hardware through device trees; SoC implementations commonly connect these devices over openly specified interconnects such as TileLink or AXI, which makes it possible to avoid proprietary binary blobs.
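To make the virtual memory management mentioned above more concrete, the following sketch decodes an RV64 satp value and walks a three-level Sv39 page table. It is deliberately simplified: permission bits, accessed/dirty updates, and the canonical-address check are omitted, and the names PageFault, decode_satp, and translate_sv39, as well as the read_pte callback, are hypothetical.

```python
SATP_MODE_SV39 = 8
PTE_V, PTE_R, PTE_W, PTE_X = 1 << 0, 1 << 1, 1 << 2, 1 << 3
PPN_MASK = (1 << 44) - 1


class PageFault(Exception):
    """Raised when the simplified walk cannot produce a translation."""


def decode_satp(satp: int) -> dict:
    """Split an RV64 satp value into MODE, ASID, and root table PPN."""
    return {
        "mode": (satp >> 60) & 0xF,
        "asid": (satp >> 44) & 0xFFFF,
        "root_ppn": satp & PPN_MASK,
    }


def translate_sv39(satp: int, vaddr: int, read_pte) -> int:
    """Translate vaddr using read_pte(physical_address) -> 64-bit PTE."""
    fields = decode_satp(satp)
    if fields["mode"] != SATP_MODE_SV39:
        raise ValueError("this sketch only handles Sv39")
    table = fields["root_ppn"] << 12
    for level in (2, 1, 0):
        vpn = (vaddr >> (12 + 9 * level)) & 0x1FF  # 9 index bits per level
        pte = read_pte(table + vpn * 8)            # each PTE is 8 bytes
        if not pte & PTE_V or (pte & PTE_W and not pte & PTE_R):
            raise PageFault(hex(vaddr))
        ppn = (pte >> 10) & PPN_MASK
        if pte & (PTE_R | PTE_X):                  # leaf entry
            offset_bits = 12 + 9 * level           # larger for superpages
            if ppn & ((1 << (9 * level)) - 1):
                raise PageFault("misaligned superpage")
            offset = vaddr & ((1 << offset_bits) - 1)
            return (ppn << 12) | offset
        table = ppn << 12                          # pointer to next level
    raise PageFault(hex(vaddr))


if __name__ == "__main__":
    # Hypothetical page tables held in a dict keyed by physical address.
    memory = {
        0x80000008: (0x80001 << 10) | PTE_V,
        0x80001010: (0x80002 << 10) | PTE_V,
        0x80002018: (0x12345 << 10) | PTE_V | PTE_R | PTE_W,
    }
    satp = (SATP_MODE_SV39 << 60) | 0x80000
    paddr = translate_sv39(satp, 0x40403ABC, lambda a: memory.get(a, 0))
    print(hex(paddr))
```

A real kernel performs this walk only when building or inspecting tables; during normal execution the hardware page-table walker and the TLB do the same work.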

Security in the RISC-V ecosystem

Security functions are based on the standard PMP (Physical Memory Protection) mechanism, in which Machine Mode configures memory regions with read, write, and execute attributes. These regions constrain accesses from Supervisor and User Mode and, when an entry is locked, from Machine Mode itself, preventing unauthorized access even by privileged code. The additional cryptography extension (K-extension) implements hardware acceleration of AES and SHA algorithms through special instructions working directly with the register file, while the upcoming IOPMP specification is intended to strengthen protection against DMA-capable devices. Trusted Execution Environments are built on enclave isolation on top of these memory protection mechanisms, with the aim of keeping the trusted computing base small.
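As an illustration of how a PMP region is described, the sketch below encodes a naturally aligned power-of-two (NAPOT) region into a pmpaddr value and builds the matching configuration byte. The bit positions follow the privileged specification; the helper names napot_encode, napot_decode, and pmpcfg_byte are hypothetical, and writing the values to the actual CSRs is outside the scope of the example.

```python
PMP_R, PMP_W, PMP_X = 1 << 0, 1 << 1, 1 << 2  # permission bits
PMP_A_SHIFT = 3                                # address-matching mode field
PMP_A_NAPOT = 3                                # naturally aligned power of two
PMP_L = 1 << 7                                 # lock bit


def napot_encode(base: int, size: int) -> int:
    """Return the pmpaddr value for a region of `size` bytes at `base`."""
    if size < 8 or size & (size - 1):
        raise ValueError("NAPOT size must be a power of two of at least 8")
    if base % size:
        raise ValueError("base must be aligned to the region size")
    # pmpaddr holds address bits [..:2]; trailing ones encode the size.
    return (base >> 2) | ((size >> 3) - 1)


def napot_decode(pmpaddr: int) -> tuple:
    """Recover (base, size) from a NAPOT-encoded pmpaddr value."""
    trailing_ones = 0
    while (pmpaddr >> trailing_ones) & 1:
        trailing_ones += 1
    size = 1 << (trailing_ones + 3)
    base = (pmpaddr >> (trailing_ones + 1)) << (trailing_ones + 3)
    return base, size


def pmpcfg_byte(read: bool, write: bool, execute: bool, locked: bool) -> int:
    """Build one 8-bit pmpcfg entry in NAPOT mode."""
    value = PMP_A_NAPOT << PMP_A_SHIFT
    value |= PMP_R if read else 0
    value |= PMP_W if write else 0
    value |= PMP_X if execute else 0
    value |= PMP_L if locked else 0
    return value


if __name__ == "__main__":
    addr = napot_encode(0x80000000, 0x10000)  # 64 KiB region
    print(hex(addr), [hex(v) for v in napot_decode(addr)])
    print(hex(pmpcfg_byte(read=True, write=False, execute=True, locked=True)))
```

Setting the lock bit is what extends the restriction to Machine Mode itself; without it, the entry only constrains the less privileged modes.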

Logging and debugging principles in RISC-V

The debug standard defines a Debug Module that is reached through a Debug Transport Module, typically over JTAG, and controls the core through abstract commands and an optional Program Buffer. Its optional System Bus Access path allows memory to be inspected without halting the core, so critical threads can keep running. Processor event logging is implemented through the Processor Trace Encoder, which generates packets with information about branches and instruction fetch and can export them through a high-speed serial link such as an Aurora-like interface. The Trigger Module specification allows setting hardware breakpoints on instruction fetch, memory operations, or system events with precise binding to the triggering instruction.

Limitations

Ecosystem fragmentation arises from the freedom to define custom extensions. It is addressed by platform profiles (the RVA and RVB families), which fix a set of mandatory extensions so that binaries can be shared across chips, and by letting machine-level software discover the implemented extensions, for example through the flags of the misa register. Latency in executing complex or specialized operations can be reduced by offloading logic to coprocessors attached through an interface such as RoCC with low added latency, while the immaturity of server peripherals is mitigated by the modular IOMMU specification implemented at the hardware level in modern SoC designs.
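The extension flags in misa mentioned above use one bit per letter (bit 0 for A through bit 25 for Z), with the native width encoded in the top two bits. A minimal decoding sketch follows; decode_misa is a hypothetical helper, and the sample value is one that an RV64GC core with Supervisor and User modes would plausibly report.

```python
MXL_TO_WIDTH = {1: 32, 2: 64, 3: 128}


def decode_misa(misa: int, xlen: int) -> tuple:
    """Return (native width, extension letters) for a misa value.

    `xlen` is the width of the register being read; the MXL field sits
    in its two most significant bits.
    """
    mxl = (misa >> (xlen - 2)) & 0b11
    letters = "".join(
        chr(ord("A") + bit) for bit in range(26) if (misa >> bit) & 1
    )
    # An implementation may return zero for misa, in which case the
    # extensions must be discovered by other means.
    return MXL_TO_WIDTH.get(mxl), letters


if __name__ == "__main__":
    print(decode_misa(0x800000000014112D, 64))
```

Only single-letter extensions are visible this way, so multi-letter extensions such as Zicsr still have to be described by a profile or by platform firmware.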

History of RISC-V

The architecture originated in 2010 at the University of California, Berkeley as the fifth generation of Berkeley RISC designs for educational and research purposes, with a concise base instruction set intended to avoid the microarchitecture-specific constraints of its predecessors. RISC-V International standardized the base modular specifications and the vector extension, which supported the transition from microcontroller implementations to multiprocessor configurations with cache coherence based on TileLink, and introduced the hypervisor mode (H-extension) for full virtualization in Linux environments.