PCIe is widely used across modern computing systems, powering devices such as SSDs, GPUs, and network interfaces.
This article provides a structured overview of PCIe with an emphasis on practical understanding. It covers key concepts including topology, protocol layers, BARs, DMA, and interrupt mechanisms.
What is PCIe
PCIe (Peripheral Component Interconnect Express) is a high-speed serial bus standard used to connect peripheral devices to the main processor.
The older PCI bus was parallel—it had 32 or 64 data lines all switching simultaneously. However, parallel buses have a fundamental limitation at high frequencies: signals on different wires arrive at slightly different times (called skew), making it difficult to scale to higher speeds.
PCIe uses a high-speed serialized bitstream over differential pairs of wires (called lanes). Each lane consists of one differential pair for transmit and one for receive. Because each direction of a lane is a single differential pair with the clock embedded in the data, there is no wire-to-wire skew within a lane, so PCIe can operate at much higher frequencies. Skew between the lanes of a multi-lane link is compensated by the receiver.
| Feature | PCI | PCIe |
|---|---|---|
| Signal Type | Parallel (32/64 wires) | Serial (differential pair per lane) |
| Topology | Shared bus | Point-to-point (switched fabric) |
| Max Bandwidth | ~533 MB/s (64-bit, 66 MHz) | ~63 GB/s per direction, ~126 GB/s bidirectional (Gen5 x16) |
| Interrupt | IRQ Lines | MSI / MSI-X |
PCIe uses a point-to-point topology. Each device connects through a dedicated link to the Root Complex or via switches. There is no shared electrical bus; instead, PCIe forms a hierarchical switched interconnect.
Basic PCIe Topology
+-----------------------------------------------+
|                      SoC                      |
|                                               |
|  +-------+            +-------------------+   |
|  |  CPU  |<---------->| Root Complex (RC) |   |
|  +-------+  AXI bus   |                   |   |
|                       |                   |   |
|  +-------+            |                   |   |
|  | DRAM  |<---------->|                   |   |
|  +-------+  AXI bus   +---------^---------+   |
|                                 |             |
+---------------------------------|-------------+
                                  |
                        =====================
                              PCIe Link
                        =====================
                                  |
                        +---------v---------+
                        |   Endpoint (EP)   |
                        +-------------------+
Key Components:
Root Complex (RC):
The Root Complex is the PCIe host controller inside the SoC. It acts as a bridge between the CPU’s memory bus (AXI/AHB) and the PCIe fabric. It:
- Initiates configuration space reads/writes during enumeration
- Translates CPU memory accesses into PCIe TLPs (Transaction Layer Packets)
- Receives TLPs from endpoints and converts them into memory transactions
Endpoint (EP):
An endpoint is a device at the edge of the PCIe fabric, such as an SSD, GPU, or NIC. It:
- Exposes registers and memory regions via BARs (Base Address Registers)
- Can initiate DMA transfers to/from host memory
- Signals the CPU using MSI/MSI-X interrupts
PCIe Switch (optional):
A PCIe switch expands a single upstream port into multiple downstream ports, allowing multiple endpoints to connect. This creates a hierarchical topology rather than a simple bus.
PCIe Link Layers
PCIe uses a layered architecture:
+------------------------+
| Transaction Layer (TL) |
+------------------------+
| Data Link Layer (DLL) |
+------------------------+
| Physical Layer (PL) |
+------------------------+
Physical Layer (PL):
- Transmits serialized data over differential pairs (TX+/TX- and RX+/RX-)
- Uses line encoding (8b/10b for Gen1/2, 128b/130b with scrambling for Gen3 through Gen5) so the signal has enough transitions for the receiver to recover the clock
- Performs link training using LTSSM (Link Training and Status State Machine), negotiating lane width, speed, and equalization parameters
Data Link Layer (DLL):
- Adds sequence numbers and LCRC (Link CRC) to ensure data integrity
- Implements ACK/NAK-based retransmission for reliable delivery
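The sequence-number and ACK/NAK behaviour described above can be sketched as a small simulation. This is a simplified model, not the protocol as specified: `zlib.crc32` stands in for the real LCRC computation, the NAK carries no sequence number, and the class and method names (`Transmitter`, `Receiver`, `on_ack`, `on_nak`) are made up for illustration.

```python
import zlib
from collections import OrderedDict

SEQ_MODULO = 4096  # PCIe sequence numbers are 12 bits wide


def lcrc(seq: int, tlp: bytes) -> int:
    """Stand-in for the LCRC: a CRC-32 over the sequence number and TLP bytes."""
    return zlib.crc32(seq.to_bytes(2, "big") + tlp)


class Transmitter:
    def __init__(self):
        self.next_seq = 0
        self.replay_buffer = OrderedDict()  # seq -> TLP bytes awaiting an ACK

    def send(self, tlp: bytes):
        seq = self.next_seq
        self.next_seq = (seq + 1) % SEQ_MODULO
        self.replay_buffer[seq] = tlp  # keep a copy until it is acknowledged
        return seq, tlp, lcrc(seq, tlp)

    def on_ack(self, seq: int):
        # An ACK covers this TLP and every older one still in the buffer.
        if seq not in self.replay_buffer:
            return
        for buffered_seq in list(self.replay_buffer):
            del self.replay_buffer[buffered_seq]
            if buffered_seq == seq:
                break

    def on_nak(self):
        # A NAK triggers a replay of everything still unacknowledged.
        return [(s, t, lcrc(s, t)) for s, t in self.replay_buffer.items()]


class Receiver:
    def __init__(self):
        self.expected_seq = 0

    def receive(self, seq: int, tlp: bytes, crc: int) -> str:
        if crc != lcrc(seq, tlp) or seq != self.expected_seq:
            return "NAK"
        self.expected_seq = (seq + 1) % SEQ_MODULO
        return "ACK"


tx, rx = Transmitter(), Receiver()
seq, tlp, crc = tx.send(b"payload")

# Flip one bit in flight: the CRC check fails and the receiver NAKs.
corrupted = bytes([tlp[0] ^ 0x01]) + tlp[1:]
first_result = rx.receive(seq, corrupted, crc)

# The transmitter replays from its buffer; this time the TLP is accepted.
replayed = tx.on_nak()
second_result = rx.receive(*replayed[0])
tx.on_ack(seq)
```

The key design point is the replay buffer: a TLP is only discarded by the sender once the other side has confirmed it, which is what makes the link reliable from the Transaction Layer's point of view.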
Transaction Layer (TL):
- Creates and processes Transaction Layer Packets (TLPs)
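- Supports different types of transactions:
  - Memory Read (Non-posted): Request is sent, and a Completion TLP returns data
  - Memory Write (Posted): Sent without requiring a completion
  - Configuration Read/Write (Non-posted): Used during enumeration
  - Message TLPs: Used for legacy INTx emulation, power management events, and error signaling. MSI/MSI-X interrupts are not Message TLPs; they are delivered as Memory Write TLPs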
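To make the posted/non-posted distinction more concrete, here is a minimal sketch that packs the 3-DWORD header of a 32-bit memory request. It covers only the fields needed for the example (Fmt, Type, Length, Requester ID, Tag, byte enables, address) and leaves traffic class, attributes, and the optional digest at zero. The function name `mem_request_header` and the example IDs are assumptions for illustration.

```python
import struct

FMT_3DW_NO_DATA = 0b000    # 3-DWORD header, no payload (Memory Read)
FMT_3DW_WITH_DATA = 0b010  # 3-DWORD header, payload follows (Memory Write)
TYPE_MEM = 0b00000         # Type code shared by memory read and write requests


def mem_request_header(is_write, requester_id, tag, address, length_dw):
    """Pack a 3DW memory request header (32-bit address) as 12 big-endian bytes."""
    if address & 0x3 or address >= 1 << 32:
        raise ValueError("address must be DWORD-aligned and fit in 32 bits")
    if not 1 <= length_dw <= 1024:
        raise ValueError("length must be 1..1024 DWORDs")

    fmt = FMT_3DW_WITH_DATA if is_write else FMT_3DW_NO_DATA
    # Length is a 10-bit field; 1024 DWORDs is encoded as 0.
    dw0 = (fmt << 29) | (TYPE_MEM << 24) | (length_dw & 0x3FF)

    first_be = 0xF                            # all four bytes of the first DWORD
    last_be = 0xF if length_dw > 1 else 0x0   # must be 0 for single-DWORD requests
    dw1 = (requester_id << 16) | (tag << 8) | (last_be << 4) | first_be

    dw2 = address & 0xFFFFFFFC
    return struct.pack(">III", dw0, dw1, dw2)


def expects_completion(is_write):
    """Memory writes are posted; memory reads are answered by a Completion TLP."""
    return not is_write


# Hypothetical requester 01:00.0 (bus 1, device 0, function 0) using tag 5.
read_hdr = mem_request_header(False, 0x0100, 5, 0x1000, 1)
write_hdr = mem_request_header(True, 0x0100, 5, 0x1000, 4)
```

The Tag and Requester ID matter mostly for reads: the completer copies them into the Completion TLP so the requester can match returned data to the outstanding request.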
PCIe Lanes and Link Speed
A PCIe lane consists of one differential pair for transmit and one for receive, enabling full-duplex communication.
| Link Width | Notation | Typical Use Case |
|---|---|---|
| 1 lane | x1 | Low-bandwidth peripherals |
| 4 lanes | x4 | SSDs, NICs |
| 8 lanes | x8 | High-performance NICs |
| 16 lanes | x16 | Graphics cards (GPUs) |
PCIe generations define per-lane throughput:
| Generation | Raw Rate | Effective Bandwidth per Lane |
|---|---|---|
| Gen1 | 2.5 GT/s | 250 MB/s |
| Gen2 | 5 GT/s | 500 MB/s |
| Gen3 | 8 GT/s | ~985 MB/s |
| Gen4 | 16 GT/s | ~1969 MB/s |
| Gen5 | 32 GT/s | ~3938 MB/s |
Gen1/Gen2 use 8b/10b encoding, while Gen3 through Gen5 use 128b/130b encoding, which cuts the encoding overhead from 20% to about 1.5%.
Example:
A PCIe Gen3 x2 link provides ~2 GB/s bandwidth in each direction.
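The per-lane figures in the table follow directly from the raw rate and the encoding efficiency. The short calculation below reproduces them; the function names are illustrative, and the result ignores TLP and DLLP protocol overhead, so real throughput is lower.

```python
# generation -> (raw rate in transfers/s, payload bits, total bits per encoded block)
GENERATIONS = {
    1: (2.5e9, 8, 10),
    2: (5.0e9, 8, 10),
    3: (8.0e9, 128, 130),
    4: (16.0e9, 128, 130),
}


def lane_bandwidth_mb_s(gen):
    """Effective one-direction bandwidth of a single lane in MB/s."""
    rate, payload_bits, total_bits = GENERATIONS[gen]
    usable_bits_per_s = rate * payload_bits / total_bits
    return usable_bits_per_s / 8 / 1e6  # bits -> bytes -> megabytes


def link_bandwidth_mb_s(gen, lanes):
    """Effective one-direction bandwidth of a link with the given lane count."""
    return lane_bandwidth_mb_s(gen) * lanes


gen3_x2 = link_bandwidth_mb_s(3, 2)  # about 1969 MB/s, i.e. roughly 2 GB/s
```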
PCIe Configuration Space
PCIe devices expose a configuration space used by the host (Root Complex) to discover devices, read capabilities, and configure them.
The first 256 bytes follow the legacy PCI configuration format for backward compatibility. PCIe extends this to a total of 4 KB.
Standard 256-byte PCI Configuration Space (Type 0 header, used by endpoints):
00: VendorID DeviceID
04: Command Status
08: RevID ProgIF SubClass ClassCode
0C: CacheLine LatTimer HeaderType BIST
10: BAR0
14: BAR1
18: BAR2
1C: BAR3
20: BAR4
24: BAR5
28: CardBus CIS
2C: SubVendor SubDevice
30: Expansion ROM
34: CapPtr
38: Reserved
3C: IRQLine IRQPin MinGnt MaxLat
40-FF: Capabilities / Device specific registers
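As an illustration of the layout above, the following sketch decodes a few header fields from a raw configuration-space dump. The buffer here is synthetic, with made-up vendor and device IDs; the helper name `parse_config_header` is also an assumption. Configuration space is little-endian, hence the `<` in the `struct` format strings.

```python
import struct


def parse_config_header(cfg: bytes) -> dict:
    """Decode selected fields of a Type 0 configuration header."""
    if len(cfg) < 64:
        raise ValueError("need at least the 64-byte header")

    vendor_id, device_id, command, status = struct.unpack_from("<HHHH", cfg, 0x00)
    rev_id, prog_if, subclass, class_code = struct.unpack_from("<BBBB", cfg, 0x08)
    header_type = cfg[0x0E]
    bars = struct.unpack_from("<6I", cfg, 0x10)  # only meaningful for Type 0 headers

    return {
        "vendor_id": vendor_id,
        "device_id": device_id,
        "class_code": class_code,
        "subclass": subclass,
        "prog_if": prog_if,
        "header_type": header_type & 0x7F,        # bits 6:0 select the layout
        "multi_function": bool(header_type & 0x80),
        "bars": list(bars),
        "cap_ptr": cfg[0x34],
    }


# Build a synthetic 256-byte configuration space for a hypothetical NVMe device.
cfg = bytearray(256)
struct.pack_into("<HH", cfg, 0x00, 0x1234, 0x5678)  # made-up vendor/device IDs
cfg[0x08:0x0C] = bytes([0x01, 0x02, 0x08, 0x01])    # RevID, ProgIF, SubClass, ClassCode
cfg[0x0E] = 0x00                                    # Type 0 header, single function
struct.pack_into("<I", cfg, 0x10, 0xFE000004)       # BAR0: 64-bit memory BAR
cfg[0x34] = 0x40                                    # first capability at offset 0x40

header = parse_config_header(bytes(cfg))
```

On Linux, the same bytes can usually be obtained by reading the `config` file under `/sys/bus/pci/devices/<BDF>/`, so the parser can be pointed at a real device instead of the synthetic buffer.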
Header Types:
| HeaderType | Device Type |
|---|---|
| 0x00 | Endpoint |
| 0x01 | PCI-to-PCI Bridge |
Bridges include bus routing registers assigned during enumeration:
0x10 BAR0
0x14 BAR1
0x18 Primary Bus Number
0x19 Secondary Bus Number
0x1A Subordinate Bus Number
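How these three registers end up with their values is easiest to see with a depth-first walk of the hierarchy. The sketch below is a toy model, not real enumeration code: it only computes the final bus numbers and does not touch hardware. The `Function` class, `assign_bus_numbers`, and the device names are invented for the example; a switch is modelled as one upstream bridge with downstream bridges on its internal bus.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Function:
    name: str
    is_bridge: bool = False
    children: List["Function"] = field(default_factory=list)  # devices on the secondary bus
    primary: Optional[int] = None
    secondary: Optional[int] = None
    subordinate: Optional[int] = None


def assign_bus_numbers(devices, bus=0):
    """Depth-first bus numbering; returns the highest bus number in use."""
    highest = bus
    for dev in devices:
        if not dev.is_bridge:
            continue
        dev.primary = bus              # bus the bridge itself sits on
        dev.secondary = highest + 1    # bus directly behind the bridge
        highest = assign_bus_numbers(dev.children, dev.secondary)
        dev.subordinate = highest      # highest bus reachable through the bridge
    return highest


# Root port -> switch (one upstream port, two downstream ports) -> two endpoints.
ep0 = Function("nvme")
ep1 = Function("nic")
down0 = Function("switch-down0", True, [ep0])
down1 = Function("switch-down1", True, [ep1])
up = Function("switch-up", True, [down0, down1])
root_port = Function("root-port", True, [up])

highest_bus = assign_bus_numbers([root_port])
```

In this example the root port ends up with primary 0, secondary 1, subordinate 4, which tells it to forward configuration requests for buses 1 through 4 downstream.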
BAR - Base Address Register
A BAR (Base Address Register) allows a PCIe endpoint to expose memory or register regions to the host.
Each endpoint function can have up to 6 BARs (a 64-bit BAR occupies two consecutive BAR slots). Each BAR defines:
- Type: Memory (MMIO) or legacy I/O
- Size: Power-of-two region size
- Prefetchable: Indicates reads have no side effects and can be cached/prefetched
- Address Width: 32-bit or 64-bit
BAR sizing mechanism:
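During enumeration, the OS writes 0xFFFFFFFF to a BAR and reads it back. The device hardwires the low address bits to zero, so after masking off the type bits at the bottom of the register, the value read back is a mask whose two's complement gives the region size.
Example:
- Returned value: 0xFFFF0000
- Size: ~0xFFFF0000 + 1 = 0x10000 = 64 KB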
After sizing, the OS assigns a physical address. The driver maps it into virtual space using functions like pci_iomap().
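The sizing arithmetic described above can be written out as a small decoder for the value read back from a 32-bit BAR. This is a sketch under simplifying assumptions: it handles a single 32-bit register (for a 64-bit BAR the upper half has to be sized as well) and assumes an I/O BAR implements all 32 address bits. The function name `decode_bar` is illustrative.

```python
def decode_bar(readback: int):
    """Decode the value read back from a BAR after writing 0xFFFFFFFF to it."""
    if readback == 0:
        return None  # BAR not implemented

    if readback & 0x1:
        # I/O BAR: bits 1:0 are flag bits, the rest is the address mask.
        mask = readback & 0xFFFFFFFC
        info = {"type": "io"}
    else:
        # Memory BAR: bits 3:0 are flag bits, the rest is the address mask.
        mask = readback & 0xFFFFFFF0
        info = {
            "type": "memory",
            "is_64bit": ((readback >> 1) & 0x3) == 0x2,
            "prefetchable": bool(readback & 0x8),
        }

    # Invert the mask and add one to get the power-of-two size.
    info["size"] = (~mask & 0xFFFFFFFF) + 1
    return info


small = decode_bar(0xFFFF0000)   # 32-bit, non-prefetchable, 64 KB
large = decode_bar(0xFFF0000C)   # 64-bit, prefetchable, 1 MB
```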
MSI / MSI-X
Traditional PCI used dedicated interrupt lines (physical wires). PCIe replaces these with message-based interrupts.
How MSI works:
During setup, the device is programmed with:
- A target address that is decoded by the interrupt controller rather than by ordinary RAM
- A data value
When the device generates an interrupt:
- It sends a Memory Write TLP to that address with the data
- The CPU’s interrupt controller interprets this as an interrupt
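The flow above can be sketched end to end with a toy model. The address and data layout used here is the x86 one (writes into the 0xFEExxxxx window, destination APIC ID in address bits 19:12, vector in data bits 7:0); other architectures use different layouts. The class and function names are made up, and delivery-mode and trigger-mode bits are left at zero for simplicity.

```python
MSI_ADDRESS_BASE = 0xFEE00000  # x86 MSI address window (assumption: x86 platform)


def compose_msi(dest_apic_id: int, vector: int):
    """Build the address/data pair the OS programs into the device's MSI capability."""
    address = MSI_ADDRESS_BASE | ((dest_apic_id & 0xFF) << 12)
    data = vector & 0xFF  # fixed delivery mode, edge-triggered
    return address, data


class InterruptController:
    """Toy model of the logic that turns a memory write into an interrupt."""

    def __init__(self):
        self.pending = []  # list of (destination APIC ID, vector)

    def memory_write(self, address: int, data: int) -> bool:
        if address & 0xFFF00000 != MSI_ADDRESS_BASE:
            return False  # not in the MSI window: an ordinary memory write
        dest = (address >> 12) & 0xFF
        vector = data & 0xFF
        self.pending.append((dest, vector))
        return True


# Setup: the OS picks a CPU and vector and programs the device with the pair.
msi_address, msi_data = compose_msi(dest_apic_id=2, vector=0x41)

# Later: the device raises the interrupt by issuing a Memory Write TLP.
controller = InterruptController()
delivered = controller.memory_write(msi_address, msi_data)
```

Because the interrupt is just a posted write, it travels through the fabric in order behind any DMA writes the device issued earlier, which is one reason message-based interrupts are easier to reason about than sideband wires.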
DMA
DMA allows devices to directly access system memory without CPU involvement.
Types of DMA:
Inbound DMA (Device → Host):
Device sends Memory Write TLPs to host memory.
Outbound DMA (Device ← Host):
Device sends Memory Read TLPs and receives Completion data.
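A large inbound transfer does not travel as one packet. As a rough sketch, a device's DMA engine has to cut a write into Memory Write TLPs that respect the negotiated Max_Payload_Size and never cross a 4 KB address boundary. The function name and the 256-byte default below are assumptions for illustration.

```python
def split_dma_write(address: int, length: int, max_payload: int = 256):
    """Split a DMA write into (address, size) chunks, one per Memory Write TLP."""
    chunks = []
    while length > 0:
        to_boundary = 4096 - (address % 4096)  # bytes left before the next 4 KB boundary
        size = min(length, max_payload, to_boundary)
        chunks.append((address, size))
        address += size
        length -= size
    return chunks


# A 512-byte write starting 128 bytes below a 4 KB boundary.
chunks = split_dma_write(0x0F80, 512)
```

Here the first TLP is shortened to 128 bytes so that it stops at the boundary, and the remaining 384 bytes are sent as one 256-byte and one 128-byte TLP.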
Addressing:
- CPU uses virtual addresses (via MMU)
- Devices use:
  - Physical addresses, or
  - IOMMU-translated IO virtual addresses (IOVA)
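The IOVA case can be pictured as a second page table that sits between the device and memory. The following toy model uses a single-level table with 4 KB pages; real IOMMUs use multi-level tables and per-device contexts, and the `Iommu` class and its methods are invented for this sketch.

```python
PAGE_SIZE = 4096


class Iommu:
    """Toy single-level IOMMU: maps IOVA pages to physical pages."""

    def __init__(self):
        self.page_table = {}  # IOVA page number -> physical page number

    def map(self, iova: int, phys: int, length: int):
        if iova % PAGE_SIZE or phys % PAGE_SIZE:
            raise ValueError("mappings must be page-aligned")
        for offset in range(0, length, PAGE_SIZE):
            self.page_table[(iova + offset) // PAGE_SIZE] = (phys + offset) // PAGE_SIZE

    def translate(self, iova: int) -> int:
        page, offset = divmod(iova, PAGE_SIZE)
        if page not in self.page_table:
            # A real IOMMU would block the access and report a fault.
            raise PermissionError(f"IOMMU fault: IOVA {iova:#x} is not mapped")
        return self.page_table[page] * PAGE_SIZE + offset


iommu = Iommu()
# The driver maps an 8 KB buffer and hands the IOVA (not the physical address) to the device.
iommu.map(iova=0x10000, phys=0x80000000, length=2 * PAGE_SIZE)

phys_first_page = iommu.translate(0x10010)
phys_second_page = iommu.translate(0x11004)
```

Besides translation, the lookup doubles as protection: a device that issues a DMA to an address the driver never mapped is stopped instead of silently corrupting memory.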