Ripan Deuri

PCIe Overview

PCIe is widely used across modern computing systems, powering devices such as SSDs, GPUs, and network interfaces.

This article provides a structured overview of PCIe with an emphasis on practical understanding. It covers key concepts including topology, protocol layers, BARs, DMA, and interrupt mechanisms.

What is PCIe

PCIe (Peripheral Component Interconnect Express) is a high-speed serial bus standard used to connect peripheral devices to the main processor.

The older PCI bus was parallel: it had 32 or 64 multiplexed address/data lines all switching simultaneously. Parallel buses have a fundamental limitation at high frequencies: signals on different wires arrive at slightly different times (called skew), which makes it difficult to scale to higher speeds.

PCIe instead sends a high-speed serial bitstream over differential pairs grouped into lanes. Each lane consists of one differential pair for transmit and one for receive. Because each lane carries its own embedded clock, bits no longer have to arrive aligned across many parallel wires, so PCIe can run at much higher frequencies.

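| Feature       | PCI                          | PCIe                                 |
|---------------|------------------------------|--------------------------------------|
| Signal Type   | Parallel (32/64 wires)       | Serial (differential pair per lane)  |
| Topology      | Shared bus                   | Point-to-point (switched fabric)     |
| Max Bandwidth | ~533 MB/s (64-bit, 66 MHz)   | ~63 GB/s per direction (Gen5 x16)    |
| Interrupt     | IRQ lines                    | MSI / MSI-X                          |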

PCIe uses a point-to-point topology. Each device connects through a dedicated link to the Root Complex or via switches. There is no shared electrical bus; instead, PCIe forms a hierarchical switched interconnect.

Basic PCIe Topology

+-----------------------------------------------+
|                   SoC                         |
|                                               |
|  +-------+            +-------------------+   |
|  |  CPU  |<---------->| Root Complex (RC) |   |
|  +-------+  AXI bus   |                   |   |
|                       |                   |   |
|  +-------+            |                   |   |
|  | DRAM  |<---------->|                   |   |
|  +-------+  AXI bus   +---------^---------+   |
|                                 |             |
+---------------------------------|-------------+
                                  |
                        =====================
                              PCIe Link
                        =====================
                                  |
                        +---------v---------+
                        |   Endpoint (EP)   |
                        +-------------------+

Key Components:

Root Complex (RC):
The Root Complex is the PCIe host controller inside the SoC. It acts as a bridge between the CPU’s memory bus (AXI/AHB) and the PCIe fabric. It:

  • Initiates configuration space reads/writes during enumeration
  • Translates CPU memory accesses into PCIe TLPs (Transaction Layer Packets)
  • Receives TLPs from endpoints and converts them into memory transactions

Endpoint (EP):
An endpoint is a leaf device on the fabric, such as an SSD, GPU, or NIC. It:

  • Exposes registers and memory regions via BARs (Base Address Registers)
  • Can initiate DMA transfers to/from host memory
  • Signals the CPU using MSI/MSI-X interrupts

PCIe Switch (optional):
A PCIe switch expands a single upstream port into multiple downstream ports, allowing multiple endpoints to connect. This creates a hierarchical topology rather than a simple bus.

PCIe Link Layers

PCIe uses a layered architecture:

+------------------------+
| Transaction Layer (TL) |
+------------------------+
| Data Link Layer (DLL)  |
+------------------------+
| Physical Layer (PL)    |
+------------------------+

Physical Layer (PL):

  • Transmits serialized data over differential pairs (TX+/TX- and RX+/RX-)
  • Uses encoding schemes (8b/10b for Gen1/2, 128b/130b for Gen3 through Gen5) to embed clock information
  • Performs link training using LTSSM (Link Training and Status State Machine), negotiating lane width, speed, and equalization parameters

Data Link Layer (DLL):

  • Adds sequence numbers and LCRC (Link CRC) to ensure data integrity
  • Implements ACK/NAK-based retransmission for reliable delivery
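
The sequence-number, CRC, and ACK/NAK ideas above can be sketched as a toy model. This is a simplification under stated assumptions, not the real protocol: `zlib.crc32` stands in for the actual LCRC, a NAK simply replays everything still unacknowledged, and the `Sender`, `Receiver`, and `frame` names are made up for illustration.

```python
import zlib
from collections import OrderedDict

SEQ_MOD = 4096  # TLP sequence numbers are 12 bits wide


def frame(seq: int, tlp: bytes) -> bytes:
    """Prefix a 2-byte sequence number and append a 4-byte CRC (stand-in for LCRC)."""
    body = seq.to_bytes(2, "big") + tlp
    return body + zlib.crc32(body).to_bytes(4, "big")


class Sender:
    def __init__(self):
        self.next_seq = 0
        self.replay = OrderedDict()  # seq -> framed packet awaiting an ACK

    def send(self, tlp: bytes) -> bytes:
        pkt = frame(self.next_seq, tlp)
        self.replay[self.next_seq] = pkt  # keep a copy until it is acknowledged
        self.next_seq = (self.next_seq + 1) % SEQ_MOD
        return pkt

    def on_ack(self, seq: int) -> None:
        """An ACK covers every outstanding packet up to and including seq."""
        if seq not in self.replay:
            return
        for s in list(self.replay):
            del self.replay[s]
            if s == seq:
                break

    def on_nak(self) -> list:
        """Simplified NAK handling: resend everything still unacknowledged, in order."""
        return list(self.replay.values())


class Receiver:
    def __init__(self):
        self.expected = 0

    def receive(self, pkt: bytes) -> tuple:
        body, crc = pkt[:-4], int.from_bytes(pkt[-4:], "big")
        seq = int.from_bytes(body[:2], "big")
        if zlib.crc32(body) != crc or seq != self.expected:
            return ("NAK", None, None)  # corrupted or out-of-order packet
        self.expected = (self.expected + 1) % SEQ_MOD
        return ("ACK", seq, body[2:])   # hand the TLP up to the Transaction Layer
```

The point of the replay buffer is that the Transaction Layer never sees a lost or corrupted packet: recovery happens entirely between the two link partners.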

Transaction Layer (TL):

  • Creates and processes Transaction Layer Packets (TLPs)
  • Supports different types of transactions:

    • Memory Read (Non-posted): Request is sent, and a Completion TLP returns data
    • Memory Write (Posted): Sent without requiring a completion
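    • Configuration Read/Write (Non-posted): Used during enumeration and device setup
    • Message TLPs: Carry in-band events such as legacy INTx emulation, power management, and error reporting (MSI/MSI-X interrupts are sent as Memory Writes, not Messages)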
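
To make the transaction types concrete, here is a minimal sketch that classifies a TLP from the first byte of its header, where the Fmt field occupies bits 7:5 and the Type field bits 4:0. The table covers only the common encodings, and the `classify_tlp` name and return format are illustrative choices.

```python
# (Fmt, Type) encodings from byte 0 of the TLP header.
TLP_TYPES = {
    (0b000, 0b00000): ("MRd", "non-posted"),     # memory read, 32-bit address
    (0b001, 0b00000): ("MRd", "non-posted"),     # memory read, 64-bit address
    (0b010, 0b00000): ("MWr", "posted"),         # memory write, 32-bit address
    (0b011, 0b00000): ("MWr", "posted"),         # memory write, 64-bit address
    (0b000, 0b00100): ("CfgRd0", "non-posted"),  # type 0 configuration read
    (0b010, 0b00100): ("CfgWr0", "non-posted"),  # type 0 configuration write
    (0b000, 0b01010): ("Cpl", "completion"),     # completion without data
    (0b010, 0b01010): ("CplD", "completion"),    # completion with data
}


def classify_tlp(byte0: int) -> tuple:
    """Return (name, category) for the first header byte of a TLP."""
    fmt = (byte0 >> 5) & 0x7
    typ = byte0 & 0x1F
    if typ >> 3 == 0b10 and fmt in (0b001, 0b011):
        # Type 10rrr is a message; the low three bits select the routing.
        return ("MsgD" if fmt == 0b011 else "Msg", "posted")
    return TLP_TYPES.get((fmt, typ), ("unknown", "unknown"))


for b in (0x00, 0x40, 0x4A, 0x34):
    print(hex(b), classify_tlp(b))
```

Posted requests need no response, while every non-posted request is eventually matched by a completion.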

PCIe Lanes and Link Speed

A PCIe lane consists of one differential pair for transmit and one for receive, enabling full-duplex communication.

| Link Width | Notation | Typical Use Case          |
|------------|----------|---------------------------|
| 1 lane     | x1       | Low-bandwidth peripherals |
| 4 lanes    | x4       | SSDs, NICs                |
| 8 lanes    | x8       | High-performance NICs     |
| 16 lanes   | x16      | Graphics cards (GPUs)     |

PCIe generations define per-lane throughput:

| Generation | Raw Rate  | Effective Bandwidth per Lane |
|------------|-----------|------------------------------|
| Gen1       | 2.5 GT/s  | 250 MB/s                     |
| Gen2       | 5 GT/s    | 500 MB/s                     |
| Gen3       | 8 GT/s    | ~985 MB/s                    |
| Gen4       | 16 GT/s   | ~1969 MB/s                   |

Gen1/Gen2 use 8b/10b encoding, while Gen3 through Gen5 use 128b/130b encoding, which cuts the encoding overhead from 20% to about 1.5%.

Example:
A PCIe Gen3 x2 link provides ~2 GB/s bandwidth in each direction.
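
The per-lane figures and the x2 example follow directly from the raw rate and the encoding efficiency. A minimal sketch of that arithmetic, ignoring TLP and DLLP protocol overhead (the function name and the `GENERATIONS` table are illustrative):

```python
# Raw signaling rate (GT/s) and line-encoding efficiency for each generation.
GENERATIONS = {
    1: (2.5, 8 / 10),
    2: (5.0, 8 / 10),
    3: (8.0, 128 / 130),
    4: (16.0, 128 / 130),
}


def effective_bandwidth_mb_s(gen: int, lanes: int = 1) -> float:
    """Approximate one-direction bandwidth in MB/s after encoding overhead."""
    rate_gt_s, efficiency = GENERATIONS[gen]
    # Each transfer carries one bit per lane; divide by 8 to convert bits to bytes.
    return rate_gt_s * 1000 * efficiency / 8 * lanes


print(round(effective_bandwidth_mb_s(3, lanes=2)))  # Gen3 x2, one direction
```

Real-world throughput is somewhat lower, because packet headers, flow control, and acknowledgements also consume link bandwidth.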

PCIe Configuration Space

PCIe devices expose a configuration space used by the host (Root Complex) to discover devices, read capabilities, and configure them.

The first 256 bytes follow the legacy PCI configuration format for backward compatibility. PCIe extends this to a total of 4 KB.

Standard 256-byte PCI Configuration Space:

00: VendorID  DeviceID
04: Command   Status
08: RevID ProgIF SubClass ClassCode
0C: CacheLine LatTimer HeaderType BIST

10: BAR0
14: BAR1
18: BAR2
1C: BAR3
20: BAR4
24: BAR5

28: CardBus CIS
2C: SubVendor  SubDevice
30: Expansion ROM
34: CapPtr
38: Reserved
3C: IRQLine IRQPin MinGnt MaxLat

40-FF: Capabilities / Device specific registers

Header Types:

| HeaderType | Device Type       |
|------------|-------------------|
| 0x00       | Endpoint          |
| 0x01       | PCI-to-PCI Bridge |
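
As a rough illustration of the layout above, the first 16 bytes of configuration space can be unpacked with Python's `struct` module. Configuration space is little-endian; the `CommonHeader` tuple and the helper names are assumptions made for this sketch.

```python
import struct
from collections import namedtuple

CommonHeader = namedtuple(
    "CommonHeader",
    "vendor_id device_id command status revision prog_if subclass class_code "
    "cache_line latency_timer header_type bist",
)


def parse_common_header(cfg: bytes) -> CommonHeader:
    """Decode offsets 0x00-0x0F, which are identical for endpoints and bridges."""
    return CommonHeader(*struct.unpack_from("<HHHHBBBBBBBB", cfg, 0))


def header_layout(hdr: CommonHeader) -> str:
    # Bit 7 of HeaderType flags a multi-function device; bits 6:0 select the layout.
    layouts = {0x00: "endpoint", 0x01: "pci-to-pci bridge"}
    return layouts.get(hdr.header_type & 0x7F, "other")


# Synthetic 256-byte config space with made-up IDs, for demonstration only.
cfg = struct.pack("<HHHHBBBBBBBB",
                  0x1234, 0x5678, 0x0006, 0x0010,
                  0x01, 0x00, 0x00, 0x02,
                  0x10, 0x00, 0x80, 0x00) + bytes(240)
hdr = parse_common_header(cfg)
print(hex(hdr.vendor_id), hex(hdr.device_id), header_layout(hdr))
```

On Linux, the raw bytes for a real device can typically be read from the `config` file in that device's directory under `/sys/bus/pci/devices/`.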

Bridges include bus routing registers assigned during enumeration:

0x10  BAR0
0x14  BAR1

0x18  Primary Bus Number
0x19  Secondary Bus Number
0x1A  Subordinate Bus Number
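
One way to picture how these bus numbers get assigned is a depth-first walk of the hierarchy. The sketch below is a simplified model, not an OS implementation: the topology is a hypothetical nest of dictionaries in which any node with a `children` key is a bridge.

```python
def enumerate_bridge(bridge: dict, primary: int, next_free: int) -> int:
    """Assign bus numbers depth-first; return the highest bus number used."""
    bridge["primary"] = primary
    bridge["secondary"] = next_free          # the bus directly below this bridge
    highest = next_free
    for child in bridge["children"]:
        if "children" in child:              # the child is itself a bridge
            highest = enumerate_bridge(child, bridge["secondary"], highest + 1)
    bridge["subordinate"] = highest          # highest bus reachable below this bridge
    return highest


def routes_to(bridge: dict, bus: int) -> bool:
    """A bridge forwards requests for buses in the range [secondary, subordinate]."""
    return bridge["secondary"] <= bus <= bridge["subordinate"]


# Hypothetical topology: root port -> switch upstream port -> two downstream ports.
port_a = {"children": [{"name": "ssd"}]}
port_b = {"children": [{"name": "nic"}]}
upstream = {"children": [port_a, port_b]}
root_port = {"children": [upstream]}

enumerate_bridge(root_port, primary=0, next_free=1)
for name, b in (("root", root_port), ("upstream", upstream),
                ("port_a", port_a), ("port_b", port_b)):
    print(name, b["primary"], b["secondary"], b["subordinate"])
```

The secondary/subordinate pair is what lets each bridge decide locally whether a configuration request belongs somewhere beneath it.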

BAR - Base Address Register

A BAR (Base Address Register) allows a PCIe endpoint to expose memory or register regions to the host.

Each endpoint can have up to 6 BARs. Each BAR defines:

  • Type: Memory (MMIO) or legacy I/O
  • Size: Power-of-two region size
  • Prefetchable: Indicates reads have no side effects, so the host may prefetch from the region and merge writes to it
  • Address Width: 32-bit or 64-bit

BAR sizing mechanism:

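During enumeration, the OS writes 0xFFFFFFFF to a BAR and reads it back. Address bits the device does not decode are hardwired to zero, so the value read back acts as a mask: after clearing the low type bits (bits 3:0 for a memory BAR), inverting it and adding 1 gives the region size.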

Example:

  • Returned value: 0xFFFF0000
  • Size: ~0xFFFF0000 + 1 = 0x10000 bytes (64 KB)
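
The sizing arithmetic can be sketched as a small decoder for the value read back from a 32-bit BAR. It handles only the lower 32 bits (a 64-bit BAR continues in the next BAR slot), and the function name and returned dictionary are illustrative.

```python
def decode_bar(readback: int) -> dict:
    """Decode the value read from a BAR after 0xFFFFFFFF was written to it."""
    if readback & 0x1:                        # bit 0 set: legacy I/O BAR
        kind, is_64bit, prefetchable = "io", False, False
        mask = readback & 0xFFFFFFFC          # bits 1:0 are not address bits
    else:                                     # bit 0 clear: memory BAR
        kind = "memory"
        is_64bit = ((readback >> 1) & 0x3) == 0b10
        prefetchable = bool(readback & 0x8)
        mask = readback & 0xFFFFFFF0          # bits 3:0 are type/prefetch flags
    # Invert the mask and add one; an all-zero readback means the BAR is unused.
    size = ((~mask) & 0xFFFFFFFF) + 1 if mask else 0
    return {"kind": kind, "is_64bit": is_64bit,
            "prefetchable": prefetchable, "size": size}


print(decode_bar(0xFFFF0000))
```

An OS performing this probe also saves the original BAR value first and restores it afterwards, which this sketch leaves out.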

After sizing, the OS assigns a physical address. The driver maps it into virtual space using functions like pci_iomap().

MSI / MSI-X

Traditional PCI used dedicated interrupt lines (physical wires). PCIe replaces these with message-based interrupts.

How MSI works:

During setup, the device is programmed with:

  • A target address (a location in the interrupt controller's address range, not ordinary RAM)
  • A data value

When the device generates an interrupt:

  • It sends a Memory Write TLP to that address with the data
  • The CPU’s interrupt controller interprets this as an interrupt
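
The flow above can be modeled in a few lines. This is a toy simulation rather than driver code: the `InterruptController` and `Device` classes, the doorbell address, and the data value are all hypothetical.

```python
class InterruptController:
    """Hypothetical controller: a write to its doorbell address raises an interrupt."""

    def __init__(self, doorbell_addr: int):
        self.doorbell_addr = doorbell_addr
        self.handlers = {}                    # MSI data value -> handler callable

    def register(self, data: int, handler) -> None:
        self.handlers[data] = handler

    def memory_write(self, addr: int, data: int) -> bool:
        """Model a Memory Write TLP arriving at the host."""
        if addr != self.doorbell_addr:
            return False                      # ordinary memory write, not an interrupt
        self.handlers[data]()                 # the data value selects the interrupt
        return True


class Device:
    def __init__(self, send_memory_write):
        self.send_memory_write = send_memory_write
        self.msi_addr = None
        self.msi_data = None

    def program_msi(self, addr: int, data: int) -> None:
        # Performed by the host during setup, through the device's MSI capability.
        self.msi_addr, self.msi_data = addr, data

    def raise_interrupt(self) -> bool:
        # From the device's side, an interrupt is just another memory write.
        return self.send_memory_write(self.msi_addr, self.msi_data)


events = []
ctrl = InterruptController(doorbell_addr=0xFEE00000)   # example address only
ctrl.register(0x41, lambda: events.append("rx-done"))

nic = Device(ctrl.memory_write)
nic.program_msi(0xFEE00000, 0x41)
nic.raise_interrupt()
print(events)
```

Because the interrupt travels in-band as a write, it is ordered behind the device's earlier DMA writes on the same link, which is one reason message-based interrupts replaced sideband wires.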

DMA

DMA allows devices to read and write system memory directly; the CPU sets up the transfer but does not copy the data itself.

Types of DMA:

  • Inbound DMA (Device → Host):
    Device sends Memory Write TLPs to host memory

  • Outbound DMA (Device ← Host):
    Device sends Memory Read TLPs and receives Completion data

Addressing:

  • CPU uses virtual addresses (via MMU)
  • Devices use:

    • Physical addresses, or
    • IOMMU-translated IO virtual addresses (IOVA)
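
To illustrate the IOVA case, here is a minimal single-level translation table. Real IOMMUs use multi-level page tables and per-device domains; the `Iommu` class, the 4 KB page size, and the addresses below are assumptions made for the sketch.

```python
PAGE_SIZE = 4096


class Iommu:
    """Toy IOMMU: maps IO virtual pages to physical pages for one device."""

    def __init__(self):
        self.page_table = {}                  # IOVA page number -> physical page number

    def map(self, iova: int, phys: int, length: int) -> None:
        if iova % PAGE_SIZE or phys % PAGE_SIZE:
            raise ValueError("mappings must be page aligned")
        for off in range(0, length, PAGE_SIZE):
            self.page_table[(iova + off) // PAGE_SIZE] = (phys + off) // PAGE_SIZE

    def translate(self, iova: int) -> int:
        page, offset = divmod(iova, PAGE_SIZE)
        if page not in self.page_table:
            # An unmapped access is blocked instead of reaching memory.
            raise PermissionError(f"DMA fault at IOVA {iova:#x}")
        return self.page_table[page] * PAGE_SIZE + offset


mmu = Iommu()
mmu.map(iova=0x10000, phys=0x80000000, length=2 * PAGE_SIZE)
print(hex(mmu.translate(0x10010)))
```

Besides translation, this gives isolation: a device can only reach the pages the host has explicitly mapped for it.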
