
Ripan Deuri


Understanding PCIe Physical Layer

The PCIe Physical Layer is responsible for converting structured packet data into high-speed electrical signals that can traverse the link between devices. While higher layers define what to send, the Physical Layer determines how those bits are transmitted reliably over a noisy channel.

This article focuses on the transmit (TX) path, breaking down each stage from TLP data to differential signaling on the wire, including scrambling, encoding, and serialization.


PCIe Physical Layer

The Root Complex (RC) and Endpoint (EP) are connected by a PCIe Link — a set of high-speed differential signal pairs. Everything that happens between them (register reads, DMA transfers, interrupts) is ultimately carried as Transaction Layer Packets (TLPs) over this link.

+-------------+                             +-------------+
|             |<====== PCIe Link ========>  |             |
|     RC      |                             |     EP      |
|             |                             |             |
+-------------+                             +-------------+

PCIe Link

A PCIe lane is a full-duplex serial connection consisting of one transmit pair and one receive pair.

A PCIe link consists of one or more such lanes aggregated together (x1, x2, x4, x8, x16).

Physical wires in one PCIe lane

RC                                   EP

TX+  ----------------------------->  RX+
TX-  ----------------------------->  RX-

RX+  <-----------------------------  TX+
RX-  <-----------------------------  TX-

Each lane is full-duplex, meaning data flows simultaneously in both directions.

Differential Signaling

PCIe uses low-voltage differential signaling, where the same signal is transmitted on two wires with opposite polarity.

  • TX+ carries the signal
  • TX- carries the inverted signal
  • The receiver subtracts the two → noise common to both wires cancels out

This provides:

  • High noise immunity
  • Better signal integrity at high speeds

Conceptually:

  • Transmitting ‘1’: TX+ > TX- → positive differential
  • Transmitting ‘0’: TX+ < TX- → negative differential

(The exact voltage swing is small — typically a few hundred millivolts — enabling high-speed operation with low power.)
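The cancellation can be sketched in a few lines. This is a minimal model, not a description of real PCIe electricals: `COMMON_MODE_V`, `HALF_SWING_V`, and the helper names are assumptions chosen for illustration.

```python
# Illustrative values only; these are assumptions, not PCIe specification figures.
COMMON_MODE_V = 0.0
HALF_SWING_V = 0.2

def drive(bit):
    """Return the (TX+, TX-) voltages for one bit."""
    delta = HALF_SWING_V if bit else -HALF_SWING_V
    return COMMON_MODE_V + delta, COMMON_MODE_V - delta

def receive(v_plus, v_minus):
    """Recover the bit from the sign of the differential voltage."""
    return 1 if (v_plus - v_minus) > 0 else 0

def through_noisy_channel(bit, noise_v):
    """Add the same noise voltage to both wires (common-mode noise)."""
    v_plus, v_minus = drive(bit)
    return receive(v_plus + noise_v, v_minus + noise_v)

bits = [1, 0, 1, 1, 0]
recovered = [through_noisy_channel(b, noise_v=0.5) for b in bits]
print(recovered)  # [1, 0, 1, 1, 0]
```

Even though the injected noise is larger than the signal swing, it appears on both wires and drops out of the subtraction.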

Lanes — Bundling for Bandwidth

A single lane provides a fixed bandwidth. PCIe increases throughput by aggregating multiple lanes.

x1 Link (1 Lane)

[ Lane 0 ]
TX pair  --->  
RX pair  <---

x2 Link (2 Lanes)

[ Lane 0 ]          [ Lane 1 ]
TX pair --->        TX pair --->
RX pair <---        RX pair <---

Data is striped across lanes byte by byte in a round-robin manner, and the receiver reassembles it after aligning (deskewing) the lanes.
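A rough sketch of round-robin striping follows. The function names `stripe` and `unstripe` are hypothetical, and the model ignores lane skew and the framing and padding a real link applies.

```python
def stripe(data: bytes, num_lanes: int) -> list[bytes]:
    """Distribute bytes round-robin: byte i goes to lane i % num_lanes."""
    return [data[lane::num_lanes] for lane in range(num_lanes)]

def unstripe(lanes: list[bytes]) -> bytes:
    """Interleave the per-lane bytes back into their original order."""
    out = bytearray()
    longest = max(len(lane) for lane in lanes)
    for i in range(longest):
        for lane in lanes:
            if i < len(lane):
                out.append(lane[i])
    return bytes(out)

payload = bytes(range(8))
lanes = stripe(payload, 4)
print([lane.hex() for lane in lanes])  # ['0004', '0105', '0206', '0307']
print(unstripe(lanes) == payload)      # True
```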

Transmit Path

[TLP Stream]
     |
     v
[Framing / Control Symbols]
     |
     v
[Scrambler]
     |
     v
[Encoder]
     |
     v
[Serializer]
     |
     v
[Differential Driver]
     |
     v
TX+ / TX-

Step 1: Framing and Control Symbols

The Data Link Layer passes a sequence of bytes:

[Seq# | TLP Header | Data | LCRC]

TLP boundaries are represented using control symbols:

  • STP (Start of TLP)
  • END (End of TLP)

In Gen1/Gen2, these are explicit control symbols (K-codes) embedded in the 8b/10b-encoded stream.
In Gen3 and later, framing is carried by tokens inside data blocks: the STP token includes the TLP length, so no separate END symbol is sent.

So the conceptual view becomes:

[STP | Seq# | Header | Data | LCRC | END]
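The conceptual Gen1/Gen2 view can be sketched as follows. The markers are kept symbolic (the strings `"STP"` and `"END"`) rather than real K-code encodings, and `frame_tlp`, `unframe`, and the sample byte values are hypothetical placeholders.

```python
STP = "STP"  # symbolic start-of-TLP marker (not a real symbol encoding)
END = "END"  # symbolic end-of-TLP marker (not a real symbol encoding)

def frame_tlp(seq_num: int, header: bytes, data: bytes, lcrc: bytes) -> list:
    """Wrap [Seq# | Header | Data | LCRC] in start and end markers."""
    seq_bytes = seq_num.to_bytes(2, "big")  # sequence number in a 2-byte field
    body = seq_bytes + header + data + lcrc
    return [STP, *body, END]

def unframe(symbols: list) -> bytes:
    """Strip the markers and return the bytes between them."""
    if symbols[0] != STP or symbols[-1] != END:
        raise ValueError("missing STP/END framing")
    return bytes(symbols[1:-1])

# Placeholder contents; the LCRC here is a dummy value, not a computed CRC.
framed = frame_tlp(1, header=b"\x00" * 12, data=b"\xde\xad\xbe\xef", lcrc=b"\x00" * 4)
print(framed[0], len(framed) - 2, framed[-1])  # STP 22 END
```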

Step 2: Scrambler

Why scrambling is needed

  1. Transition density
     • Clock recovery requires frequent signal transitions
     • Long runs of 0s or 1s break timing recovery
  2. EMI reduction
     • Repetitive patterns create strong electromagnetic emissions
     • Scrambling randomizes the spectrum

How it works

The scrambler XORs data with a pseudo-random sequence generated by an LFSR (Linear Feedback Shift Register).

Transmitter: data ⊕ lfsr → scrambled  
Receiver:    scrambled ⊕ lfsr → data

PCIe Gen3 and later use a 23-bit LFSR; Gen1 and Gen2 use a 16-bit LFSR.

Example (simplified LFSR)

Initial register:

[S3 S2 S1 S0] = 1 0 0 1
Polynomial: x⁴ + x + 1

LFSR sequence generation

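Each cycle, the register shifts right by one bit, S0 is the output, and the feedback bit S1 XOR S0 becomes the new S3.

Cycle  Register  Feedback (S1 XOR S0)
0      1001      0 XOR 1 = 1
1      1100      0 XOR 0 = 0
2      0110      1 XOR 0 = 1
3      1011      1 XOR 1 = 0
4      0101      0 XOR 1 = 1
5      1010      1 XOR 0 = 1
6      1101      0 XOR 1 = 1
7      1110      1 XOR 0 = 1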

Generated sequence (LSB output):

1 0 0 1 1 0 1 0
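The register states and output sequence above can be reproduced with a short sketch. It assumes a right-shifting register whose feedback bit (S1 XOR S0) enters at S3, which is one way to realize x⁴ + x + 1; the function name `lfsr4_states` is hypothetical.

```python
def lfsr4_states(seed: int, cycles: int):
    """Yield (state, output_bit, feedback_bit) for a 4-bit right-shifting LFSR.

    The output is S0. The feedback S1 XOR S0 is shifted in as the new S3.
    """
    state = seed & 0xF
    for _ in range(cycles):
        s0 = state & 1
        s1 = (state >> 1) & 1
        feedback = s1 ^ s0
        yield state, s0, feedback
        state = (state >> 1) | (feedback << 3)

rows = list(lfsr4_states(0b1001, 8))
for cycle, (state, out, feedback) in enumerate(rows):
    print(f"{cycle} {state:04b} out={out} feedback={feedback}")

keystream = [out for _, out, _ in rows]
print(keystream)  # [1, 0, 0, 1, 1, 0, 1, 0]
```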

Effect on data

Original:   0 0 0 0 0 0 0 0
LFSR:       1 0 0 1 1 0 1 0
Scrambled:  1 0 0 1 1 0 1 0

Now the signal has rich transitions.
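As a small check of this example, the sketch below applies the same keystream twice and counts transitions before and after. The helper names are illustrative only.

```python
def xor_bits(data, keystream):
    """XOR two equal-length bit sequences element by element."""
    return [d ^ k for d, k in zip(data, keystream)]

def count_transitions(bits):
    """Count positions where a bit differs from the one before it."""
    return sum(1 for a, b in zip(bits, bits[1:]) if a != b)

keystream = [1, 0, 0, 1, 1, 0, 1, 0]  # LFSR output from the example above
original = [0] * 8

scrambled = xor_bits(original, keystream)   # transmitter side
recovered = xor_bits(scrambled, keystream)  # receiver side, same keystream

print(count_transitions(original), count_transitions(scrambled))  # 0 5
print(recovered == original)                                      # True
```

Because XOR is its own inverse, the receiver only needs an LFSR that starts in the same state and stays in step with the transmitter.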

Step 3: Encoder

The encoder provides structure for clock recovery and alignment.

8b/10b Encoding (Gen1, Gen2)

  • Each 8-bit byte → 10-bit symbol
  • Guarantees:

    • Bounded run length
    • DC balance
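Both guarantees can be checked on a concrete symbol. The sketch below uses the two 10-bit forms of the K28.5 comma symbol as sample input; it only measures the properties and does not implement an 8b/10b encoder. The helper names are hypothetical.

```python
def disparity(bits: str) -> int:
    """Number of 1s minus number of 0s."""
    return bits.count("1") - bits.count("0")

def max_run_length(bits: str) -> int:
    """Length of the longest run of identical consecutive bits."""
    if not bits:
        return 0
    longest = run = 1
    for prev, cur in zip(bits, bits[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest

K28_5_RD_NEG = "0011111010"  # sent when running disparity is negative
K28_5_RD_POS = "1100000101"  # sent when running disparity is positive

stream = K28_5_RD_NEG + K28_5_RD_POS
print(disparity(K28_5_RD_NEG), disparity(K28_5_RD_POS))  # 2 -2
print(disparity(stream), max_run_length(stream))         # 0 5
```

Alternating between the two forms keeps the long-term count of 1s and 0s equal, which is what DC balance means here.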

128b/130b Encoding (Gen3, Gen4, Gen5)

Instead of per-byte encoding:

[Sync Header] + [128-bit scrambled data] = 130 bits
  • Sync Header = 01 (data) or 10 (control)
  • Guarantees a transition at block start

This improves efficiency:

  • 8b/10b → 80% efficiency
  • 128b/130b → ~98.5% efficiency
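The efficiency figures translate directly into usable bit rate per lane. A rough calculation follows; the Gen3 rate of 8 GT/s comes from this article, while the Gen1 and Gen2 rates (2.5 and 5 GT/s) are added for comparison. Packet and protocol overhead is ignored.

```python
# Raw per-lane signaling rate in GT/s and (payload bits, total bits) per code group.
RAW_RATE_GT_S = {"Gen1": 2.5, "Gen2": 5.0, "Gen3": 8.0}
ENCODING = {"Gen1": (8, 10), "Gen2": (8, 10), "Gen3": (128, 130)}

def efficiency(gen: str) -> float:
    """Fraction of transmitted bits that carry payload."""
    payload_bits, total_bits = ENCODING[gen]
    return payload_bits / total_bits

def effective_gbps(gen: str, lanes: int = 1) -> float:
    """Payload bit rate in Gb/s after encoding overhead."""
    return RAW_RATE_GT_S[gen] * efficiency(gen) * lanes

for gen in RAW_RATE_GT_S:
    print(f"{gen}: {efficiency(gen):.1%} efficient, "
          f"{effective_gbps(gen):.3f} Gb/s per lane")
```

Under these assumptions a Gen3 lane carries roughly 7.88 Gb/s of payload bits, compared with 4 Gb/s for a Gen2 lane.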

Step 4: Serializer

The serializer converts parallel data into a serial bit stream.

[130-bit block]
      |
      v
1 → 0 → 1 → 1 → 0 → ... (bit stream)
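In software terms, serialization is just walking the bits of each byte in a fixed order. The sketch below assumes the least-significant bit of each byte goes out first; treat that ordering, and the function names, as assumptions of this example.

```python
def serialize(block: bytes):
    """Yield bits one at a time, least-significant bit of each byte first."""
    for byte in block:
        for i in range(8):
            yield (byte >> i) & 1

def deserialize(bits) -> bytes:
    """Pack a bit sequence back into bytes using the same bit order."""
    bits = list(bits)
    out = bytearray()
    for n in range(0, len(bits), 8):
        byte = 0
        for i, bit in enumerate(bits[n:n + 8]):
            byte |= bit << i
        out.append(byte)
    return bytes(out)

print(list(serialize(b"\xa5")))                    # [1, 0, 1, 0, 0, 1, 0, 1]
print(deserialize(serialize(b"PCIe")) == b"PCIe")  # True
```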

For PCIe Gen3:

  • Data rate = 8 GT/s
  • Unit Interval (UI) ≈ 125 ps per bit

Internally, serializers use high-speed clocking derived from a reference clock (commonly 100 MHz) via a PLL.
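The unit interval follows directly from the data rate. The sketch below also shows the ratio between the bit rate and a 100 MHz reference clock; the actual PLL multiplier in a given design depends on its clocking architecture, so that ratio is only indicative.

```python
def unit_interval_ps(rate_gt_per_s: float) -> float:
    """Duration of one bit in picoseconds: 1 / rate."""
    return 1e3 / rate_gt_per_s  # 1 / (rate * 1e9 per second) = 1000 / rate ps

def bit_rate_to_refclk_ratio(rate_gt_per_s: float, refclk_mhz: float = 100.0) -> float:
    """How many bit times fit in one reference clock period."""
    return rate_gt_per_s * 1e3 / refclk_mhz

print(unit_interval_ps(8.0))          # 125.0
print(bit_rate_to_refclk_ratio(8.0))  # 80.0
```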

For multi-lane links:

Lane 0 → Serializer 0
Lane 1 → Serializer 1

Data is distributed across lanes and transmitted in parallel.

Step 5: Differential Driver

The serialized bits drive a differential output stage:

Bit = 1 → TX+ > TX-
Bit = 0 → TX+ < TX-

This produces a high-speed differential signal on the wire.

Receive Path (High-Level)

RX+ / RX-
     |
     v
[Analog Front-End + Equalization]
     |
     v
[Clock Data Recovery (CDR)]
     |
     v
[Deserializer]
     |
     v
[Decoder]
     |
     v
[De-scrambler]
     |
     v
[TLP Stream]

Key components:

  • Equalization (CTLE, DFE) → compensates channel loss
  • CDR → extracts clock from data transitions
  • Lane alignment → reconstructs multi-lane streams
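The digital part of the receive path mirrors the transmit path. Below is a toy loopback, assuming a 4-bit LFSR as a stand-in for the real scrambler and skipping encoding, equalization, CDR, and lane alignment entirely. All names are hypothetical.

```python
def lfsr_keystream(seed: int):
    """Toy 4-bit LFSR (x^4 + x + 1); a stand-in for the real PCIe scrambler."""
    state = seed & 0xF
    while True:
        yield state & 1
        feedback = (state ^ (state >> 1)) & 1  # S1 XOR S0
        state = (state >> 1) | (feedback << 3)

def transmit(data: bytes, seed: int) -> list[int]:
    """Turn bytes into LSB-first bits and XOR each bit with the keystream."""
    bits = [(byte >> i) & 1 for byte in data for i in range(8)]
    return [b ^ k for b, k in zip(bits, lfsr_keystream(seed))]

def receive(wire_bits: list[int], seed: int) -> bytes:
    """Descramble with the same keystream, then pack bits back into bytes."""
    bits = [b ^ k for b, k in zip(wire_bits, lfsr_keystream(seed))]
    return bytes(
        sum(bit << i for i, bit in enumerate(bits[n:n + 8]))
        for n in range(0, len(bits), 8)
    )

wire = transmit(b"TLP", seed=0b1001)
print(receive(wire, seed=0b1001))            # b'TLP'
print(receive(wire, seed=0b0001) == b"TLP")  # False
```

The second call shows why both ends must keep their LFSRs synchronized: a receiver starting from a different state recovers garbage.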
