Buy-side and sell-side
Buy-side participants:
Institutional investors
Digital asset funds
Retail traders
Proprietary trading firms
Acquire digital assets for their investment portfolios
Focus on strategy development, research, and trade execution to generate returns
Sell-side firms:
Market makers and brokers
Facilitate trade execution and order routing
Digital asset exchanges:
Provide platforms where assets are listed and traded
Together with sell-side firms, form the core trading infrastructure
Ensure liquidity, match orders, and offer essential financial services
Generate revenue primarily through fees and bid-ask spreads
Tick-to-trade performance
High-frequency, low-latency trading (HFT) is rapidly emerging in centralized digital assets trading.
HFT has been a critical component of the financial market infrastructure across equities, commodities, and foreign exchange markets, as well as derivatives markets including futures, for decades.
Its core function is to provide market participants with increased liquidity, lower transaction costs, and more efficient price discovery.
Key technical success metrics of HFT platforms: Latency, Jitter
HFT firms operating in digital asset markets strive for low double-digit microsecond tick-to-trade performance to remain competitive.
Making a market
An HFT firm acting as a market maker (MM) is obliged to quote prices to the market on both sides of the book:
Bid (MM is buying, counterparty is selling)
Ask (MM is selling, counterparty is buying)
Core function of an MM strategy:
Capture the spread (the difference between bid and ask prices)
Manage risk and provide the best (fastest) execution
Provides liquidity but also carries temporary inventory risk:
MMs need to execute an extremely high volume of orders to provide liquidity to the market and to profit from the spread.
HFT firms capitalize on temporary price differences across multiple exchange venues where digital assets trade.
Success requires optimized execution and microsecond-level speed, as delays can eliminate arbitrage opportunities.
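The spread-capture and cross-venue ideas above come down to simple arithmetic. The following is a minimal sketch, not a trading model: the function names, the single flat taker fee rate, and the sample prices are all assumptions made for illustration.

```python
from decimal import Decimal

def spread_capture(bid: Decimal, ask: Decimal, qty: Decimal) -> Decimal:
    """Gross profit if the MM buys qty at its bid and later sells qty at its ask."""
    return (ask - bid) * qty

def cross_venue_edge(ask_a: Decimal, bid_b: Decimal, fee_rate: Decimal) -> Decimal:
    """Per-unit profit from buying at venue A's ask and selling at venue B's bid.

    fee_rate is a hypothetical taker fee charged on the notional of both legs.
    """
    fees = (ask_a + bid_b) * fee_rate
    return bid_b - ask_a - fees

# Quoting 99.98 / 100.02 and getting filled on both sides for 5 units.
gross = spread_capture(Decimal("99.98"), Decimal("100.02"), Decimal("5"))

# The same price gap is profitable at a low fee and unprofitable at a higher one.
edge_low_fee = cross_venue_edge(Decimal("100.00"), Decimal("100.10"), Decimal("0.0002"))
edge_high_fee = cross_venue_edge(Decimal("100.00"), Decimal("100.10"), Decimal("0.001"))
```

The sketch ignores inventory risk and the time between the two legs, which is exactly where latency matters: the edge only exists if both orders execute before prices move.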
Importance of Jitter in HFT
Predictable Execution:
- HFT algorithms rely on consistent and predictable performance to execute trades at precise moments and prices. Unpredictable latency (high jitter) can cause trades to execute later than intended, missing the optimal price point and potentially leading to losses.
Arbitrage Opportunities:
- Jitter can cause delays in price discovery across different exchanges, eliminating fleeting arbitrage opportunities that exist for only microseconds.
Risk Management:
- Consistent latency allows for better real-time risk management and monitoring. High jitter introduces variability, making it harder to predict system behavior under stress and manage risk effectively.
Algorithmic Integrity:
- An algorithm designed to react within a specific timeframe can be disrupted by unexpected delays, potentially leading to unintended trading outcomes.
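Jitter is usually reported as the spread of a latency distribution rather than as a single number. The sketch below shows one way to summarize a set of latency samples; the nearest-rank percentile method, the choice of P99 minus P50 as a jitter indicator, and the sample values are assumptions for illustration.

```python
import statistics

def percentile(sorted_samples, p):
    """Nearest-rank percentile; p is an integer from 1 to 100."""
    rank = (p * len(sorted_samples) + 99) // 100  # integer ceil(p * n / 100)
    return sorted_samples[max(rank, 1) - 1]

def latency_summary(samples_us):
    """Summarize latency samples (microseconds) with tail and spread metrics."""
    ordered = sorted(samples_us)
    p50 = percentile(ordered, 50)
    p99 = percentile(ordered, 99)
    return {
        "p50_us": p50,
        "p99_us": p99,
        "max_us": ordered[-1],
        "stdev_us": statistics.pstdev(ordered),
        "p99_minus_p50_us": p99 - p50,  # one common jitter indicator
    }

# 98 fast samples, one slightly slow, one outlier (for example a context switch).
samples = [10] * 98 + [12, 50]
summary = latency_summary(samples)
```

A system with a good median but a large gap between P50 and P99 (or the maximum) is the high-jitter case the points above warn about.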
Techniques to Minimize Jitter
Hardware Optimization:
Using deterministic hardware, such as Field-Programmable Gate Arrays (FPGAs), for critical processing paths.
Employing specialized network interface cards (NICs) designed for low-jitter performance.
Operating System (OS) and Kernel Tuning:
Using real-time operating systems (RTOS) or custom-tuned Linux kernels (e.g., PREEMPT-RT patches) to prioritize trading processes and minimize non-deterministic interruptions.
Implementing techniques like thread pinning to dedicated CPU cores and minimizing context switching to avoid CPU cache misses and resource contention.
Network Architecture:
Utilizing kernel bypass techniques to move data directly from the network hardware to user-space applications, avoiding the general-purpose, non-deterministic Linux network stack.
Placing servers in exchange co-location facilities to minimize physical distance and network path variability.
Software Design:
Using efficient data structures and design patterns (like the disruptor pattern) to optimize message flow and reduce unnecessary memory allocations.
Optimizing serialization/deserialization processes to handle high data volumes efficiently.
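Thread pinning, mentioned under OS and kernel tuning, can be illustrated with the standard library on Linux. This is a minimal sketch: `pin_to_core` is a hypothetical helper name, and production systems typically combine pinning with core isolation and interrupt steering, which are not shown.

```python
import os

def pin_to_core(core_id: int) -> bool:
    """Restrict the current process to one CPU core.

    Returns False on platforms without sched_setaffinity (for example macOS
    or Windows) or when the core is not in the allowed set.
    """
    if not hasattr(os, "sched_setaffinity"):
        return False
    if core_id not in os.sched_getaffinity(0):
        return False
    os.sched_setaffinity(0, {core_id})  # pid 0 means the current process
    return True
```

Pinning keeps the hot-path code on one core so its caches stay warm and the scheduler does not migrate it, which reduces one source of latency variance.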
LMAX Exchange Disruptor Pattern:
High-performance, low-latency inter-thread messaging system.
Utilizes a Ring Buffer for efficient event/task passing.
Common in high-frequency trading (HFT).
Core Concepts & How it Works:
Ring Buffer:
Circular, fixed-size array in memory.
Holds events/messages.
Producers:
Claim slots in the buffer using sequence numbers.
Write data and commit.
Consumers (Batch Handlers):
Process events from the buffer.
Operate in a dependency graph.
Notify each other of new data.
Mechanical Sympathy:
Exploits CPU caches (L1, L2, L3).
Keeps data contiguous and predictable.
Reduces costly memory fetches.
No Locks/Contention:
Uses sequence numbers and barriers instead of locks.
Prevents threads from blocking each other.
False Sharing Avoidance:
Pads data to ensure variables don't share cache lines.
Avoids unnecessary cache invalidations.
Use in Trading:
Speed:
- Essential for HFT where microsecond delays matter.
Throughput:
- Handles massive volumes of market data and order flow.
Scalability:
- Allows complex processing pipelines to scale across multiple CPU cores.
Task-Focused Pipelines:
- Ideal for breaking down order processing into sequential, parallelizable tasks.
Benefits:
Low Latency:
- Achieves nanosecond-level processing between stages.
High Throughput:
- Processes significantly more events than traditional queue-based systems.
Predictable Performance:
- Avoids unpredictability of locks and garbage collection pauses.
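The ring buffer and sequence-number mechanics described above can be sketched in a few lines. This is a single-threaded walk-through of the sequencing logic only, with one producer and one consumer; the class and method names are hypothetical and are not the LMAX Disruptor API. The real implementation relies on memory barriers and cache-line padding, which plain Python cannot express.

```python
class RingBuffer:
    """Fixed-size ring buffer coordinated by sequence numbers instead of locks."""

    def __init__(self, size: int):
        if size <= 0 or size & (size - 1):
            raise ValueError("size must be a power of two")
        self._slots = [None] * size   # preallocated once, reused forever
        self._mask = size - 1         # slot index = sequence & mask
        self._cursor = -1             # last sequence published by the producer
        self._consumed = -1           # last sequence processed by the consumer

    def try_publish(self, event) -> bool:
        """Claim the next sequence, write the event, then commit it."""
        next_seq = self._cursor + 1
        if next_seq - self._consumed > len(self._slots):
            return False              # full: would overwrite an unread slot
        self._slots[next_seq & self._mask] = event
        self._cursor = next_seq       # commit makes the slot visible
        return True

    def poll_batch(self, handler) -> int:
        """Process every event published so far in one batch."""
        available = self._cursor
        seq = self._consumed + 1
        count = 0
        while seq <= available:
            handler(seq, self._slots[seq & self._mask])
            seq += 1
            count += 1
        self._consumed = available
        return count
```

The consumer catches up on everything available in one pass, which is the batching effect that lets a slow moment be absorbed without per-event handoff costs.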
Field-Programmable Gate Array (FPGA)
A flexible integrated circuit (IC).
Programmable after manufacturing.
Allows creation of custom digital circuits.
Offers hardware-level customization.
Suitable for high-performance computing, signal processing, and rapid prototyping.
Unlike fixed-function chips, uses reconfigurable logic blocks, interconnects, and memory.
Implements various functions in parallel.
Enables hardware acceleration and low-latency performance.
Ideal for telecoms, automotive (ADAS), aerospace, and data centers.
Hardware Description Languages (HDLs):
Engineers use Verilog or VHDL.
Describe desired digital circuit.
Logic Synthesis & Place & Route:
Software translates HDL code.
Creates a configuration file (bitstream).
Maps design onto FPGA's internal resources.
Programmable Resources:
The chip contains Lookup Tables (LUTs), Flip-Flops, Multiplexers.
Programmable interconnects.
Allows formation of any digital circuit.
Parallel Processing:
Operations occur simultaneously across hardware.
Provides high throughput and low latency.
Ideal for complex tasks.
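A Lookup Table (LUT) is essentially a small truth table whose contents are set by the bitstream. The sketch below models that idea in software, purely as an illustration: `make_lut` and the bit ordering (first input is the most significant address bit) are assumptions, and real FPGA designs are written in an HDL rather than Python.

```python
def make_lut(truth_table: int, num_inputs: int):
    """Return a function that behaves like a LUT configured with truth_table.

    Bit i of truth_table is the output for input address i.
    """
    def lut(*inputs: int) -> int:
        if len(inputs) != num_inputs:
            raise ValueError("wrong number of inputs")
        address = 0
        for bit in inputs:  # first input is the most significant address bit
            address = (address << 1) | (bit & 1)
        return (truth_table >> address) & 1
    return lut

# Two 2-input LUTs configured as XOR and AND.
xor2 = make_lut(0b0110, 2)
and2 = make_lut(0b1000, 2)

def half_adder(a: int, b: int):
    """Wire the two LUTs together: returns (sum, carry)."""
    return xor2(a, b), and2(a, b)
```

Changing the truth-table constant changes the logic function without changing the hardware, which is the sense in which the chip is programmable after manufacturing.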
Fair and Equal Access in Digital Asset Markets:
Ensures all participants have the same opportunity to trade and access information.
In mature markets (equities and derivatives), regulatory oversight enforces strict disclosure and equal access requirements.
Technical Implementation:
Achieved by minimizing latency disparities.
Ensures all participants experience similar response times within defined tolerances (microseconds to milliseconds).
Enables reliable participation in price discovery, liquidity provision, and trade execution.
AWS Contributions:
Many venues are built on cloud platforms for scalability, not precise performance tuning.
AWS is continuously improving its platform to support exchanges and trading firms.
These enhancements are aimed at greater fairness and equality of access.
Architecture:
Centralized digital asset exchanges optimize infrastructure on AWS for low-latency performance.
Enables market makers (MMs) to execute thousands of trades per second.
AWS provides optimizations on compute placement and network topologies to reduce latency and jitter.
Reference architecture depicts a typical CEX transaction processing hot path for latency-optimized access with an MM.
Trade Receive and Processing Flow:
Order Submission:
- Order is submitted to the Exchange Gateway.
Order Forwarding:
- Order is forwarded to the Fair Order Sequencer.
Batching and Time-Stamping:
- Sequencer applies batching and sends time-stamped orders to the matching engine.
Execution Acknowledgement:
- Matching engine sends execution acknowledgements.
Market Data Updates:
Market data updates are sent to the distribution server.
Updates are sent to the market data gateway.
HFT Feed Handler:
Market data is delivered to the HFT feed handler.
Feed handler sends market data to the trading strategy engine.
Trading Strategy Engine:
Generates a signal to provide or take liquidity.
Signal is sent to the order management system (OMS).
Order Management System (OMS):
OMS sends orders and receives execution confirmations.
Execution notifications are delivered to the OMS.
Data Replication:
Data is replicated to a secondary Availability Zone or AWS Region.
Ensures continuous operation if the primary Availability Zone is impaired.
High Availability and Disaster Recovery:
Approaches vary by exchange venue.
Range from synchronous failover to partial failover of order layers only for position management and reconciliation.
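The batching and time-stamping step of the sequencer in the flow above could look roughly like the sketch below. The source does not describe the sequencer's internals, so everything here is an assumption: the fixed batching window, the shared batch timestamp, the arrival-order policy inside a batch, and all class and field names.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SequencedOrder:
    seq: int          # global, gap-free sequence number
    batch_ts_ns: int  # timestamp shared by every order in the batch
    order_id: str

class BatchingSequencer:
    """Groups orders into fixed time windows and stamps each batch."""

    def __init__(self, window_ns: int):
        self.window_ns = window_ns
        self._next_seq = 0
        self._pending = []        # order ids in the open batch
        self._batch_start = None  # arrival time of the first order in the batch

    def submit(self, arrival_ns: int, order_id: str):
        """Add an order; returns the previous batch if this arrival closes it."""
        released = []
        if (self._batch_start is not None
                and arrival_ns - self._batch_start >= self.window_ns):
            released = self.flush(self._batch_start + self.window_ns)
        if self._batch_start is None:
            self._batch_start = arrival_ns
        self._pending.append(order_id)
        return released

    def flush(self, batch_ts_ns: int):
        """Release the open batch with one timestamp and consecutive sequences."""
        out = []
        for order_id in self._pending:
            out.append(SequencedOrder(self._next_seq, batch_ts_ns, order_id))
            self._next_seq += 1
        self._pending = []
        self._batch_start = None
        return out
```

Because every order in a window carries the same timestamp, small differences in arrival time within the window no longer decide priority on their own, which is one way a venue can narrow latency advantages.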
Latency Optimization Approach:
- Optimize across the network, compute, application, and system layers (see Optimization Categories) to ensure the lowest possible end-to-end latency for trading and real-time workloads.
Key Latencies:
Order-to-Ack Latency: covers the entire flow of an order transaction, from submission to acknowledgement.
Tick-to-Order Latency: measures the time taken to consume a price change, convert it to a trading signal, and modify positions.
Success Metrics:
Latency and jitter for price discovery, strategy, and order execution.
Target: Low double-digit microsecond tick-to-trade performance.
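Both key latencies can be derived from three timestamps taken on the market maker's host. The sketch below assumes hypothetical capture points (tick received, order sent, acknowledgement received) and uses the process-local monotonic clock; real deployments often timestamp in the NIC or on the wire instead.

```python
import time

def hot_path_latencies_us(tick_rx_ns: int, order_tx_ns: int, ack_rx_ns: int):
    """Convert three nanosecond timestamps into the two key latencies (in us)."""
    if not tick_rx_ns <= order_tx_ns <= ack_rx_ns:
        raise ValueError("timestamps must be non-decreasing")
    return {
        "tick_to_order_us": (order_tx_ns - tick_rx_ns) / 1000,
        "order_to_ack_us": (ack_rx_ns - order_tx_ns) / 1000,
    }

# Timestamps would normally be captured at each stage of the hot path.
tick_rx = time.perf_counter_ns()   # market data tick arrives
order_tx = time.perf_counter_ns()  # order leaves the OMS
ack_rx = time.perf_counter_ns()    # exchange acknowledgement arrives
measured = hot_path_latencies_us(tick_rx, order_tx, ack_rx)

# Fixed values for illustration: 18 us tick-to-order, 72 us order-to-ack.
example = hot_path_latencies_us(1_000_000, 1_018_000, 1_090_000)
```

Collecting these values per event, rather than as averages, is what makes it possible to track jitter as well as latency.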
Optimization Categories:
Network:
Latency from the underlying network.
Includes AWS routing, instance placement, and connectivity choices.
Compute:
Latency from Amazon EC2 instance types.
Includes instance selection, kernel and OS tuning, and optimizing Elastic Network Adapter (ENA) performance.
Application:
Latency from business logic and user-space processing.
Focuses on efficient traffic handling, CPU/memory usage, and minimizing resource contention.
System:
Latency from system-wide processes.
Includes data distribution, timing services, and overall architecture efficiency.
CEX and MM Hot Paths:
CEX Hot Path:
Focuses on order entry, execution onto the order book, and distribution of market data.
Includes order entry, balance checks, matching, acknowledgements, and publication of market events.
Goal: Provide best execution and pricing experience to attract liquidity.
MM Hot Path:
Focuses on tick-to-trade latency.
Involves receiving market data, evaluating for trading opportunities, and placing trades.
Goal: Maintain the most recent view of the market and avoid risk and opportunity cost.
Market Maker "Hot Paths":
Latency-critical data and execution pathways within technology systems.
Optimized for speed and efficiency to maintain competitive edge and manage risk.
Key Components:
Market Data Ingestion:
Rapidly receiving and processing external market data.
Determines current fair price of an asset.
Pricing and Opportunity Evaluation:
Calculates optimal bid and ask prices and determines volumes.
Manages inventory risk and minimizes directional exposure.
Order Entry and Execution:
- Low-latency systems for placing limit orders quickly.
Internal Communication and Balance Checks:
Swift communication between internal systems.
Ensures order validity and firm's exposure within acceptable limits.
Hedging:
Quickly executes offsetting trades to hedge positions.
Manages inventory acquired from customer trades.
Market Event Distribution:
Promptly publishes internal market events and acknowledgments.
Distributes to relevant internal systems and external clients.
Optimization Techniques:
Cluster Placement Groups (CPGs):
Used to reduce physical network distance between components.
Place CEX and MM workloads in CPGs to localize instance placement on the same network spine in an Availability Zone.
Benefits:
- Average 37% reduction in P50 and 39% reduction in P90 UDP roundtrip time latencies compared to instances outside CPGs.
Latency Sources:
- Extends from the network into the instance, through the network card, kernel, operating system, and application layers.
Latency Optimization Trade-offs:
Using Cluster Placement Groups (CPGs):
- Minimizes network latencies.
- Not compatible with fully resilient Multi-AZ architectures.
Matching Engines as State Machines:
Typically modeled as deterministic state machines.
The same sequence of input events results in the same predictable state.
Critical for consistency in CEX order books and financial systems.
Core execution is single-threaded to avoid non-determinism and ensure linear processing.
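The determinism property can be demonstrated with a very small price-time priority matching engine. This is a sketch under simplifying assumptions (limit orders only, integer prices in ticks, no cancels, a linear scan for the best price); the class and function names are hypothetical and do not describe any particular exchange.

```python
from collections import deque

class MatchingEngine:
    """Single-threaded limit order book with price-time priority."""

    def __init__(self):
        self.bids = {}    # price -> deque of [order_id, remaining_qty]
        self.asks = {}
        self.trades = []  # (maker_id, taker_id, price, qty)

    def submit(self, order_id, side, price, qty):
        if side == "buy":
            book, opposite, best = self.bids, self.asks, min
            crosses = lambda level: level <= price
        else:
            book, opposite, best = self.asks, self.bids, max
            crosses = lambda level: level >= price
        while qty > 0 and opposite:
            level = best(opposite)          # best opposing price
            if not crosses(level):
                break
            queue = opposite[level]
            maker = queue[0]                # oldest order at that price
            fill = min(qty, maker[1])
            self.trades.append((maker[0], order_id, level, fill))
            maker[1] -= fill
            qty -= fill
            if maker[1] == 0:
                queue.popleft()
            if not queue:
                del opposite[level]
        if qty > 0:                         # rest the remainder on the book
            book.setdefault(price, deque()).append([order_id, qty])

def replay(events):
    """Feed an input sequence to a fresh engine and return its final state."""
    engine = MatchingEngine()
    for event in events:
        engine.submit(*event)
    return engine.trades, engine.bids, engine.asks

events = [("s1", "sell", 101, 5), ("s2", "sell", 100, 5), ("b1", "buy", 101, 7)]
first = replay(events)
second = replay(events)  # same inputs in the same order give the same state
```

Because the state depends only on the ordered input sequence, replicating that sequence to other nodes is enough to reproduce the order book, which is what the replication approaches described next rely on.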
Distributed State Machines for High Availability and Scalability:
Implemented across several instances for high availability and scalability.
Replicate logic across multiple nodes to maintain operation even if some nodes fail.
Horizontal scaling by distributing workloads across multiple replicas.
Consistency maintained through robust consensus protocols (e.g., Raft).
Messaging, Consistency, and Low Latency:
Consensus protocols combined with messaging protocols for minimal latency and consistency.
Customers may design custom messaging layers or use mature offerings like Aeron or Chronicle.
Messaging layers optimized for low-latency, high-throughput communication.
Enables deterministic replication of state machine inputs with minimal latency overhead.
Aeron: Ultra-Low-Latency Messaging Middleware
Ultra-Low Latency
Microsecond-Range Latency: Designed to deliver latencies in the microsecond range, critical for high-frequency trading.
Predictable Performance: Ensures consistent performance under varying loads.
High Throughput
Millions of Messages per Second: Capable of handling a high volume of messages reliably.
Scalable Architecture: Supports scaling to meet the demands of large-scale trading systems.
Reliability and Resilience
24/7 Availability: Built for continuous operation with features ensuring high availability.
Fault Tolerance: Includes mechanisms for automatic failover and recovery.
Cost-Effectiveness
Open-Source: Free to use, reducing licensing costs.
Efficient Resource Utilization: Minimizes hardware and operational costs.
Key Features and Components
Aeron Transport:
Core Functionality: High-performance, low-latency message transport.
Protocols: Utilizes UDP (unicast and multicast) and Inter-Process Communication (IPC).
Supported APIs: Java, C/C++, and .NET.
Aeron Archive:
Purpose: Extension for message recording and replay.
Capabilities: Persists streams to disk at full message rates, ensuring data recovery and seamless service reconnection without message loss.
Aeron Cluster:
Framework: For building fault-tolerant, distributed services.
Consensus Algorithm: Uses the Raft algorithm.
Features: Automatic failover and strong data consistency, ideal for exchanges and transactional workflows.
Agrona & Simple Binary Encoding (SBE):
Agrona: Offers high-performance data structures.
SBE: Provides a compact, low-overhead binary message format to minimize message size and CPU cycles.
Benefits for Trading Firms
Ultra-Low Latency:
Performance: Delivers sub-100 microsecond latency in cloud environments and below 20 microseconds on physical hardware.
Critical for: High-frequency trading.
High Throughput:
- Capacity: Capable of handling millions of messages per second reliably.
Reliability and Resilience:
Availability: Ensures 24/7 availability.
Features: Automatic failover, instant recovery, and robust data loss handling mechanisms.
Cost-Effectiveness:
- Reduced Costs: Lowers hardware and operational costs, avoiding vendor lock-in.
Proven Adoption:
Users: Major financial institutions and platforms globally, including Coinbase, Man Group, and SIX Interbank Clearing.
Applications: Powers real-time trading and payment systems.
Main Components of the Aeron System:
Media Driver:
Function: Core engine managing buffer handling, network I/O, and data transfer.
Deployment: Can run as an embedded thread within an application or as a separate process.
Benefit: Minimizes impact of events like garbage collection pauses on latency.
Aeron Client:
Interface: Application-facing API for developers.
Operations: Clients create Publications (for sending messages) and Subscriptions (for receiving messages).
Channels and Streams:
Channel: Defines the communication path.
Stream ID: Integer identifier used to multiplex different logical message flows over the same channel.
Log Buffers:
Storage: Messages are written into a sequence of memory-mapped files called log buffers.
Design: Allows for direct memory access (off-heap), bypassing JVM garbage collection for predictable, low-latency access.
Real-world use cases
Coinbase: Built its highly resilient, cloud-native trading infrastructure using Aeron Cluster to process high transaction volumes with low latency.
EDX Markets (EDXM): Leveraged Aeron Cluster for fault tolerance and ultra-low latency within its trading platform.
IMMIX: Adopted Aeron Cluster to construct a high-performance digital asset trading system and sequencer.
HSBC: Integrated Aeron to enhance its electronic trading systems.
Man Group: Utilizes Aeron in its low-latency foreign exchange trading system for rapid quote stream processing.
SIX Interbank Clearing Ltd: Deployed Aeron clusters for its Swiss instant payment platform.
BLOX Markets: An upcoming U.S. stock trading platform focused on retail investors is integrating Aeron systems.
Chicago Mercantile Exchange (CME): Sponsored the Aeron open-source development project for internal messaging.
Kepler Cheuvreux (KCx): Partnered with Adaptive to build an event-driven stock trading platform based on Aeron.
DriveWealth: Brokerage services leader that adopted Aeron to rearchitect its trading infrastructure, achieving exchange-grade performance.
Talos: Leveraged the Aeron transport protocol within its digital asset ecosystem platform for high-performance trading and message serialization processing.
Message Flow and Mechanics
Publication:
Method: Publisher client uses the offer() method to write a message into a log buffer in shared memory.
Operation: Non-blocking; returns a status code indicating success or if backpressure is applied.
Media Driver Action:
- Function: Continuously reads from the publisher's log buffer and transfers data to the appropriate destination(s) via the configured channel (UDP or IPC).
Subscription and Polling:
Method: Subscriber client calls the poll() method to read messages from its associated log buffer.
Model: Polling model gives the application control over when it processes data, maintaining predictable latency by avoiding interrupt-driven callbacks.
Reliability:
Layer: Aeron implements its own reliability layer over UDP.
Mechanism: Detects message loss and uses Negative Acknowledgements (NAKs) to request retransmission of missing fragments, ensuring reliable, in-order delivery with minimal overhead.
Lock-Free Design:
Technique: All internal operations are lock-free, using techniques like atomic tail updates in the log buffers.
Benefit: Avoids thread contention and maximizes throughput.
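The non-blocking offer and poll semantics described above can be modeled with a toy buffer. This is not the Aeron API: the class, the status constant, and the byte-capacity rule are hypothetical, and the sketch leaves out the media driver, networking, and shared memory entirely.

```python
from collections import deque

BACK_PRESSURED = -1  # hypothetical status code for this sketch

class ToyLogBuffer:
    """In-process stand-in for a publication/subscription pair."""

    def __init__(self, capacity_bytes: int):
        self.capacity = capacity_bytes
        self.used = 0
        self.position = 0          # total bytes accepted so far
        self._messages = deque()

    def offer(self, payload: bytes) -> int:
        """Non-blocking write: returns the new position or BACK_PRESSURED."""
        if self.used + len(payload) > self.capacity:
            return BACK_PRESSURED  # caller decides whether to retry or drop
        self._messages.append(payload)
        self.used += len(payload)
        self.position += len(payload)
        return self.position

    def poll(self, handler, message_limit: int) -> int:
        """Deliver up to message_limit messages; returns how many were handled."""
        delivered = 0
        while self._messages and delivered < message_limit:
            payload = self._messages.popleft()
            self.used -= len(payload)
            handler(payload)
            delivered += 1
        return delivered
```

The caller, not the transport, chooses when to retry a back-pressured offer and how much work to do per poll, which keeps control of latency in the application's own loop.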
Atomic tail updates
Networking Protocols:
Atomic tail updates are crucial in high-performance networking and database systems.
They efficiently manage data structures like ring buffers or logs.
Atomically updating the tail pointer ensures concurrent interaction without data corruption.
System Upgrades:
A separate use of the same principle: in certain Linux distributions (atomic distros), atomic updates involve upgrading the entire OS image in one transaction.
The system only boots into the new image if the update is successful, ensuring integrity.
This is a system-wide update leveraging the principle of atomicity.
Raft Consensus Algorithm:
A distributed systems algorithm that ensures servers agree on data state, emphasizing understandability through leader election, log replication, and node roles (Follower, Candidate, Leader).
Leader Election: Nodes transition from Follower to Candidate if no Leader is detected, initiating an election to select a new Leader.
Log Replication: The Leader logs client requests and sends them to Followers. Once a majority confirms, the entry is committed, ensuring all nodes agree on operation sequences.
Node States:
Follower: Accepts commands and heartbeats from the Leader.
Candidate: Seeks leadership during elections.
Leader: Coordinates replication and client requests.
Simplified Workflow:
Start: All nodes begin as Followers with a randomized election timeout.
Election: A Follower times out, becomes a Candidate, increments the term, and requests votes.
Vote: Followers vote for the Candidate if they haven’t voted in the current term and the Candidate’s log is up-to-date.
Leader: A Candidate with a majority vote becomes Leader and sends heartbeats.
Replication: Clients send requests to the Leader, which logs and replicates them to Followers.
Commit: Once a majority of nodes (counting the Leader) have logged the entry, the Leader commits it and applies it to its state machine, instructing Followers to do the same.
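Two rules from the workflow above, vote granting and majority commit, are compact enough to sketch directly. The function names and argument shapes are assumptions for illustration, and the sketch omits parts of Raft such as term comparison on incoming requests and the rule that a Leader only commits entries from its own term by counting replicas.

```python
def should_grant_vote(voted_for, candidate_id, candidate_last, my_last) -> bool:
    """Decide whether a Follower grants its vote in the current term.

    candidate_last and my_last are (last_log_term, last_log_index) tuples.
    """
    not_voted_elsewhere = voted_for is None or voted_for == candidate_id
    log_up_to_date = candidate_last >= my_last  # compares term, then index
    return not_voted_elsewhere and log_up_to_date

def commit_index(leader_last_index: int, follower_match_indexes) -> int:
    """Highest log index stored on a majority of nodes, Leader included."""
    indexes = sorted([leader_last_index, *follower_match_indexes], reverse=True)
    majority = len(indexes) // 2 + 1
    return indexes[majority - 1]

# Five-node cluster: three nodes hold index 5 or higher, so 5 is committed.
committed = commit_index(7, [7, 5, 4, 3])
```

Sorting the indexes and taking the majority-th largest gives the newest entry that at least a majority already holds.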
Key Benefits:
Understandability: Easier to implement and grasp compared to older protocols like Paxos.
Strong Consistency: Ensures all nodes see operations in the same order (linearizability).
Fault Tolerance: Elects new leaders to handle server failures.
Applications:
Databases: CockroachDB, YugabyteDB.
Cluster Managers: HashiCorp Consul.
Message Queues: Apache Kafka with KRaft.
Multi-AZ Deployments for High Availability and Disaster Recovery:
Consensus and messaging solutions provide high availability within a single Availability Zone and disaster recovery across multiple Availability Zones or Regions.
Achieved through log replication to secondary targets or persistent storage layers.
Use storage services like Amazon S3, Amazon FSx, and distributed databases like Amazon Aurora and Amazon Aurora DSQL.
Raft log replication modes vary: fully synchronous, near-synchronous, or asynchronous.
Replicated data useful for audit and compliance due to containing the entire history of system interactions.
CPGs and Amazon EC2 Capacity Fulfilment Considerations:
Using CPGs to optimize placement and colocate EC2 instances.
Manage trade-offs with capacity fulfilment.
CPGs create an effective deployment boundary beneath a single Availability Zone network spine, reducing the available pool of Amazon EC2 capacity.
Impact varies by the size of Availability Zone: mitigated in large Regions and Availability Zones, heightened in smaller ones.
Manage risks by reserving Amazon EC2 capacity using On-Demand Capacity Reservations.
Benefits include reduced latency and improved capacity assurance with shared CPGs.
CEX and MM Network Boundary Latency:
Critical for HFTs seeking low-latency colocation with digital asset exchanges.
Customers can’t control these boundaries; latency and jitter depend on placement and network design.
CPGs for Latency Optimization:
Cluster Placement Groups (CPGs) are fundamental for latency-optimized placement.
Shared CPGs extend colocation across different AWS accounts.
Aligns with cloud colocation principles within a Region.
Connectivity Patterns:
- Digital asset exchanges offer various connectivity patterns with latency characteristics ranging from 50–200 microseconds to over a millisecond.
Optimal Connectivity:
HFT customers aim for the lowest-latency path to exchanges, avoiding CDNs and load balancers.
Ideally interface with exchange endpoints directly on EC2 order gateways using public or private IPs across a VPC peering connection.
Optimize for protocol choice, favoring FIX over REST or WebSockets to minimize protocol-induced latency.
Latency Monitoring:
HFTs monitor latency to CEX endpoints using layered testing:
Basic HTTP/TCP pings.
End-to-end latency.
Application-level monitoring for market events and order execution.
Continuously optimize placement, balancing latency, availability, and instance selection due to potential movement of CEX endpoints and HFT instances within an Availability Zone.
The Financial Information eXchange (FIX) Protocol
Core Concepts and Mechanics
Structured Messages: FIX protocol uses a tag-value pair format for structured messages.
Tag-Value Pairs: Each data field is identified by a unique integer tag paired with a value.
Key Operational Layers:
Session Layer: Manages the connection between two parties, ensuring reliable message delivery, sequencing, and recovery from disruptions using mechanisms like heartbeats and gap fills.
Application Layer: Defines the content of business-related messages such as new orders, execution reports, order cancellations, and trade allocations.
Widespread Adoption and Use Cases
Origin: Originally developed in 1992 for U.S. equity trading between Salomon Brothers and Fidelity Investments.
Asset Classes Supported: Expanded to support virtually all asset classes, including fixed income, foreign exchange (FX), derivatives, commodities, and digital assets/cryptocurrencies.
Primary Use Cases:
Order Routing and Execution: Seamless transmission of orders and execution reports between market participants.
Market Data Dissemination: Distributing real-time price quotes, trade volumes, and market depth information.
Post-Trade Processing: Handling trade allocations, confirmations, and settlement details, streamlining back-office operations and regulatory reporting.
Algorithmic and High-Frequency Trading (HFT): Structured, low-latency nature makes it well-suited for automated trading strategies, with optimized binary encodings like Simple Binary Encoding (SBE) and FIX Adapted for STreaming (FAST) addressing extreme performance needs.
How the FIX Protocol Works
- Message-Based Standard: Structured around a series of tag-value pairs for interoperability and efficiency.
Message Structure: Tag-Value Pairs:
Integer Tag: Every piece of data within a FIX message is represented by an integer tag.
Equals Sign: Followed by an equals sign.
Data Value: Then the data value (e.g., 35=D for New Order Single).
Components of a FIX Message:
Header: Contains session-level fields like BeginString, MsgType, SenderCompID, and TargetCompID.
Body: Contains core application-level business data like Symbol, OrderQty, Price, and Side.
Trailer: Contains the CheckSum to verify message integrity.
Operational Layers
The Session Layer:
- Manages the reliable and continuous connection between two counterparties.
Key mechanisms:
Heartbeats: Periodic messages to confirm the connection is alive.
Sequence Numbers: Unique, incrementing numbers for each message to detect missing messages.
Resend Request: If a sequence gap is detected, the receiver asks the counterparty to resend the missing messages; the sender may retransmit them or skip them with a "gap fill".
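The sequence-number check behind gap detection can be sketched as a single decision function. The function name and the returned action labels are hypothetical; a real FIX engine also handles duplicates flagged as possible resends, logouts, and sequence resets, none of which is shown.

```python
def check_sequence(expected_seq: int, received_seq: int):
    """Classify an inbound message by its sequence number (tag 34).

    Returns (action, detail):
      ("accept", next_expected)        message is in order
      ("resend_request", (begin, end)) gap detected, request the missing range
      ("too_low", None)                sequence is lower than expected
    """
    if received_seq == expected_seq:
        return ("accept", expected_seq + 1)
    if received_seq > expected_seq:
        # Messages expected_seq .. received_seq - 1 were never seen.
        return ("resend_request", (expected_seq, received_seq - 1))
    return ("too_low", None)
```

For example, if message 10 is expected and message 13 arrives, the receiver would request messages 10 through 12 before processing anything newer.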
The Application Layer:
Defines the actual business content of the messages.
Dictates valid order types, execution reports, trade confirmations, and market data requests.
Example: New Order - Single (35=D)
Buy order for 100 shares of Apple (AAPL) at $150.00. The field delimiter (SOH, byte 0x01) is shown as | for readability.
8=FIX.4.2|9=100|35=D|34=10|49=BUYER|56=SELLER|52=20251211-10:00:00.000|11=ORD1001|21=1|38=100|40=2|54=1|55=AAPL|44=150.00|10=000|
8=FIX.4.2: FIX Version (e.g., 4.2).
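9=100: Body Length (number of bytes after the BodyLength field, up to and including the delimiter before the CheckSum field; the value shown here is illustrative rather than computed).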
35=D: Message Type (D = New Order - Single).
34=10: Sequence Number (message number 10 in the session).
49=BUYER: Sender ID (Your firm).
56=SELLER: Target ID (Broker/Exchange).
52=20251211-10:00:00.000: Timestamp.
11=ORD1001: ClOrdID (Client Order ID).
21=1: HandlInst (1 = Automated execution, private, no broker intervention).
38=100: OrderQty (Quantity: 100 shares).
40=2: OrdType (2 = Limit Order).
54=1: Side (1 = Buy).
55=AAPL: Symbol (Apple Inc.).
44=150.00: Price (Limit price: $150.00).
10=000: Checksum (sum of all preceding bytes modulo 256, written as three digits; 000 here is a placeholder).
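The BodyLength and CheckSum fields in the example are derived from the rest of the message, so they are easiest to understand by computing them. The sketch below builds the same New Order - Single using the real SOH delimiter; `build_fix` and `parse_fix` are hypothetical helper names, and no session handling or field validation is attempted.

```python
SOH = "\x01"  # the real FIX field delimiter, shown as | in the example

def build_fix(begin_string: str, fields) -> str:
    """Assemble a FIX message, computing BodyLength (9) and CheckSum (10)."""
    body = "".join(f"{tag}={value}{SOH}" for tag, value in fields)
    head = f"8={begin_string}{SOH}9={len(body.encode('ascii'))}{SOH}"
    checksum = sum((head + body).encode("ascii")) % 256
    return f"{head}{body}10={checksum:03d}{SOH}"

def parse_fix(message: str):
    """Split a message into (tag, value) pairs in wire order."""
    return [tuple(field.split("=", 1)) for field in message.split(SOH) if field]

order_fields = [
    ("35", "D"), ("34", "10"), ("49", "BUYER"), ("56", "SELLER"),
    ("52", "20251211-10:00:00.000"), ("11", "ORD1001"), ("21", "1"),
    ("38", "100"), ("40", "2"), ("54", "1"), ("55", "AAPL"), ("44", "150.00"),
]
message = build_fix("FIX.4.2", order_fields)
readable = message.replace(SOH, "|")
parsed = dict(parse_fix(message))
```

Printing `readable` gives the same layout as the example above, with the computed length and checksum in place of the placeholder values.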
Simple Binary Encoding (SBE)
A specialized, open-standard binary encoding used heavily in electronic trading and high-frequency trading (HFT) environments.
Designed to achieve extremely high processing speeds and predictable (deterministic) latency for mission-critical trade messages.
Why SBE is Used for Trade
In trading, every microsecond counts.
Traditional data formats like XML or JSON are too slow and bulky.
SBE was developed by the FIX Trading Community to address the need for a protocol that maximizes CPU efficiency and minimizes latency during the encoding and decoding process.
SBE optimizes the trade message path from the moment an order is created to when it hits the exchange.
Key Principles
Fixed-Offset Binary Layout
Core concept of SBE.
Assigns a precise memory location (offset) to every single field in a trade message.
Impact on Trading: Direct memory access avoids complex parsing logic, reducing processing time and making latency highly predictable.
CPU Optimization (Cache Friendly)
Designed to be "cache-friendly" for modern computer processors.
Impact on Trading: Fixed-offset design minimizes conditional branching, allowing CPU's internal prediction mechanisms to work effectively.
Simplicity and Code Generation
Uses a structured, machine-readable XML schema to define message layouts.
Tools automatically generate the necessary code (in Java, C++, C#, etc.) to read and write these specific trade messages.
Impact on Trading: Code generation removes human error and ensures both sender and receiver interpret the binary data in the exact same, highly efficient manner.
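The fixed-offset idea can be illustrated with the standard `struct` module. The layout below is a hypothetical order message invented for this sketch, not a real SBE schema, and it omits the SBE message header and the generated codecs that a real deployment would use.

```python
import struct

# Hypothetical fixed layout, little-endian, no padding:
#   offset 0   uint64  order_id
#   offset 8   int64   price in ticks
#   offset 16  uint32  quantity
#   offset 20  uint8   side (1 = buy, 2 = sell)
ORDER = struct.Struct("<QqIB")
PRICE_OFFSET = 8
QTY_OFFSET = 16

def encode_order(order_id: int, price_ticks: int, qty: int, side: int) -> bytearray:
    """Write every field at its fixed offset in a preallocated buffer."""
    buffer = bytearray(ORDER.size)
    ORDER.pack_into(buffer, 0, order_id, price_ticks, qty, side)
    return buffer

def read_price(buffer) -> int:
    """Read one field directly, without parsing anything before it."""
    return struct.unpack_from("<q", buffer, PRICE_OFFSET)[0]

def read_qty(buffer) -> int:
    return struct.unpack_from("<I", buffer, QTY_OFFSET)[0]

encoded = encode_order(order_id=1001, price_ticks=15000, qty=100, side=1)
```

Unlike tag-value FIX, no scanning for delimiters is needed: the reader jumps straight to the byte range of the field it wants, which is why decode time is nearly constant.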
SBE vs. FAST for Trade
SBE (Order Entry)
Often preferred for sending actual orders to an exchange.
Stateless nature and deterministic latency make it ideal for critical, point-to-point communication of order submission.
FAST (Market Data)
Primarily used for receiving market data (prices, quotes).
Strength is massive compression of repetitive data streams (delta encoding), which is perfect for broadcasting millions of price updates efficiently.
FIX Adapted for STreaming (FAST)
- Used in trading environments primarily for market data distribution, not typically for sending trade orders (order entry).
Purpose and Use Case
FAST was designed by the FIX Trading Community specifically to optimize the transport of high-volume, real-time market data feeds from exchanges to trading firms.
High Volume Data: Exchanges generate millions of price updates, quotes, and market depth messages every second. Sending these in the traditional text-based FIX format consumes too much bandwidth and causes latency.
Compression Efficiency: FAST acts as a powerful compression algorithm that significantly reduces the size of these messages by eliminating repetitive data and using a template-based system.
Distribution (Fan-out): It is highly optimized for one-to-many communication (one exchange sending data to many subscribers). This makes it ideal for the "market data" side of trading operations.
How FAST Works for Trade Data
FAST uses a "stateful" approach where the receiver remembers values from previous messages within a session.
Analogy: If the order book updates, and only the price of a specific stock changes while the currency, exchange, and time remain the same, FAST only sends the new price value.
Benefit: This drastically reduces bandwidth usage and overall network latency, ensuring that traders receive price updates as quickly as possible to make informed decisions.
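The stateful idea in the analogy above can be sketched with dictionaries. This is only a conceptual model: it is not the FAST wire format, which uses templates, field operators, and a compact binary encoding, and the function names are hypothetical.

```python
def encode_delta(previous: dict, current: dict) -> dict:
    """Keep only the fields whose values changed since the previous message."""
    return {key: value for key, value in current.items()
            if previous.get(key) != value}

def decode_delta(previous: dict, delta: dict) -> dict:
    """Rebuild the full message from the remembered state plus the delta."""
    merged = dict(previous)
    merged.update(delta)
    return merged

last_sent = {"symbol": "BTC-USD", "currency": "USD", "venue": "X", "price": 100}
update = {"symbol": "BTC-USD", "currency": "USD", "venue": "X", "price": 101}

wire = encode_delta(last_sent, update)    # only the price travels
rebuilt = decode_delta(last_sent, wire)   # receiver restores the full view
```

The receiver can only decode correctly if it holds the same previous state as the sender, which is the dependency that makes this approach less attractive for order entry.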
FAST vs. Order Entry (Why SBE is Different)
While FAST optimizes data reception, it is generally not the primary protocol for sending trade orders (order entry).
Order Entry Requires Predictability: Sending a trade order is a point-to-point, highly critical interaction where deterministic latency is paramount. The receiving exchange needs every order to be complete and instantly parsable, without relying on the state of a previous message.
SBE is Preferred for Orders: Simple Binary Encoding (SBE) is often preferred for order entry because it uses a fixed-offset binary layout that allows for immediate, non-conditional parsing, providing more predictable ultra-low latency compared to FAST's stateful compression.
Market Data and Multicast:
CEX Matching Engine:
- Continuously updates the order book, generating a dynamic view of market activity.
Market Data Transformation:
Raw order book state is transformed into market data through aggregation and normalization.
Exchange extracts key metrics like best bid and ask prices at various levels (Level 1, Level 2, Level 3).
Level 1 Data: Provides real-time information on the best bid (highest price a buyer is willing to pay) and best ask/offer (lowest price a seller is willing to accept). It also typically includes the last traded price and volume.
Level 2 Data: Offers a deeper view of the market by displaying multiple bid and ask prices at various price levels (the order book). This depth of market (DOM) information helps traders gauge overall supply and demand.
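Level 3 Data: Typically reserved for market makers and exchange members. It includes all the information from Level 1 and Level 2, plus the ability to interact with the order book and execute trades directly. On exchange data feeds, Level 3 usually refers to order-by-order (market-by-order) data that shows each individual resting order rather than aggregated price levels.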
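The aggregation step can be sketched as follows, starting from a list of individual resting orders (the raw order book state) and deriving the Level 2 and Level 1 views. The order tuples and the `level2`/`level1` helpers are assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Each resting order: (order_id, side, price, quantity). Values are made up.
Order = Tuple[int, str, float, float]

orders: List[Order] = [
    (1, "bid", 100.0, 2.0),
    (2, "bid", 100.0, 1.0),
    (3, "bid", 99.5, 4.0),
    (4, "ask", 100.5, 1.5),
    (5, "ask", 101.0, 3.0),
]


def level2(orders: List[Order]) -> Dict[str, List[Tuple[float, float]]]:
    """Aggregate quantity per price level; bids high to low, asks low to high."""
    totals: Dict[str, Dict[float, float]] = {
        "bid": defaultdict(float),
        "ask": defaultdict(float),
    }
    for _order_id, side, price, qty in orders:
        totals[side][price] += qty
    return {
        "bid": sorted(totals["bid"].items(), reverse=True),
        "ask": sorted(totals["ask"].items()),
    }


def level1(book: Dict[str, List[Tuple[float, float]]]) -> Dict[str, Tuple[float, float]]:
    """Top of book: the best bid level and the best ask level."""
    return {"best_bid": book["bid"][0], "best_ask": book["ask"][0]}


book = level2(orders)
top = level1(book)
```

Here the two bids at 100.0 collapse into one Level 2 entry of size 3.0, and Level 1 is simply the first entry on each side.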
Market Orders:
- Market data is generated for market orders, which are immediately matched to the best available bid or ask price.
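As a rough sketch of that matching step, the function below fills a market buy order against aggregated ask levels, best price first. It is a toy model under stated assumptions (no time priority within a level, no fees, made-up quantities), not how any particular CEX implements its matching engine.

```python
from typing import List, Tuple


def match_market_buy(asks: List[Tuple[float, float]], quantity: float):
    """Fill a market buy against ask levels sorted from lowest price upward.

    Returns (fills, remaining_asks, unfilled_quantity).
    """
    fills: List[Tuple[float, float]] = []
    remaining: List[Tuple[float, float]] = []
    for price, available in asks:
        take = min(available, quantity)
        if take > 0:
            fills.append((price, take))
            quantity -= take
        if available > take:
            remaining.append((price, available - take))
    return fills, remaining, quantity


asks = [(100.5, 1.5), (101.0, 3.0)]
fills, remaining, unfilled = match_market_buy(asks, 2.0)
```

Each fill would become a trade event in the market data stream, and the changed ask levels would be published as book updates.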
Multicast on AWS Transit Gateway:
Required to replicate and simulate multicast delivery within or between VPCs.
Smaller CEXs may use shared transit gateways for market data distribution to MMs.
Transit Gateway is convenient but not designed for high-frequency, low-latency trading and has scaling limits for larger CEXs.
Market Data Distribution:
Traditional Markets:
- Use UDP multicast for distributing market data to HFT MMs via optimized, physically bounded, colocation networks.
Cloud-hosted CEXs:
- Predominantly use TCP unicast WebSocket APIs for real-time market data distribution.
REST APIs:
Used for on-demand or periodic data retrieval (e.g., historical trades, candlestick data).
Suitable for non-real-time applications like portfolio trackers.
Polling REST endpoints introduces higher latency and stricter rate limits.
FIX Gateways:
Some CEXs offer real-time market data through FIX gateways for institutional MMs.
Provides message standardization and lower-latency access.
FIX endpoints are often hosted on dedicated cloud infrastructure for specific institutional MMs, hedge funds, and prop trading firms.
Precision Time and Fair and Equal Order Processing:
Importance of Fair and Equal Access:
Crucial for digital asset exchanges, extending beyond network infrastructure to application components.
Regulatory frameworks like MiFID II enforce fairness in traditional markets, and digital asset exchanges follow similar principles.
Fair Order and Market Event Sequencing:
Critical before and after trade matching.
Accurate timing is core to fair order sequencing, allowing CEXs to timestamp messages across distributed components.
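One way to picture timestamp-based sequencing is a merge of per-gateway message streams into a single global order. The sketch below assumes each gateway stamps messages on arrival and delivers them in time order; the `Message` fields, gateway names, and tie-break rule are all hypothetical.

```python
import heapq
from typing import Iterable, List, NamedTuple


class Message(NamedTuple):
    timestamp_ns: int   # receive time stamped at the gateway
    gateway: str        # hypothetical gateway identifier
    seq: int            # per-gateway sequence number
    payload: str


def sequence(*streams: Iterable[Message]) -> List[Message]:
    """Merge per-gateway streams (each already time-ordered) into one order.

    Ties on timestamp are broken by (gateway, seq) so the result is
    deterministic; that tie-break is an assumption made for this example.
    """
    return list(
        heapq.merge(*streams, key=lambda m: (m.timestamp_ns, m.gateway, m.seq))
    )


gw_a = [Message(1_000, "A", 1, "buy 1 @ 100"), Message(1_300, "A", 2, "cancel 1")]
gw_b = [Message(1_100, "B", 1, "sell 2 @ 101"), Message(1_300, "B", 2, "buy 5 @ 99")]

ordered = sequence(gw_a, gw_b)
```

The result is only as fair as the clocks behind the timestamps: if two gateways disagree by more than the gap between two messages, the merged order can be wrong.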
Precision Time for HFT MMs:
Enables more accurate processing of events and generation of signals.
Informs strategy execution and risk control.
Historical Cloud Timing Services:
- Achieved accuracy to hundreds of microseconds or milliseconds, insufficient for HFT strategies requiring low-digit microsecond accuracy.
MiFID II (Markets in Financial Instruments Directive II)
is an EU directive, accompanied by the MiFIR regulation, that has applied since January 2018.
Aims to create fairer, more transparent, and efficient financial markets.
Boosts investor protection and competition across Europe.
Covers more instruments and tightens rules on conduct, transparency, and reporting.
Key aspects:
Detailed cost/charge disclosures to clients
Stricter suitability assessments including sustainability preferences
Best execution obligations for client trades
Enhanced transaction reporting to regulators for market surveillance
Rules for new trading venues like OTFs (Organised Trading Facilities) and stricter rules for existing ones
Key Objectives
Investor Protection: Ensuring clients get the best result (best execution) and products suitable for them, with clear cost/risk info.
Transparency: Mandating detailed reporting on trading activity (prices, volumes) for regulators (ESMA) and clients.
Market Efficiency: Fostering competition, reducing opaque trading (like dark pools), and creating harmonized EU-wide rules.
Core Requirements & Changes
Costs & Charges: Firms must disclose all costs (fees, commissions) upfront to clients.
Best Execution: Firms must take all steps to get the best possible outcome for clients' trades (price, speed, cost).
Product Governance: Rules for manufacturers and distributors to identify the "target market" for each product.
Trading Venues: Creation of new venues (OTFs) and stricter rules for existing ones.
Data Reporting: Increased transaction reporting to regulators (ESMA) for market abuse surveillance.
Who it Applies To
Investment firms
Banks
Asset managers
Trading venues (including new OTFs)
Manufacturers and distributors of financial products
Organised Trading Facility (OTF)
is a European financial venue for bonds, structured products, emission allowances, and derivatives.
Acts as a multilateral system where buying/selling interests meet to form contracts.
OTF operators have discretion in execution, allowing for matched principal trading.
Creates transparency for non-liquid instruments under MiFID II rules.
Key Characteristics:
Product Focus: Primarily for non-equity instruments like bonds, structured finance products, emission allowances, and derivatives (MiFID II instruments).
Multilateral System: Brings together multiple third-party buying and selling interests, similar to other trading venues.
Discretionary Execution: The operator of the OTF has discretion in deciding how to execute trades, unlike the non-discretionary rules of Multilateral Trading Facilities (MTFs).
Matched Principal Trading: Operators can act as principal to match buy and sell orders in non-equity instruments, but only with client consent; dealing on own account beyond that is limited to sovereign debt for which there is no liquid market.
Regulatory Framework: Created under MiFID II/MiFIR, requiring authorization and subject to strict market abuse rules and transparency requirements.
How it Differs from MTFs (Multilateral Trading Facilities):
Discretion: OTFs allow operator discretion; MTFs are non-discretionary.
Products: OTFs focus on fixed income and derivatives; MTFs can trade equities and other instruments.
Principal Trading: OTFs permit limited matched principal trading, which is generally banned on MTFs.
Multilateral Trading Facilities (MTFs)
are regulated electronic platforms offering alternatives to traditional stock exchanges.
Connect buyers and sellers under rules like Europe's MiFID II.
Facilitate trading in various assets (stocks, bonds, derivatives), often for less liquid or over-the-counter (OTC) products.
Provide efficient, non-discretionary trading for market operators and banks.
Differ from traditional exchanges in applying less stringent vetting to instruments.
Key Characteristics of MTFs:
Electronic & Multilateral: Bring together multiple third-party buying and selling interests in financial instruments.
Alternative Venues: Serve as non-exchange venues, often for exotic or OTC products, increasing market liquidity.
Operator-Run: Operated by market operators or investment firms, not necessarily stock exchanges themselves.
MiFID Framework: Regulated under European directives like MiFID II, ensuring transparency and fairness.
Asset Classes: Can trade equities, bonds, derivatives, but have specific rules (e.g., no proprietary trading).
How They Differ from Regulated Markets:
Listing Process: Instruments on MTFs don't necessarily go through the extensive vetting and ongoing obligations required for traditional exchanges (Regulated Markets).
Discretion: MTFs operate under non-discretionary rules for trade matching, unlike Organised Trading Facilities (OTFs) which allow discretion for non-equities.
Examples:
Liquidnet Europe
Currenex MTF
UBS MTF
Currenex MTF
Currenex MTF (Multilateral Trading Facility) is State Street's electronic platform for institutional FX trading.
Offers tight spreads and deep liquidity from over 60 banks.
Provides anonymous all-to-all ECN access.
Allows professionals to execute trades with diverse order types, algorithms, and low-latency API connectivity.
Operates as a pure ECN (No Dealing Desk) system.
Focuses on efficient, cost-effective FX and metals trading for large players.
Liquidnet
is a technology-driven agency execution specialist and a subsidiary of the TP ICAP Group.
Connects institutional investors to a large, global pool of liquidity, primarily for block trades in equities, fixed income, and derivatives.
Core Services & Features:
Block Trading: Facilitates large-scale, anonymous trades between institutional investors, minimizing market impact costs and protecting trading information.
Global Network: Links over 1,000 asset management firms across 57 markets on six continents, managing trillions in assets collectively.
Technology & Platforms:
Dark Pools: Operates as an Alternative Trading System (ATS) or Multilateral Trading Facility (MTF), using private forums for trading.
Algorithmic Trading: Offers advanced algorithms like "Barracuda" and "SmartDark" to seek liquidity across internal and external venues while minimizing market footprint.
OMS/EMS Integration: Designed to integrate seamlessly with members' existing Order Management Systems (OMS) and Execution Management Systems (EMS).
Asset Classes:
Fixed Income: Solutions for secondary market trading and innovative electronic workflows for primary bond markets, including connectivity with syndicate banks.
Listed Derivatives: Expansion into US Equity Options to strengthen multi-asset capabilities.
Non-discretionary rules
are guidelines or requirements that leave no room for personal judgment or choice, demanding a specific action or outcome.
Common in finance (requiring client approval for trades), law (mandatory spending/actions), and trading systems (fixed buy/sell triggers like a 5/20 crossover).
In investments, it means a broker must get client consent for every trade, contrasting with discretionary accounts where the advisor decides.
In short, non-discretionary means following a strict protocol rather than exercising personal discretion.
Key Aspects:
Mandatory Compliance: Rules must be followed precisely, as seen in non-discretionary spending for necessities like food or housing.
Clear Triggers: Often involves objective criteria, such as a trading system's algorithm (e.g., 5-day average crossing 20-day average).
Client Control (Investing): In finance, the client retains the "last word," approving every trade, with the broker acting as an order-taker.
Contrast with Discretionary Rules:
Non-Discretionary: Strict adherence to predetermined rules or client instructions (e.g., "Buy 100 shares of XYZ when it hits $50").
Discretionary: Allows an advisor to use their judgment to make decisions (e.g., "Manage my portfolio for growth").
No Dealing Desk (NDD) system in forex trading
Client orders are passed directly to external liquidity providers without any in-house intervention or a "dealing desk" acting as a counterparty.
Ensures transparency and eliminates potential conflicts of interest between the broker and the trader.
How NDD Systems Work:
- NDD brokers serve as a link, using automated systems to find the best available prices from a network of liquidity providers.
Operates primarily through two mechanisms:
Straight-Through Processing (STP): Client orders are routed directly to liquidity providers, and the broker adds a small, fixed markup to the spread as their commission.
Electronic Communication Network (ECN): Traders' orders are matched directly with other participants in a decentralized marketplace (including banks, other brokers, and individual traders). ECN brokers typically charge a fixed commission per trade while providing raw, variable spreads from the market.
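The two pricing models can be compared with simple arithmetic. The sketch below assumes EUR/USD, where one pip on a standard lot of 100,000 units is worth 10 USD; the spread, markup, and commission figures are illustrative and are not quotes from any broker.

```python
PIP_VALUE_PER_LOT = 10.0  # USD per pip for one standard lot of EUR/USD (100,000 x 0.0001)


def stp_cost(raw_spread_pips: float, markup_pips: float, lots: float) -> float:
    """Cost of crossing the spread when the broker widens it by a fixed markup."""
    return (raw_spread_pips + markup_pips) * PIP_VALUE_PER_LOT * lots


def ecn_cost(raw_spread_pips: float, commission_per_lot: float, lots: float) -> float:
    """Cost of crossing the raw spread plus a fixed commission per lot."""
    return raw_spread_pips * PIP_VALUE_PER_LOT * lots + commission_per_lot * lots


# Illustrative figures only.
stp = stp_cost(raw_spread_pips=0.2, markup_pips=0.8, lots=2)
ecn = ecn_cost(raw_spread_pips=0.2, commission_per_lot=7.0, lots=2)
```

With these inputs the STP trade costs about 20 USD and the ECN trade about 18 USD; which model is cheaper depends entirely on the actual markup and commission.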
Advantages and Disadvantages:
Basis: No Dealing Desk (NDD)
Broker's Role: Facilitates trades without taking an opposing stance.
Conflict of Interest: No potential conflict of interest as the broker profits from volume/commissions, not client losses.
Order Execution: Automated, instant execution at real market prices, with no requotes.
Pricing: Variable spreads based on actual market conditions.
Transparency: High transparency due to direct market access and real-time pricing.
Potential Drawbacks: Spreads can widen during high volatility; commissions may make overall costs slightly higher; slippage is possible.
Products and Brokers:
- Traders often use specific platforms for NDD trading, with several reputable brokers offering this system.
NDD Brokers:
FP Markets: Praised for consistently low RAW spreads (often zero pips) and fast execution speeds.
Pepperstone: Known for excellent MT4 trading tools and competitive spreads on its Razor account.
IC Markets: Stands out for having some of the lowest average RAW spreads available (as low as 0.02 pips on EUR/USD).
FxPro: Offers the cTrader platform with RAW spreads, known for its clean interface and advanced charting tools.
Liquidnet Barracuda Algorithm
Liquidity-seeking trading algorithm
Combines dark aggregation with lit trading
Aims for high participation rates and minimal market impact
How the Barracuda Algorithm Works
- Utilizes opportunistic logic to capture liquidity
Dark Aggregation
Core component: dark aggregator
Seeks large, anonymous liquidity blocks in dark pools
Executes trades without revealing order size or intent
Reduces price impact
Lit Trading
Complements dark aggregation
Trades on public stock exchanges when advantageous
Employs opportunistic logic for best execution opportunities
Results and Benefits
High Participation Rate
- Achieves significant participation in market volume (e.g., 65.5% of lit market volumes)
Market Impact Savings
Saves on market impact costs
Reported savings of 10.0 bps from block executions
Performance Metrics
Outperforms 10% POV (Percentage of Volume) transaction cost model by 17.4 bps
Outperforms interval VWAP (Volume Weighted Average Price) by 4.8 bps
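For readers unfamiliar with these benchmarks, the sketch below shows how an interval VWAP comparison in basis points is typically computed for a buy order. The trades and fills are made-up numbers and have no connection to the Liquidnet figures quoted above.

```python
from typing import List, Tuple


def vwap(trades: List[Tuple[float, float]]) -> float:
    """Volume-weighted average price of (price, volume) pairs."""
    total_volume = sum(volume for _price, volume in trades)
    return sum(price * volume for price, volume in trades) / total_volume


def buy_outperformance_bps(avg_fill_price: float, benchmark_price: float) -> float:
    """Positive when a buy order filled below the benchmark, in basis points."""
    return (benchmark_price - avg_fill_price) / benchmark_price * 10_000


# Made-up market trades over the order's interval and made-up fills.
market_trades = [(100.0, 500), (100.2, 300), (100.4, 200)]
our_fills = [(100.0, 60), (100.1, 40)]

interval_vwap = vwap(market_trades)
our_avg = vwap(our_fills)
edge_bps = buy_outperformance_bps(our_avg, interval_vwap)
```

In this toy case the order fills at an average of 100.04 against an interval VWAP of 100.14, an outperformance of roughly 10 bps.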
Conclusion
Efficiently executes large orders
Minimizes disruption to market price
Liquidnet SmartDark Algorithm
Designed for institutional traders
Executes large equity orders in dark pools efficiently
Minimizes market impact and enhances price stability
How the SmartDark Algorithm Works
Operates within the Liquidnet Dark trading environment
Finds large, non-displayed blocks of liquidity
Prioritized Routing
Uses a prioritized routing system
Evaluates "yield and quality metrics"
Intelligently decides where to send parts of an order
Favors venues with better execution sizes and stable prices
Liquidity Seeking Strategy
Actively seeks out and accesses a broad set of dark venues
Utilizes internal block crossing opportunities
Avoids passive waiting for a match
Maximizing Exposure
Combines targeted strategy with maximum liquidity exposure
Allows traders to access high-quality venues
Minimizing Market Impact
Executes large trades anonymously in dark pools
Prevents wider public market from seeing large order size
Avoids significant price movements that occur on lit exchanges
5/20 EMA Crossover Strategy
Uses 5-period EMA crossing 20-period EMA to spot short-term trend changes
Bullish signal (buy) when 5 EMA crosses above 20 EMA
Bearish signal (sell) when 5 EMA crosses below 20 EMA
Ideal for trending, fast-moving markets
Requires confirmation (e.g., RSI, volume) to avoid false signals
How it Works
5 EMA (Fast): Reacts quickly to recent price changes
20 EMA (Slow): Provides a broader, smoother view of the short-term trend
Entry Signals
Bullish (Buy): 5 EMA crosses above 20 EMA
Bearish (Sell): 5 EMA crosses below 20 EMA
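The entry rules can be sketched in plain Python. This is a minimal illustration on a synthetic price series: the EMA is seeded with the first price (one common convention among several), and no confirmation filter is applied, so it should not be read as a complete strategy.

```python
from typing import List


def ema(prices: List[float], period: int) -> List[float]:
    """Exponential moving average with smoothing factor 2 / (period + 1)."""
    alpha = 2.0 / (period + 1)
    values = [prices[0]]  # seed with the first price
    for price in prices[1:]:
        values.append(alpha * price + (1 - alpha) * values[-1])
    return values


def crossover_signals(prices: List[float], fast: int = 5, slow: int = 20) -> List[str]:
    """Label each bar 'buy', 'sell', or '' based on fast/slow EMA crossings."""
    fast_ema, slow_ema = ema(prices, fast), ema(prices, slow)
    signals = [""]
    for i in range(1, len(prices)):
        was_above = fast_ema[i - 1] > slow_ema[i - 1]
        is_above = fast_ema[i] > slow_ema[i]
        if is_above and not was_above:
            signals.append("buy")    # fast EMA crossed above slow EMA
        elif was_above and not is_above:
            signals.append("sell")   # fast EMA crossed below slow EMA
        else:
            signals.append("")
    return signals


# Synthetic series: a decline, then a rally, then another decline.
prices = (
    [100 - i for i in range(15)]
    + [86 + 2 * i for i in range(15)]
    + [114 - 3 * i for i in range(15)]
)
signals = crossover_signals(prices)
```

On this series the rally produces a single buy signal and the final decline a single sell signal; real price data is far noisier and will generate many more crossings.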
Implementation Steps
Identify the Trend: Use a higher timeframe (daily/weekly) to see overall market direction
Plot EMAs: Add 5-period and 20-period EMAs to your chart
Wait for Crossover: Look for 5 EMA to cross 20 EMA on chosen timeframe
Confirm: Use other tools like RSI, volume, or candlestick patterns to validate the signal
Set Stops & Targets: Place stop-losses below swing lows (for longs) or above swing highs (for shorts); use risk/reward ratios (e.g., 1:2, 1:3) for profit-taking
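The risk/reward step reduces to one line of arithmetic: the target sits a multiple of the stop distance away from the entry, on the opposite side. The `take_profit` helper and the prices below are illustrative assumptions.

```python
def take_profit(entry: float, stop: float, reward_to_risk: float) -> float:
    """Target price that is `reward_to_risk` times as far from entry as the stop.

    Works for longs (stop below entry) and shorts (stop above entry).
    """
    risk = entry - stop  # positive for a long, negative for a short
    if risk == 0:
        raise ValueError("stop must differ from entry")
    return entry + reward_to_risk * risk


long_target = take_profit(entry=100.0, stop=98.0, reward_to_risk=2.0)    # 1:2
short_target = take_profit(entry=100.0, stop=101.5, reward_to_risk=3.0)  # 1:3
```

A long entered at 100 with a stop at 98 and a 1:2 ratio targets 104; a short entered at 100 with a stop at 101.5 and a 1:3 ratio targets 95.5.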
Best For
Momentum Trading: Capturing quick, sharp moves
Trending Markets: Works best in markets with clear direction
Active Traders: Suited for intraday or swing trading on shorter timeframes (5min, 15min, 1hr)
Key Considerations
False Signals: The fast-reacting 5 EMA generates many signals, and a large share of them are false (whipsaws)
Confirmation is Key: Never rely on the crossover alone; always seek confirmation from other indicators
Amazon Time Sync Service:
Improved in late 2023 to achieve sub-100 microsecond (often sub-50 microsecond) time accuracy using PTP and the PTP hardware clock (PHC).
Supported by a dedicated timing network and GPS-disciplined clocks in every Availability Zone.
Hardware Packet Timestamping:
Introduced in June 2025, appending a 64-bit, nanosecond-precision timestamp to every inbound network packet at the hardware level.
Leverages AWS Nitro System’s reference clock and bypasses software-induced delays.
Provides nanosecond visibility on packet arrival at Nitro NIC
Enhanced Execution and Monitoring Capabilities:
- Improves execution and monitoring capabilities for digital asset exchanges and HFT MMs on AWS.
Greater Fairness and Accurate Measurement:
Allows CEXs to implement greater fairness.
Enables HFTs to accurately measure round-trip and one-way latencies.
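Once timestamps are available, the latency calculations themselves are simple. The sketch below uses made-up nanosecond values and does not call any AWS API; how the hardware timestamp is read from a packet is outside its scope, and the helper names are hypothetical.

```python
from statistics import median
from typing import List, Tuple


def one_way_latencies_ns(samples: List[Tuple[int, int]]) -> List[int]:
    """Each sample is (sender_tx_ns, receiver_rx_ns) on synchronized clocks."""
    return [rx - tx for tx, rx in samples]


def round_trip_ns(request_tx_ns: int, response_rx_ns: int) -> int:
    """Both timestamps come from the same host, so no clock sync is needed."""
    return response_rx_ns - request_tx_ns


# Made-up nanosecond timestamps for illustration.
samples = [(1_000_000, 1_021_000), (2_000_000, 2_019_500), (3_000_000, 3_035_000)]
latencies = one_way_latencies_ns(samples)
p50 = median(latencies)
jitter = max(latencies) - min(latencies)  # simple peak-to-peak spread
rtt = round_trip_ns(request_tx_ns=5_000_000, response_rx_ns=5_048_000)
```

One-way figures are only meaningful when sender and receiver clocks agree closely, which is why precise time synchronization and hardware timestamps matter together.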