re:Invent recap
David Sung
West Cheung
Digital Asset Market Makers Using AWS Macro Services for Agentic News Analysis
Introduction
- Disclaimer: Views expressed are personal and not official AWS or affiliated organization positions. Informational purpose only, not financial or professional advice.
Trading Ecosystem Overview
Participants: Buyers and sellers place orders on exchanges.
Market Liquidity: A market may lack sufficient liquidity, so an order may not find a counterparty at a reasonable price.
Role of Market Makers: Ensure continuous buy and sell orders, providing liquidity and counterparties.
Order Book Abstraction
Bid Orders (Left): Buyers' orders.
Ask Orders (Right): Sellers' orders.
Y-Axis: Cumulative size at each price level.
Price Levels: Higher prices closer to the right-hand side.
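The depth view described above can be sketched in a few lines. This is a minimal illustration, not the speakers' code: the function name `cumulative_depth` and the snapshot values are made up, and the only assumption taken from the notes is that the Y-axis is the cumulative size at each price level, accumulated outward from the best price on each side.

```python
from itertools import accumulate

def cumulative_depth(levels, side):
    """Return (price, cumulative_size) pairs, accumulating outward from the best price.

    levels: list of (price, size) tuples for one side of the book.
    side:   "bid" (best price is the highest) or "ask" (best price is the lowest).
    """
    if side not in ("bid", "ask"):
        raise ValueError("side must be 'bid' or 'ask'")
    # Bids are walked from the highest price down, asks from the lowest price up.
    ordered = sorted(levels, key=lambda lv: lv[0], reverse=(side == "bid"))
    running = accumulate(size for _, size in ordered)
    return [(price, total) for (price, _), total in zip(ordered, running)]

# Hypothetical order book snapshot (prices and sizes are invented).
bids = [(99.0, 5.0), (100.0, 2.0), (98.0, 10.0)]
asks = [(101.0, 3.0), (103.0, 8.0), (102.0, 4.0)]

bid_depth = cumulative_depth(bids, "bid")
ask_depth = cumulative_depth(asks, "ask")
spread = ask_depth[0][0] - bid_depth[0][0]  # best ask minus best bid
```

Plotting `bid_depth` to the left and `ask_depth` to the right of the mid price gives the familiar depth chart.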
Market Maker's Daily Work
Place multiple bid and ask orders.
Risk Management: News or announcements can cause significant market movements.
Inventory Management: Critical to avoid losses from sudden market shifts.
Ideal Market Conditions for Market Makers
Controlled Volatility: Small market swings.
Quick Execution: Bid and ask orders executed rapidly.
High Turnover Rate: Despite small profits per trade, high frequency of trades can lead to significant overall profit.
Market Maker Strategies and Volatility Factors
Spread and Profit Trade-Off
Increasing Spread: Market makers may consider widening the spread for higher profit per trade.
Trade-Off: Higher spreads may result in fewer executions, reducing overall profit.
High Frequency Trading: Market makers often opt for frequent trades with smaller profits to maximize turnover.
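A toy calculation can make the trade-off concrete. Everything here is an assumption for illustration only: the exponential fill model, the `base_fills` and `decay` parameters, and the candidate spreads are invented and do not come from the talk. The point is only that profit per trade rises with the spread while the number of fills falls, so total profit peaks somewhere in between.

```python
import math

def expected_profit(spread, base_fills=1000.0, decay=2.0):
    """Toy model: fills decay exponentially as the spread widens.

    Both parameters are hypothetical; a real desk would estimate the
    fill rate from its own execution data.
    """
    fills = base_fills * math.exp(-decay * spread)
    return spread * fills

candidate_spreads = [0.1, 0.25, 0.5, 1.0, 2.0]
best_spread = max(candidate_spreads, key=expected_profit)

for s in candidate_spreads:
    print(f"spread={s:<5} expected profit={expected_profit(s):8.2f}")
```

Under this made-up model the widest spread earns the most per trade but the least overall, which is the trade-off the notes describe.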
Volatility and Spread Adjustment
Volatility Impact: Volatility is a key factor in pricing; market makers adjust spreads based on expected volatility.
Event-Driven Volatility: Examples include FOMC meetings on the federal funds rate.
Rapid Analysis and Action: Market makers must quickly analyze news and adjust orders to avoid losses.
Examples of Market-Moving Events
FOMC Meetings: Predictable events with known release times and a limited set of possible outcomes.
Social Media Influence: Unpredictable events like tweets from influential figures (e.g., Trump, Elon Musk) can cause immediate market reactions.
Dogecoin Example: Elon Musk's tweets about cryptocurrencies can lead to rapid price changes, illustrating the unpredictability and difficulty in interpreting social media-driven news.
Challenges in Digital Asset Market Sentiment Analysis
Influencer Impact in Digital Assets
Unpredictable Nature: Tweets from influencers like Elon Musk can rapidly impact cryptocurrency prices.
Interpretation Difficulty: Tweets are often subjective and require intelligent interpretation.
Speed of Analysis: Human judgment is too slow for real-time trading; automated solutions are necessary.
Evolution of Sentiment Analysis Approaches
- Dictionary Approach: Initial method using industry lexicons (e.g., bearish, bullish) and pattern matching.
Limitations:
- Limited vocabulary and poor context handling (e.g., misinterpreting "Massive Short liquidation event" as negative, even though a short liquidation forces short sellers to buy back and is typically bullish).
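The failure mode is easy to reproduce with a minimal dictionary scorer. This is a sketch under stated assumptions: the two word lists are a hypothetical mini-lexicon, not an actual industry lexicon, and `lexicon_sentiment` is an illustrative name.

```python
import re

# Hypothetical mini-lexicon; real industry lexicons are far larger.
BULLISH = {"bullish", "rally", "surge", "breakout"}
BEARISH = {"bearish", "crash", "liquidation", "short", "exploit"}

def lexicon_sentiment(text):
    """Count bullish and bearish words with no regard for context."""
    tokens = re.findall(r"[a-z]+", text.lower())
    score = sum(t in BULLISH for t in tokens) - sum(t in BEARISH for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

label = lexicon_sentiment("Massive Short liquidation event")
```

Here `label` comes out as `"negative"` because "short" and "liquidation" both sit in the bearish list, although the phrase as a whole describes short sellers being forced out.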
Statistical Matching and Machine Learning:
Transition to supervised learning with models like Naive Bayes or FinBERT.
Advantages: Better generalization and context understanding.
Challenges: Requires extensive labeled data, leading to high costs and longer time to market.
Need for Advanced Solutions
Real-Time Analysis: Necessity for rapid, automated sentiment analysis to keep up with market movements.
Context Awareness: Solutions must accurately interpret context to provide reliable sentiment analysis.
Advancements in Sentiment Analysis with Large Language Models (LLMs)
Transformer-Based Multimodal Reasoning
Modern LLMs like Llama, Claude, and DeepSeek enable context-aware sentiment analysis with minimal fine-tuning.
Capable of reasoning on specific news titles and domain-specific events (e.g., protocol exploits).
Inference Performance Optimization
- Critical for real-time sentiment analysis in dynamic markets with short time windows.
Journey of Optimization:
February 2025: Initial deployment using SageMaker JumpStart on P5en instances, achieving 80 output tokens per second.
April 2025: Switched to vLLM, enabling multi-token predictions, mixed precision, linear attention, and distributed parallelism, boosting performance to 140 output tokens per second.
August 2025: Replaced vLLM with SGLang, utilizing speculative decoding to achieve 180 output tokens per second.
Importance of Optimization
High Event Volume: With 10,000 events per minute, every millisecond of inference latency compounds.
More Than Doubling Processing Capacity: Raising throughput from 80 to 180 output tokens per second (2.25x) more than doubles the news processed within the same time window.
End-to-End Latency: Achieved under 10 seconds, avoiding adverse selection.
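The throughput figures translate into generation time as follows. This is a back-of-the-envelope sketch: the 360-token response length is a hypothetical value chosen for round numbers, and the calculation ignores prompt processing, queuing, and network time, all of which count toward end-to-end latency.

```python
def generation_seconds(output_tokens, tokens_per_second):
    """Time to generate a response at a given decode throughput."""
    return output_tokens / tokens_per_second

# Output tokens per second at each stage, as reported in the notes.
THROUGHPUT = {"February 2025": 80, "April 2025": 140, "August 2025": 180}

RESPONSE_TOKENS = 360  # hypothetical length of one analysis response

times = {stage: generation_seconds(RESPONSE_TOKENS, tps)
         for stage, tps in THROUGHPUT.items()}
speedup = THROUGHPUT["August 2025"] / THROUGHPUT["February 2025"]

for stage, seconds in times.items():
    print(f"{stage}: {seconds:.2f} s per response")
print(f"overall speedup: {speedup:.2f}x")
```

For the assumed response length, generation time drops from 4.5 seconds to 2 seconds, a 2.25x improvement.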
Implementation Overview
- Day One Architecture: Initial setup and components used for deploying the optimized sentiment analysis solution.
Day One Architecture and Duplication Handling in Sentiment Analysis
Day One Architecture
News Ingestion: News streaming API feeds data into an S3 bucket, triggering a Lambda function for classification based on asset, urgency, and sentiment.
Metadata Tagging: The Lambda function invokes a DeepSeek model to tag metadata, which is stored in databases like Aurora PostgreSQL and OpenSearch.
User Interaction: An Amazon Q CLI terminal allows traders and analysts to query news sources using LLMs like Claude for specific information (e.g., latest news on Trump announcements or Elon Musk's tweets).
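The ingestion step might look roughly like the handler below. This is a sketch, not the presenters' implementation: it assumes the bucket is an Amazon S3 bucket that sends standard S3 event notifications to Lambda, and `classify_stub`, its fixed return values, and the sample bucket and key are all hypothetical placeholders for the real model call and data.

```python
from urllib.parse import unquote_plus

def classify_stub(bucket, key):
    """Placeholder for the model call; returns fixed, made-up tags."""
    return {"asset": "UNKNOWN", "urgency": "low", "sentiment": "neutral"}

def handler(event, context, classify=classify_stub):
    """Lambda entry point: tag every object named in an S3 event notification."""
    results = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        tags = classify(bucket, key)
        results.append({"bucket": bucket, "key": key, **tags})
    return results

# Hypothetical event, trimmed to the fields the handler reads.
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "news-ingest"},
                "object": {"key": "feeds/2025/item+1.json"}}}
    ]
}
out = handler(sample_event, None)
```

Passing the classifier in as a parameter keeps the handler testable without any model endpoint; in a deployment it would be replaced by a call to the hosted model, and the tags would be written to the metadata stores.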
Duplication Challenge in Crypto News
Issue: Same news reported multiple times across various platforms (Twitter, Reddit, Discord, Telegram) within minutes.
Cost and Latency: Processing every duplicate through expensive LLMs increases costs and latency.
Deduplication Pipeline
Step 1: Embeddings
- Calculate embeddings in Lambda using embedding models like BGE-M3.
Step 2: Similarity Check:
If similarity > 0.75: Mark as duplicate and insert into duplicated collections in OpenSearch to avoid processing.
If similarity < 0.5: Likely unique; check against unique collections in OpenSearch for historical news.
Step 3: Analysis:
- If truly unique, produce near-real-time predictions: spread-widening recommendations, asset impact assessment, and price movement probability.
Step 4: Generate Prediction Report:
- Send a report to the trader desk Slack channel so a human can decide on the action.
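The similarity check in Step 2 can be sketched with plain cosine similarity. The two thresholds come from the notes; everything else is an assumption for illustration: the vectors are toy three-dimensional stand-ins for real embeddings, the in-memory list replaces the OpenSearch collections, and the `"ambiguous"` label for scores between the thresholds is a placeholder, since the notes do not say how that band is handled.

```python
import math

DUPLICATE_THRESHOLD = 0.75  # above this: mark as duplicate
UNIQUE_THRESHOLD = 0.5      # below this: likely unique

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def route(new_vec, seen_vecs):
    """Compare a new item against previously seen items and pick a route."""
    best = max((cosine_similarity(new_vec, v) for v in seen_vecs), default=0.0)
    if best > DUPLICATE_THRESHOLD:
        return "duplicate", best
    if best < UNIQUE_THRESHOLD:
        return "likely_unique", best
    return "ambiguous", best  # band between the thresholds (handling assumed)

# Toy vectors standing in for embeddings of already-processed news.
seen = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

same_story = route([1.0, 0.0, 0.0], seen)
new_story = route([0.0, 0.0, 1.0], seen)
unclear = route([1.0, 1.0, 0.0], seen)
```

Only items routed as unique would go on to the LLM analysis in Step 3, which is where the cost and latency savings come from.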
Fine-Tuning Embeddings
Generic embedding models may not effectively handle crypto-specific duplicates.
Importance of Fine-Tuning: Illustrated by a scatter plot showing the performance of an out-of-the-box BGE-M3 on thousands of query-document pairs from crypto news.
Fine-Tuning Embeddings and Agentic Architectures
Initial Scatter Plot:
Green Dots: Duplicate news articles with high similarity scores (>0.75).
Red Dots: Non-duplicate news articles with low similarity scores (<0.5).
Muddy Middle (Orange): Overlap between 0.5 and 0.75 indicating ineffective separation.
Fine-Tuned BGE-M3:
After fine-tuning on thousands of labeled crypto news articles, clear separation between green and red dots.
Clean separation between similarity scores of 0.3 and 0.6, eliminating the muddy middle.
The fine-tuned embedding model is small, at about 560 million parameters.
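One way to quantify what the scatter plots show is to measure how many labeled pairs fall in the ambiguous band and whether a single threshold separates the two classes. This is a sketch with invented data: the `before` and `after` score lists are made up to mimic the described plots and are not results from the talk.

```python
def muddy_middle_fraction(scored_pairs, low=0.5, high=0.75):
    """Fraction of pairs whose similarity falls inside the ambiguous band."""
    if not scored_pairs:
        return 0.0
    in_band = sum(1 for score, _ in scored_pairs if low <= score <= high)
    return in_band / len(scored_pairs)

def separation_gap(scored_pairs):
    """Lowest duplicate score minus highest non-duplicate score.

    A positive gap means a single threshold separates the classes cleanly.
    """
    dup = [s for s, is_dup in scored_pairs if is_dup]
    non = [s for s, is_dup in scored_pairs if not is_dup]
    if not dup or not non:
        raise ValueError("need at least one pair of each class")
    return min(dup) - max(non)

# (similarity, is_duplicate) pairs; all values are invented for illustration.
before = [(0.82, True), (0.68, True), (0.55, True),
          (0.71, False), (0.60, False), (0.30, False)]
after = [(0.92, True), (0.81, True), (0.66, True),
         (0.28, False), (0.15, False), (0.05, False)]

print(muddy_middle_fraction(before), separation_gap(before))
print(muddy_middle_fraction(after, low=0.3, high=0.6), separation_gap(after))
```

In the invented "before" set the classes overlap (negative gap), while in the "after" set no pair scores between 0.3 and 0.6, so any threshold in that range separates them.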
Agentic Architectures and Prompt Engineering
- No need for fine-tuning LLMs; prompt engineering suffices.
Power of Agentic Architectures:
Hierarchical task decomposition using general reasoning models like Claude.
Specialized embedding layer for cost-effective and faster time-to-market results.
Bias elimination through architecture.
Teaching the system to reason about novel events.
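Hierarchical task decomposition with prompts alone could be organized along these lines. This is a rough sketch, not the architecture from the talk: the sub-task prompts, the `notify_traders` rule, and the canned `fake_model` are all hypothetical, and `call_model` stands in for whatever client sends a prompt to a general reasoning model and returns its text.

```python
# Hypothetical sub-task prompts; each asks one narrow question.
SUBTASKS = {
    "asset": "Which digital asset does this headline concern? Answer with a ticker or NONE.",
    "urgency": "Is this headline urgent for a market maker? Answer HIGH or LOW.",
    "sentiment": "Is the headline POSITIVE, NEGATIVE, or NEUTRAL for the asset?",
}

def analyze(headline, call_model):
    """Run each sub-task as its own prompt, then combine the answers."""
    answers = {
        name: call_model(f"{prompt}\n\nHeadline: {headline}").strip().upper()
        for name, prompt in SUBTASKS.items()
    }
    # Assumed escalation rule: only urgent, asset-specific news reaches humans.
    answers["notify_traders"] = (
        answers["asset"] != "NONE" and answers["urgency"] == "HIGH"
    )
    return answers

def fake_model(prompt):
    """Canned responses so the sketch runs without any model endpoint."""
    if prompt.startswith("Which digital asset"):
        return "DOGE"
    if prompt.startswith("Is this headline urgent"):
        return "high"
    return "positive"

result = analyze("Example headline about a meme coin", fake_model)
```

Keeping each question narrow means the behavior can be changed by editing a prompt rather than retraining a model, and the final decision still goes to a human trader.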
Quick Demo
Trader Desk: Right-hand side shows news channels.
Left-hand side: Streaming news analyzed by the pipeline.
Examples:
SEC filings: Analyzed but not sent to traders as not market-moving.
Routine news: Classified as non-impactful.
Impactful news: E.g., Trump's tweet about China becoming hostile, identified as impactful and sent to traders for decision on spread adjustment.
Key Takeaways
Use of agentic architectures over fine-tuning.
Hierarchical task decomposition with general reasoning models.
Specialized embedding layer for cost-effectiveness and faster results.
Human-in-the-loop with trader feedback loop for continuous system improvement.
24/7 real-time coverage with human oversight.