I recently open-sourced my global arbitrage engine — a system that scans price gaps across international markets, generates content, and optimizes itself every 6 hours. Here's how it works under the hood.
GitHub: github.com/victorjzq/global-arbitrage-api
The Problem
A kids' coding robot costs ¥45 ($6) wholesale on 1688 (China's Alibaba) but sells for $22 on Shopee Vietnam. That's a 3.5x markup. Why does this gap exist?
Three barriers: language (1688 is Chinese-only), payment (requires Alipay), and discovery (Vietnamese sellers can't search Chinese platforms). AI eliminates all three.
I wanted a system that finds these gaps automatically — and gets smarter over time.
Architecture: 12 Engines in a Perpetual Loop
```
Scanners (3)            Engine               Output
┌──────────────┐     ┌───────────┐     ┌──────────────┐
│ Trend Gap    │────▶│ Perpetual │────▶│ API (24/7)   │
│ Price Scan   │     │ Engine    │     │ Reports      │
│ Polymarket   │     │ (6h cycle)│     │ Content (5x) │
└──────────────┘     └─────┬─────┘     │ Telegram Bot │
                           │           └──────────────┘
                     ┌─────▼─────┐
                     │ Evolution │  ← self-optimization
                     │ Loop      │    from own data
                     └───────────┘
```
The perpetual engine orchestrates everything. Every 6 hours it runs a full cycle:
- Scan — 3 scanners find opportunities across 26 categories and 9 trade routes
- Rank — score each opportunity by ROI × confidence × urgency
- Generate — turn findings into content for 5 platforms
- Evolve — analyze results and adjust weights for the next cycle
Here's the actual orchestrator code:
```python
# perpetual_engine.py — the heartbeat
from datetime import datetime

def main():
    cycle = {"start": datetime.now().isoformat(), "scanners": {}}

    # Phase 1: Scan for opportunities
    scanners = [
        ("Trend Gap Scanner", SRC / "trend_gap_scanner.py"),
        ("Price Scanner", SRC / "daily_scan.py"),
        ("Polymarket", SRC / "prediction-markets/polymarket_scanner.py"),
    ]
    for name, path in scanners:
        ok, output = run_script(name, path)
        cycle["scanners"][name] = {"success": ok}

    # Phase 2: Generate multi-platform content
    run_script("Content Engine", SRC / "content_engine.py")

    # Phase 3: Self-optimize
    run_script("Evolution Loop", SRC / "evolution_loop.py")

    # Phase 4: Record metrics for next evolution
    record_metrics(cycle)
```
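The `run_script` helper isn't shown in the excerpt above. A minimal sketch of what it might look like, assuming each engine is a standalone script launched as a subprocess (the signature and timeout are my assumptions, not the repo's actual implementation):

```python
# Hypothetical run_script helper — the real repo's version may differ.
import subprocess
import sys

def run_script(name, path, timeout=600):
    """Run one engine script in a subprocess; return (success, output)."""
    try:
        result = subprocess.run(
            [sys.executable, str(path)],
            capture_output=True, text=True, timeout=timeout,
        )
        ok = result.returncode == 0
        return ok, result.stdout if ok else result.stderr
    except subprocess.TimeoutExpired:
        return False, f"{name} timed out after {timeout}s"
```

Running engines as subprocesses keeps them isolated: one crashing scanner marks its cycle entry as failed without taking down the orchestrator.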
The Self-Optimization Loop (The Interesting Part)
Most automation scripts do the same thing every time. This system learns from its own output.
The evolution engine analyzes historical scan data to find patterns:
```python
# evolution_loop.py — the system improves itself
from collections import Counter, defaultdict
import json

def analyze_opportunities():
    patterns = {
        "high_markup_products": [],        # Which products have 3x+ margins?
        "trending_categories": [],         # What's growing fastest?
        "best_source_markets": Counter(),  # Which source markets are most profitable?
        "best_target_markets": Counter(),  # Which target markets convert best?
        "price_ranges": defaultdict(list), # What price range has best ROI?
    }
    # ... analyzes all historical data files
    return patterns

def generate_optimization_recommendations(patterns, content_stats):
    recs = []
    # Focus on highest-markup products
    if patterns["high_markup_products"]:
        top = sorted(patterns["high_markup_products"],
                     key=lambda x: x.get("markup", 0), reverse=True)[:3]
        recs.append({
            "type": "focus_products",
            "action": "Increase scan frequency",
            "targets": [t["product"] for t in top],
        })
    return recs

def update_scanning_weights(recommendations):
    weights = {
        "product_priority": [],
        "market_priority": ["VN", "TH", "ID", "PH"],
        "scan_frequency": {"trend_gap": "6h", "price_comparison": "6h"},
    }
    for rec in recommendations:
        if rec["type"] == "focus_products":
            weights["product_priority"] = rec["targets"]
    # Saved to disk — next scan cycle picks this up automatically
    with open(weights_file, "w") as f:
        json.dump(weights, f, indent=2)
```
Every cycle, the system:
- Reads its own past results
- Identifies what worked (high margins, trending categories)
- Writes new weights to disk
- Next scan cycle reads those weights and focuses accordingly
The result: after a few dozen cycles, the system naturally converges on the most profitable product categories and markets without any manual tuning.
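The read side isn't shown in the excerpt. A scanner might load the weights with a fallback to defaults, so the very first cycle works before any evolution has run. A sketch, with the file name assumed and the schema taken from the write side above:

```python
import json
from pathlib import Path

# Defaults mirror the schema written by update_scanning_weights.
DEFAULT_WEIGHTS = {
    "product_priority": [],
    "market_priority": ["VN", "TH", "ID", "PH"],
    "scan_frequency": {"trend_gap": "6h", "price_comparison": "6h"},
}

def load_weights(weights_file="data/scanning_weights.json"):
    """Load weights written by the evolution loop, or fall back to defaults."""
    path = Path(weights_file)
    if not path.exists():
        return dict(DEFAULT_WEIGHTS)
    with open(path) as f:
        return json.load(f)
```

Because the handoff is a plain JSON file on disk, the scanners and the evolution loop stay fully decoupled: either side can be restarted or replaced without the other noticing.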
Opportunity Scoring
Not all price gaps are worth acting on. The ranker scores every opportunity:
```python
# opportunity_ranker.py
# score = ROI_potential * confidence * urgency
import hashlib

def estimate_roi(opp):
    markup = parse_markup(opp.get('markup', 1))
    roi_raw = markup - 1  # 3x markup = 200% ROI
    return normalize(roi_raw, 0, 3) * 10  # Scale to 0-10

def dedup_key(opp):
    # Hash-based deduplication across scanners
    raw = opp.get('keyword_cn', '') + opp.get('keyword_vn', '')
    return hashlib.md5(raw.encode()).hexdigest()[:12]
```
This prevents the same opportunity from showing up from multiple scanners and surfaces only the highest-value leads.
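The helpers `parse_markup` and `normalize`, and how the confidence and urgency terms enter the product, aren't shown in the excerpt. One plausible way to fill them in (all helper implementations here are sketches, and the 0-1 confidence/urgency fields are assumptions):

```python
def parse_markup(markup):
    """Accept 3.5, "3.5", or "3.5x" and return a float multiplier."""
    if isinstance(markup, str):
        markup = markup.rstrip("xX")
    return float(markup)

def normalize(value, lo, hi):
    """Clamp value into [lo, hi], then map linearly to 0..1."""
    value = max(lo, min(hi, value))
    return (value - lo) / (hi - lo)

def estimate_roi(opp):
    markup = parse_markup(opp.get("markup", 1))
    roi_raw = markup - 1                  # 3x markup = 200% ROI
    return normalize(roi_raw, 0, 3) * 10  # scale to 0-10

def score(opp):
    # confidence and urgency assumed to be 0-1 floats on the opportunity dict
    return estimate_roi(opp) * opp.get("confidence", 0.5) * opp.get("urgency", 0.5)
```

Clamping in `normalize` matters: a freak 10x markup would otherwise dominate the ranking on ROI alone, regardless of how thin the confidence is.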
The 12 Engines
| Engine | What it does |
|---|---|
| `trend_gap_scanner` | Finds products hot in China but not yet in SEA |
| `daily_scan` | Price comparison across platforms |
| `polymarket_scanner` | Prediction market arbitrage (2000+ markets) |
| `arbitrage_api` | REST API — 26 categories, 9 trade routes |
| `content_engine` | 1 data point → Twitter + LinkedIn + Reddit + Email + Video |
| `opportunity_ranker` | Scores by ROI × confidence × urgency |
| `evolution_loop` | Self-optimization from its own output |
| `perpetual_engine` | Orchestrator — runs every 6h |
| `publish_report` | Multi-channel distribution |
| `telegram_bot` | Subscription alerts |
| `system_status` | One-command dashboard |
| `md_to_html` | Reports → sellable HTML/PDF |
Fork It and Make Money
```bash
git clone https://github.com/victorjzq/global-arbitrage-api.git
cd global-arbitrage-api
pip3 install requests pytrends

# Run a full cycle: scan → rank → content → evolve
python3 src/perpetual_engine.py

# Start the API
python3 src/api_server.py
# → http://localhost:8899/api/top

# Deploy 24/7
bash start.sh
```
The system is designed to be extended:
- Add new markets: Africa, Latin America, Middle East
- Add new scanners: new platforms, new data sources
- Improve the evolution algorithm: add ML models, A/B test strategies
- Add monetization: the API is ready for RapidAPI, reports for Gumroad
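Adding a scanner mostly means following the `(name, path)` convention the orchestrator already uses: a standalone script that writes its findings somewhere the ranker can read. A skeleton under those assumptions (the output path and the opportunity schema here are hypothetical, not the repo's actual contract):

```python
# my_custom_scanner.py — skeleton for a new data source (hypothetical schema)
import json
from datetime import datetime

def scan():
    """Return a list of opportunity dicts; replace with a real data source."""
    return [{
        "product": "example-widget",
        "source_market": "CN",
        "target_market": "VN",
        "markup": "3.1x",
        "found_at": datetime.now().isoformat(),
    }]

def main(out_path="data/my_custom_scanner.json"):
    opportunities = scan()
    with open(out_path, "w") as f:
        json.dump(opportunities, f, indent=2)
    print(f"Found {len(opportunities)} opportunities")

if __name__ == "__main__":
    main()
```

Registering it would then be one more tuple in the orchestrator's `scanners` list.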
Real Results
Here's what the system found in its latest scan (March 2026):
| Product | China (1688) | Vietnam (Shopee) | Markup |
|---|---|---|---|
| Kids STEM Robot | ¥45 ($6) | 550k VND ($22) | 3.5x |
| Pet GPS Tracker | ¥38 ($5) | 450k VND ($18) | 3.4x |
| Foldable Keyboard | ¥28 ($4) | 320k VND ($13) | 3.3x |
| Solar WiFi Camera | ¥75 ($10) | 850k VND ($34) | 3.2x |
These aren't theoretical — the scanner finds new opportunities every 6 hours and ranks them by actionability.
Lessons Learned
- Self-optimization beats manual tuning. Let the system tell you what's working.
- Deduplication is critical. Multiple scanners will find the same opportunity — hash-based dedup prevents noise.
- The perpetual loop pattern is reusable. Scan → Score → Act → Learn works for any data pipeline, not just arbitrage.
- Start with the orchestrator. Build `perpetual_engine.py` first, then plug in scanners one at a time.
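The reusable Scan → Score → Act → Learn shape from the lessons above boils down to a very small loop. A generic sketch, independent of arbitrage (function names and the `state` threading are my framing, not code from the repo):

```python
def perpetual_loop(scan, score, act, learn, state, cycles=1):
    """Generic Scan → Score → Act → Learn loop; `state` carries learned weights."""
    for _ in range(cycles):
        items = scan(state)                           # find candidates using current state
        ranked = sorted(items, key=score, reverse=True)  # best opportunities first
        act(ranked)                                   # publish / alert / trade
        state = learn(state, ranked)                  # fold results back into state
    return state
```

Swap in a log scraper for `scan` and an alerting hook for `act` and the same loop runs a monitoring pipeline instead of an arbitrage one.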
The full code is MIT licensed. Fork it, extend it, build your own arbitrage engine.
GitHub: github.com/victorjzq/global-arbitrage-api
Newsletter (free): victorjia.substack.com
If you have questions about the architecture or want to contribute, drop a comment or open an issue on GitHub.