Victorjia
How I Built a Self-Optimizing Arbitrage Engine with Python and Claude AI

I recently open-sourced my global arbitrage engine — a system that scans price gaps across international markets, generates content, and optimizes itself every 6 hours. Here's how it works under the hood.

GitHub: github.com/victorjzq/global-arbitrage-api


The Problem

A kids' coding robot costs ¥45 ($6) wholesale on 1688 (China's Alibaba) but sells for $22 on Shopee Vietnam. That's roughly a 3.5x markup. Why does this gap exist?

Three barriers: language (1688 is Chinese-only), payment (requires Alipay), and discovery (Vietnamese sellers can't search Chinese platforms). AI eliminates all three.

I wanted a system that finds these gaps automatically — and gets smarter over time.

Architecture: 12 Engines in a Perpetual Loop

Scanners (3)          Engine              Output
┌──────────────┐     ┌──────────┐     ┌──────────────┐
│ Trend Gap    │────▶│ Perpetual│────▶│ API (24/7)   │
│ Price Scan   │     │ Engine   │     │ Reports      │
│ Polymarket   │     │(6h cycle)│     │ Content (5x) │
└──────────────┘     └────┬─────┘     │ Telegram Bot │
                          │           └──────────────┘
                    ┌─────▼─────┐
                    │ Evolution │  ← self-optimization
                    │ Loop      │    from own data
                    └───────────┘

The perpetual engine orchestrates everything. Every 6 hours it runs a full cycle:

  1. Scan — 3 scanners find opportunities across 26 categories and 9 trade routes
  2. Rank — Score each opportunity by ROI × confidence × urgency
  3. Generate — Turn findings into content for 5 platforms
  4. Evolve — Analyze results and adjust weights for next cycle

Here's the actual orchestrator code:

# perpetual_engine.py — the heartbeat
from datetime import datetime
from pathlib import Path

SRC = Path(__file__).parent  # engine scripts live alongside the orchestrator

def main():
    cycle = {"start": datetime.now().isoformat(), "scanners": {}}

    # Phase 1: Scan for opportunities
    scanners = [
        ("Trend Gap Scanner", SRC / "trend_gap_scanner.py"),
        ("Price Scanner",     SRC / "daily_scan.py"),
        ("Polymarket",        SRC / "prediction-markets/polymarket_scanner.py"),
    ]
    for name, path in scanners:
        ok, output = run_script(name, path)
        cycle["scanners"][name] = {"success": ok}

    # Phase 2: Generate multi-platform content
    run_script("Content Engine", SRC / "content_engine.py")

    # Phase 3: Self-optimize
    run_script("Evolution Loop", SRC / "evolution_loop.py")

    # Phase 4: Record metrics for next evolution
    record_metrics(cycle)
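The orchestrator leans on a `run_script` helper that isn't shown above. A minimal version might look like this — my sketch, not necessarily the repo's exact implementation — shelling out to each engine script and capturing its output:

```python
import subprocess
import sys

def run_script(name, path, timeout=600):
    """Run one engine script as a subprocess; return (success, stdout)."""
    try:
        result = subprocess.run(
            [sys.executable, str(path)],
            capture_output=True, text=True, timeout=timeout,
        )
        return result.returncode == 0, result.stdout
    except subprocess.TimeoutExpired:
        return False, f"{name} timed out after {timeout}s"
```

Running each engine in its own process means one crashed scanner can't take down the whole cycle — the orchestrator just records the failure and moves on.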

The Self-Optimization Loop (The Interesting Part)

Most automation scripts do the same thing every time. This system learns from its own output.

The evolution engine analyzes historical scan data to find patterns:

# evolution_loop.py — the system improves itself
import json
from collections import Counter, defaultdict
from pathlib import Path

# Illustrative path — the scanners read this same file on their next cycle
weights_file = Path("data") / "scanning_weights.json"

def analyze_opportunities():
    patterns = {
        "high_markup_products": [],        # Which products have 3x+ margins?
        "trending_categories": [],         # What's growing fastest?
        "best_source_markets": Counter(),  # Which source markets are most profitable?
        "best_target_markets": Counter(),  # Which target markets convert best?
        "price_ranges": defaultdict(list), # What price range has best ROI?
    }
    # ... analyzes all historical data files
    return patterns

def generate_optimization_recommendations(patterns, content_stats):
    recs = []
    # Focus on highest-markup products
    if patterns["high_markup_products"]:
        top = sorted(patterns["high_markup_products"],
                     key=lambda x: x.get("markup", 0), reverse=True)[:3]
        recs.append({
            "type": "focus_products",
            "action": "Increase scan frequency",
            "targets": [t["product"] for t in top],
        })
    return recs

def update_scanning_weights(recommendations):
    weights = {
        "product_priority": [],
        "market_priority": ["VN", "TH", "ID", "PH"],
        "scan_frequency": {"trend_gap": "6h", "price_comparison": "6h"},
    }
    for rec in recommendations:
        if rec["type"] == "focus_products":
            weights["product_priority"] = rec["targets"]
    # Saved to disk — next scan cycle picks this up automatically
    with open(weights_file, "w") as f:
        json.dump(weights, f, indent=2)

Every cycle, the system:

  1. Reads its own past results
  2. Identifies what worked (high margins, trending categories)
  3. Writes new weights to disk
  4. Next scan cycle reads those weights and focuses accordingly

The result: after a few dozen cycles, the system naturally converges on the most profitable product categories and markets without any manual tuning.
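On the consumer side, each scanner only needs to load the weights file at startup and reorder its work. A sketch of that half of the handshake, assuming the JSON shape written by `update_scanning_weights` above (the function names here are illustrative):

```python
import json
from pathlib import Path

DEFAULT_WEIGHTS = {
    "product_priority": [],
    "market_priority": ["VN", "TH", "ID", "PH"],
    "scan_frequency": {"trend_gap": "6h", "price_comparison": "6h"},
}

def load_weights(weights_file):
    """Read the weights written by the evolution loop; fall back to defaults."""
    path = Path(weights_file)
    if not path.exists():
        return DEFAULT_WEIGHTS
    with open(path) as f:
        return json.load(f)

def prioritized_keywords(all_keywords, weights):
    """Scan priority products first, then everything else in original order."""
    priority = [k for k in weights["product_priority"] if k in all_keywords]
    rest = [k for k in all_keywords if k not in priority]
    return priority + rest
```

Because the coupling is a single JSON file on disk, the evolution loop and the scanners never need to run in the same process — or even at the same time.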

Opportunity Scoring

Not all price gaps are worth acting on. The ranker scores every opportunity:

# opportunity_ranker.py
# score = ROI_potential * confidence * urgency
# (parse_markup and normalize are small helpers defined elsewhere in the module)
import hashlib

def estimate_roi(opp):
    markup = parse_markup(opp.get("markup", 1))
    roi_raw = markup - 1  # 3x markup = 200% ROI
    return normalize(roi_raw, 0, 3) * 10  # scale to 0-10

def dedup_key(opp):
    # Hash-based deduplication across scanners
    raw = opp.get("keyword_cn", "") + opp.get("keyword_vn", "")
    return hashlib.md5(raw.encode()).hexdigest()[:12]

This prevents the same opportunity from showing up from multiple scanners and surfaces only the highest-value leads.
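Putting the pieces together, the ranking step is score = ROI × confidence × urgency over the deduplicated set. Here's a self-contained sketch of that pipeline — the confidence and urgency fields and their defaults are my assumptions; the repo's ranker may weigh them differently:

```python
import hashlib

def parse_markup(raw):
    """Accept '3.5x', 3.5, or '3.5' and return a float multiplier."""
    if isinstance(raw, str):
        raw = raw.rstrip("xX")
    try:
        return float(raw)
    except (TypeError, ValueError):
        return 1.0

def score(opp):
    roi = min(max(parse_markup(opp.get("markup", 1)) - 1, 0), 3) / 3 * 10  # 0-10
    confidence = opp.get("confidence", 0.5)  # 0-1, scanner-reported
    urgency = opp.get("urgency", 0.5)        # 0-1, e.g. trend velocity
    return roi * confidence * urgency

def rank(opportunities):
    """Dedup across scanners by keyword hash, then sort by descending score."""
    seen, unique = set(), []
    for opp in opportunities:
        raw = opp.get("keyword_cn", "") + opp.get("keyword_vn", "")
        key = hashlib.md5(raw.encode()).hexdigest()[:12]
        if key not in seen:
            seen.add(key)
            unique.append(opp)
    return sorted(unique, key=score, reverse=True)
```

The multiplicative form matters: an opportunity with a huge markup but near-zero confidence scores close to zero, instead of averaging its way into the top of the list.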

The 12 Engines

| Engine | What it does |
|---|---|
| trend_gap_scanner | Finds products hot in China but not yet in SEA |
| daily_scan | Price comparison across platforms |
| polymarket_scanner | Prediction market arbitrage (2000+ markets) |
| arbitrage_api | REST API — 26 categories, 9 trade routes |
| content_engine | 1 data point → Twitter + LinkedIn + Reddit + Email + Video |
| opportunity_ranker | Scores by ROI × confidence × urgency |
| evolution_loop | Self-optimization from own output |
| perpetual_engine | Orchestrator — runs every 6h |
| publish_report | Multi-channel distribution |
| telegram_bot | Subscription alerts |
| system_status | One-command dashboard |
| md_to_html | Reports → sellable HTML/PDF |

Fork It and Make Money

git clone https://github.com/victorjzq/global-arbitrage-api.git
cd global-arbitrage-api
pip3 install requests pytrends

# Run a full cycle: scan → rank → content → evolve
python3 src/perpetual_engine.py

# Start the API
python3 src/api_server.py
# → http://localhost:8899/api/top

# Deploy 24/7
bash start.sh
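Once the API is up, any client can consume `/api/top`. A hedged example — I'm assuming the endpoint returns a JSON list of opportunity dicts with a `markup` field; check the actual response shape in `arbitrage_api`:

```python
import json
from urllib.request import urlopen

def fetch_top(url="http://localhost:8899/api/top"):
    """Fetch ranked opportunities from the running API server."""
    with urlopen(url) as resp:
        return json.load(resp)

def high_markup(opportunities, min_markup=3.0):
    """Filter to opportunities at or above a markup threshold."""
    out = []
    for opp in opportunities:
        raw = str(opp.get("markup", "0")).rstrip("xX")
        try:
            if float(raw) >= min_markup:
                out.append(opp)
        except ValueError:
            continue
    return out

# usage (requires the API server to be running):
# print(high_markup(fetch_top()))
```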

The system is designed to be extended:

  • Add new markets: Africa, Latin America, Middle East
  • Add new scanners: new platforms, new data sources
  • Improve the evolution algorithm: add ML models, A/B test strategies
  • Add monetization: the API is ready for RapidAPI, reports for Gumroad

Real Results

Here's what the system found in its latest scan (March 2026):

| Product | China (1688) | Vietnam (Shopee) | Markup |
|---|---|---|---|
| Kids STEM Robot | ¥45 ($6) | 550k VND ($22) | 3.5x |
| Pet GPS Tracker | ¥38 ($5) | 450k VND ($18) | 3.4x |
| Foldable Keyboard | ¥28 ($4) | 320k VND ($13) | 3.3x |
| Solar WiFi Camera | ¥75 ($10) | 850k VND ($34) | 3.2x |

These aren't theoretical — the scanner finds new opportunities every 6 hours and ranks them by actionability.
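The markups above are just cross-currency unit math. A quick sanity check — the FX rates here are illustrative (roughly ¥7.2 and 25,000 VND per USD); the live system would use current rates:

```python
def markup(cny_price, vnd_price, cny_per_usd=7.2, vnd_per_usd=25_000):
    """Selling price over sourcing cost, both converted to USD."""
    cost_usd = cny_price / cny_per_usd
    sell_usd = vnd_price / vnd_per_usd
    return sell_usd / cost_usd

print(round(markup(45, 550_000), 1))  # Kids STEM Robot: ¥45 vs 550k VND → 3.5
```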

Lessons Learned

  1. Self-optimization beats manual tuning. Let the system tell you what's working.
  2. Deduplication is critical. Multiple scanners will find the same opportunity — hash-based dedup prevents noise.
  3. The perpetual loop pattern is reusable. Scan → Score → Act → Learn works for any data pipeline, not just arbitrage.
  4. Start with the orchestrator. Build perpetual_engine.py first, then plug in scanners one at a time.
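That last pattern generalizes beyond arbitrage. Stripped of the specifics, the whole system reduces to a loop like this — a sketch of the pattern, not code from the repo:

```python
import time

def perpetual_loop(scan, score, act, learn,
                   state=None, interval_s=6 * 3600, max_cycles=None):
    """Generic Scan → Score → Act → Learn loop; state carries learned weights."""
    state = state or {}
    cycle = 0
    while max_cycles is None or cycle < max_cycles:
        findings = scan(state)           # gather raw opportunities
        ranked = score(findings, state)  # prioritize them
        results = act(ranked, state)     # publish / trade / alert
        state = learn(results, state)    # update weights for the next cycle
        cycle += 1
        if max_cycles is None or cycle < max_cycles:
            time.sleep(interval_s)
    return state
```

Swap the four callables and the same skeleton drives a news monitor, an SEO tracker, or a price-alert bot.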

The full code is MIT licensed. Fork it, extend it, build your own arbitrage engine.

GitHub: github.com/victorjzq/global-arbitrage-api

Newsletter (free): victorjia.substack.com

If you have questions about the architecture or want to contribute, drop a comment or open an issue on GitHub.
