DEV Community

jinjihuang88-ui

How I built an LLM-powered B2B matching engine for China's small commodity export market

I've been building in the Canada-China B2B trade space for a while, and the biggest friction I kept running into was this: global buyers don't know how to find the right Chinese supplier, and Chinese suppliers have no efficient way to reach international buyers.

The traditional approach — Alibaba, trade shows, cold email — is slow, expensive, and heavily relationship-dependent. I wanted to fix this with LLMs.

The Problem

China's small commodity export market (小商品出海) is massive. Yiwu alone processes over $70B in annual wholesale trade. Yet a Canadian retailer trying to source bamboo kitchenware, or an Australian importer looking for OEM pet toys, has no good way to describe what they want and get matched with the right factory.

Search engines return SEO spam. Alibaba is a catalogue you have to manually browse. Trade shows require flights to Guangzhou. The entire process assumes you already know who you're looking for.

The Approach: Intent-Based Semantic Matching

Instead of keyword search, I built an intent graph — a dual-sided store of buyer demands (DEMAND) and supplier capabilities (SUPPLY).

When a buyer submits a requirement like:

"Need 500 units bamboo cutting boards for Canadian retail, FSC certified, budget $8-12 USD"

the LLM parser extracts structured fields (product type, quantity, certifications, target market, and budget) and stores them as a DEMAND intent.

On the other side, supplier data (crawled + manually verified) is stored as SUPPLY intents with product categories, MOQ, certifications, and export experience.
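
To make the two sides concrete, here is a minimal sketch of what the intent records could look like, plus a helper that turns the JSON an LLM parser might return into a DEMAND intent. The field names (other than core_need, MOQ, and the fields listed above) and the `demand_from_llm_json` helper are illustrative assumptions, not the production schema.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DemandIntent:
    """Buyer side. Field names are illustrative assumptions."""
    core_need: str
    category: str
    quantity: int
    target_market: str
    budget_min_usd: Optional[float] = None
    budget_max_usd: Optional[float] = None
    certifications: list[str] = field(default_factory=list)


@dataclass
class SupplyIntent:
    """Supplier side. Field names are illustrative assumptions."""
    core_need: str
    category: str
    moq: int
    price_min_usd: float
    price_max_usd: float
    certifications: list[str] = field(default_factory=list)
    export_markets: list[str] = field(default_factory=list)
    verified: bool = False


def demand_from_llm_json(raw: str) -> DemandIntent:
    """Turn the JSON text an LLM parser might return into a DemandIntent.

    Raises ValueError if the text is not valid JSON or a required field
    is missing, so malformed model output is rejected instead of stored.
    """
    data = json.loads(raw)
    required = ("core_need", "category", "quantity", "target_market")
    missing = [k for k in required if k not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return DemandIntent(
        core_need=data["core_need"],
        category=data["category"],
        quantity=int(data["quantity"]),
        target_market=data["target_market"],
        budget_min_usd=data.get("budget_min_usd"),
        budget_max_usd=data.get("budget_max_usd"),
        # Normalise certification names so "fsc" and "FSC" compare equal.
        certifications=[c.upper() for c in data.get("certifications", [])],
    )
```

Validating the model output before storing it keeps half-parsed demands out of the intent store, which matters because every downstream comparison assumes these fields exist.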

The matching engine compares intents using:

  • Category alignment — hierarchical taxonomy filter
  • Semantic similarity — embedding cosine similarity between core_need fields
  • Structural compatibility — quantity vs MOQ, budget vs price range, certifications
  • Supplier quality score — verification status, past match success rate

Pairs scoring ≥ 0.7 trigger email notifications to both parties.
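
The four signals and the 0.7 threshold can be sketched roughly as follows. Only the threshold comes from the description above; the weights, the field names, the "/"-separated category paths, and the way the structural checks are averaged are assumptions made for illustration.

```python
import math
from dataclasses import dataclass, field

MATCH_THRESHOLD = 0.7  # from the post; everything else below is assumed

# Assumed weights for the three soft signals (they sum to 1.0).
W_SEMANTIC, W_STRUCTURAL, W_QUALITY = 0.5, 0.3, 0.2


@dataclass
class Demand:
    category: str            # e.g. "kitchenware/cutting-boards"
    embedding: list[float]   # embedding of the core_need text
    quantity: int
    budget_max_usd: float
    certifications: set[str] = field(default_factory=set)


@dataclass
class Supply:
    category: str
    embedding: list[float]
    moq: int
    price_min_usd: float
    certifications: set[str] = field(default_factory=set)
    quality: float = 0.5     # 0..1, verification status + past match success


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def structural(d: Demand, s: Supply) -> float:
    """Fraction of the three hard constraints that are satisfied."""
    checks = [
        d.quantity >= s.moq,                   # order meets the MOQ
        d.budget_max_usd >= s.price_min_usd,   # budget reaches the price range
        d.certifications <= s.certifications,  # supplier holds every required cert
    ]
    return sum(checks) / len(checks)


def match_score(d: Demand, s: Supply) -> float:
    # Hierarchical category filter: one path must be a prefix of the other.
    dp, sp = d.category.split("/"), s.category.split("/")
    n = min(len(dp), len(sp))
    if dp[:n] != sp[:n]:
        return 0.0
    semantic = max(0.0, cosine(d.embedding, s.embedding))
    return (W_SEMANTIC * semantic
            + W_STRUCTURAL * structural(d, s)
            + W_QUALITY * s.quality)
```

A pair would then be notified when `match_score(demand, supply) >= MATCH_THRESHOLD`. Treating the category check as a hard filter, rather than one more weighted term, means a strong embedding match can never pull in a supplier from the wrong branch of the taxonomy.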

Tech Stack

  • Backend: FastAPI + SQLite
  • LLM parsing: GPT-4o-mini (English) + Qwen qwen-plus (Chinese), routed by the language of the input
  • Supplier discovery: Self-hosted SearXNG + BeautifulSoup crawler + AI validation
  • Webhook API: accepts buyer demands from AI agents, Telegram bot, or direct API
  • Deployment: Docker Compose on Alibaba Cloud ECS
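
As a rough illustration of the language routing in the stack above, the sketch below picks a model by counting Chinese characters in the input. The model names come from the list; the character-ratio heuristic, the 0.3 cutoff, and the `pick_model` name are assumptions, and the real router may work differently.

```python
import re

# CJK Unified Ideographs (basic block); enough to spot Chinese text.
_CJK = re.compile(r"[\u4e00-\u9fff]")

# Model names come from the post; the 0.3 cutoff is an assumption.
ENGLISH_MODEL = "gpt-4o-mini"
CHINESE_MODEL = "qwen-plus"


def pick_model(text: str, cjk_ratio_cutoff: float = 0.3) -> str:
    """Return the model name to use for parsing `text`."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return ENGLISH_MODEL
    cjk = sum(1 for c in chars if _CJK.match(c))
    if cjk / len(chars) >= cjk_ratio_cutoff:
        return CHINESE_MODEL
    return ENGLISH_MODEL
```

Using a ratio instead of "any Chinese character present" keeps mostly English demands that quote a single Chinese product name on the English model.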

The Hard Part: Supply-Side Data Quality

The hard part wasn't the embeddings — it was data quality. Most supplier websites are SEO-optimized but content-poor. The AI validation step rejects ~70% of crawled URLs as not genuine B2B suppliers.

Open API

curl -X POST https://maplebridge.io/api/v1/webhook/manus \
  -H "Content-Type: application/json" \
  -d '{
    "demand": "1000 units wireless earbuds, CE certified, for Canadian market",
    "contact_email": "buyer@company.com",
    "source": "api"
  }'
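
For callers who prefer Python over curl, the same request can be built with the standard library. This is a sketch: the URL and payload fields mirror the curl example, while the `build_demand_request` helper is a hypothetical name and the response format is not described here, so handling it is left out.

```python
import json
import urllib.request

WEBHOOK_URL = "https://maplebridge.io/api/v1/webhook/manus"  # from the curl example


def build_demand_request(
    demand: str, contact_email: str, source: str = "api"
) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    body = json.dumps({
        "demand": demand,
        "contact_email": contact_email,
        "source": source,
    }).encode("utf-8")
    return urllib.request.Request(
        WEBHOOK_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To send it, pass the returned object to `urllib.request.urlopen(req, timeout=10)` and handle `urllib.error.HTTPError` for non-2xx responses.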

Full docs open-sourced at: https://github.com/jinjihuang88-ui/maplebridge-open

Launched today on Product Hunt: https://www.producthunt.com/posts/maplebridge-io — free for buyers. Happy to answer questions about the LLM matching architecture.
