
Sean  |   Mnemox


idea-reality-mcp v0.3.0: How We Built Chinese Language Support Into an MCP Server

TL;DR — idea-reality-mcp is an MCP server that checks if your project idea already exists. v0.3.0 adds a 3-stage keyword extraction pipeline and full Chinese/mixed-language support (150+ term mappings across 15+ domains).

The Problem

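When users typed ideas in Chinese, such as LINE Bot 自動客服系統 (an automated customer-service system built on LINE Bot), our v0.2 keyword extraction would either return raw Chinese characters or miss the intent entirely. Either way, the search queries built from those keywords were unusable.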

For a tool used by Taiwanese developers, this was unacceptable.

The Solution: 3-Stage Pipeline

Stage A: Clean & Map

  • Map Chinese terms to English equivalents (150+ terms)
  • Hard-filter boilerplate words (ai, tool, platform, system...)
  • Normalize hyphens, extract compound terms
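A minimal sketch of what Stage A could look like, not the project's actual code. The function name `clean_and_map` and the tiny dictionaries are hypothetical; the three mappings and four boilerplate words are taken from this post, and treating a hyphen as a word separator is an assumption about what "normalize" means here. Compound-term extraction is left out.

```python
import re

# Hypothetical excerpt: the real CHINESE_TECH_MAP has 150+ entries.
CHINESE_TECH_MAP = {
    "監控": "monitoring",
    "爬蟲": "scraping",
    "快取": "caching",
}

# The boilerplate words named in the post; the real list is likely longer.
BOILERPLATE = {"ai", "tool", "platform", "system"}


def clean_and_map(idea: str) -> list[str]:
    """Map known Chinese terms to English, then drop boilerplate words."""
    text = idea.lower()
    # Replace each known Chinese term with its English equivalent,
    # padded with spaces so it becomes its own token.
    for zh, en in CHINESE_TECH_MAP.items():
        text = text.replace(zh, f" {en} ")
    # Assumption: hyphen normalization means splitting on hyphens.
    text = text.replace("-", " ")
    # Keep ASCII alphanumeric tokens only; unmapped Chinese is dropped here.
    tokens = re.findall(r"[a-z0-9]+", text)
    return [t for t in tokens if t not in BOILERPLATE]


print(clean_and_map("AI 爬蟲 tool for real-time 監控"))
```

In this sketch, the generic words "ai" and "tool" are filtered out, while the two mapped Chinese terms survive as English keywords.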

Stage B: Intent Anchors

  • Detect 1-2 intent signals from a curated set of 90+ anchors
  • Covers: monitoring, evaluation, agents, RAG, scheduling, billing, scraping, deployment, and more
  • Example: 排程任務管理工具 → anchor: scheduling
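To illustrate the idea, here is a rough sketch of anchor detection over keywords that Stage A has already mapped to English. The names `INTENT_ANCHORS` and `detect_anchors` are hypothetical, the anchor set is a small subset drawn from the list above, and simple exact-token matching is an assumption; the real matcher may be more elaborate.

```python
# Hypothetical subset of the 90+ curated anchors.
INTENT_ANCHORS = {
    "monitoring", "evaluation", "scheduling",
    "billing", "scraping", "deployment",
}


def detect_anchors(keywords: list[str], max_anchors: int = 2) -> list[str]:
    """Return up to max_anchors intent anchors, in order of appearance."""
    found: list[str] = []
    for kw in keywords:
        if kw in INTENT_ANCHORS and kw not in found:
            found.append(kw)
        if len(found) == max_anchors:
            break
    return found


# Assumes Stage A turned 排程任務管理工具 into these English keywords.
print(detect_anchors(["scheduling", "task", "management"]))
```

Capping the result at two anchors mirrors the "1-2 intent signals" limit described above.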

Stage C: Synonym Expansion

  • 80+ synonym groups generate 3-8 varied search queries
  • scheduling expands to: cron, job queue, task scheduler, periodic
  • Avoids duplicate words within a query (this fixed a bug where a query such as redis redis could appear)
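The expansion step might be sketched roughly as follows. `SYNONYMS` and `expand_queries` are hypothetical names, the single synonym group is the scheduling example from this post, and the way synonyms are combined with the remaining keywords is an assumption, not the project's actual query builder.

```python
# Hypothetical excerpt: the post mentions 80+ synonym groups.
SYNONYMS = {
    "scheduling": ["cron", "job queue", "task scheduler", "periodic"],
}


def expand_queries(anchor: str, keywords: list[str], max_queries: int = 8) -> list[str]:
    """Build one query per anchor variant, never repeating a word in a query."""
    variants = [anchor] + SYNONYMS.get(anchor, [])
    queries: list[str] = []
    for variant in variants:
        words: list[str] = []
        # Put the variant first, then the keywords, skipping repeated words.
        for word in variant.split() + keywords:
            if word not in words:
                words.append(word)
        query = " ".join(words)
        if query not in queries:
            queries.append(query)
    return queries[:max_queries]


for q in expand_queries("scheduling", ["task", "manager"]):
    print(q)
```

The per-word check inside the loop is what keeps a query like redis redis from being produced when the same word arrives from two sources.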

Chinese Coverage

The CHINESE_TECH_MAP isn't just tech terms. We mapped 150+ terms across 15+ domains:

  • Tech/SaaS: 監控→monitoring, 爬蟲→scraping, 快取→caching
  • Medical: 中醫→tcm, 針灸→acupuncture, 病歷→medical record
  • Legal: 合約→contract, 律師→lawyer, 判決→court ruling
  • Education: 教學→teaching, 考試→exam, 學習→learning
  • And more: agriculture, aerospace, religion, art, gaming, government...

Key design decisions:

  • Sort by key length (longest first) so 客戶關係 matches before 客戶
  • Never return raw Chinese — if we can't map it, we strip it cleanly
  • 追蹤 maps to tracking (general), not tracing (infra-specific)
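The first two decisions can be sketched together. This is an illustration under assumptions: `map_chinese` is a hypothetical name, the English values for 客戶 and 客戶關係 are made up for the example (only 追蹤 → tracking comes from this post), and the regex covers only the basic CJK Unified Ideographs block.

```python
import re

# Hypothetical entries, except 追蹤 -> tracking, which the post states.
CHINESE_TECH_MAP = {
    "客戶": "customer",
    "客戶關係": "customer relationship",
    "追蹤": "tracking",
}

# Longest keys first, so 客戶關係 is replaced before its prefix 客戶.
SORTED_TERMS = sorted(CHINESE_TECH_MAP, key=len, reverse=True)

# Basic CJK Unified Ideographs block only; extension blocks are not covered.
CJK = re.compile(r"[\u4e00-\u9fff]+")


def map_chinese(text: str) -> str:
    """Map known terms longest-first, then strip any leftover Chinese."""
    for zh in SORTED_TERMS:
        text = text.replace(zh, f" {CHINESE_TECH_MAP[zh]} ")
    # Anything still in Chinese has no mapping, so remove it.
    text = CJK.sub(" ", text)
    return " ".join(text.split())


print(map_chinese("客戶關係追蹤平台"))
```

If the keys were tried shortest-first, 客戶 would match inside 客戶關係 and leave 關係 behind as an unmapped fragment, which is the failure the sort order avoids.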

Quality Numbers

Metric | Result
pytest | 93/93 passing
Golden eval (54 ideas) | 100% anchor hit rate
Junk ratio | 4% average
TW Chinese tests (99 cases) | 98%+ pass rate
Chinese char leakage | Zero

Try It

# Install
uvx idea-reality-mcp

# Or try online (no install)
# https://mnemox.ai/check

[Screenshot: Reality Check web interface with an input field showing an AI code review idea]

[Screenshot: Reality Signal score of 90 with a red gauge indicating high competition]

[Screenshot: Evidence grid showing 664,818 GitHub repos, 24 HN posts, and 70,408 top stars with similar projects list]


MIT licensed. Built by Mnemox AI in Taipei.
