DEV Community

firstdata

FirstData: Open Knowledge Base of 160+ Authoritative Global Data Sources with MCP Integration

The Problem

AI hallucination is one of the biggest challenges with LLMs today. When your AI agent confidently cites statistics, regulations, or market data, where does that data actually come from?

The solution isn't just bigger models. It's giving AI agents access to verified, authoritative primary sources.

What is FirstData?

FirstData is an open-source knowledge base of 160+ curated authoritative data sources from around the world, with a built-in MCP (Model Context Protocol) server for AI-powered discovery.

Think of it as a "trusted data directory" for AI agents.

What's Inside

  • 🏛️ 59 Government sources: US Census, Federal Reserve, China NBS, Eurostat
  • 🌐 44 International organizations: World Bank, WHO, IMF, WTO, FAO
  • 🔬 28 Research institutions: NBER, CEPR, major universities
  • 📈 14 Market data providers: Bloomberg, LSEG
  • 🗺️ 50+ domains across economics, health, environment, education
  • 🌍 Global coverage: 67 global, 69 national, 20 regional sources

MCP Integration

FirstData provides a standard MCP server that any MCP-compatible AI agent can connect to:

{
  "mcpServers": {
    "firstdata": {
      "url": "https://firstdata.deepminer.com.cn/mcp",
      "headers": {
        "Authorization": "Bearer YOUR_TOKEN"
      }
    }
  }
}
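
If your MCP client already has other servers configured, you probably want to add the entry above without overwriting them. Here is a minimal Python sketch of that merge. The helper names and the `FIRSTDATA_TOKEN` environment variable are assumptions made for illustration; they are not part of FirstData, and where your client stores its config file depends on the client.

```python
import json
import os


def firstdata_server_entry(token: str) -> dict:
    """Build the "firstdata" server entry from the config shown above."""
    return {
        "url": "https://firstdata.deepminer.com.cn/mcp",
        "headers": {"Authorization": f"Bearer {token}"},
    }


def merge_server(config: dict, name: str, entry: dict) -> dict:
    """Return a copy of config with entry stored under mcpServers[name].

    Servers that are already configured are kept, and the input dict
    is not modified.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = entry
    merged["mcpServers"] = servers
    return merged


if __name__ == "__main__":
    # FIRSTDATA_TOKEN is a hypothetical variable name; use whatever
    # secret storage you prefer instead of hard-coding the token.
    token = os.environ.get("FIRSTDATA_TOKEN", "YOUR_TOKEN")
    # Placeholder for a config that already lists another server.
    existing = {"mcpServers": {"other": {"url": "https://example.com/mcp"}}}
    merged = merge_server(existing, "firstdata", firstdata_server_entry(token))
    print(json.dumps(merged, indent=2))
```

Reading the token from the environment keeps it out of any file you might commit by accident.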

The server exposes 6 tools covering source search, source details, browsing by domain or country, statistics, and feedback submission.
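
MCP is built on JSON-RPC 2.0, and the protocol defines a `tools/list` method for discovering what a server offers. The sketch below only builds that request so you can see its shape; it does not send anything. It assumes the endpoint speaks MCP's HTTP transport, which this post does not confirm, and a real client would normally complete the `initialize` handshake first. The function names are made up for illustration, and no tool names are shown because the post does not list them.

```python
import json
import urllib.request
from typing import Optional

MCP_URL = "https://firstdata.deepminer.com.cn/mcp"


def jsonrpc_request(method: str, request_id: int, params: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 request body, as used by MCP."""
    body = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        body["params"] = params
    return body


def build_http_request(token: str, body: dict) -> urllib.request.Request:
    """Wrap a JSON-RPC body in an authenticated POST request (not sent here)."""
    return urllib.request.Request(
        MCP_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            # Assumption: the server follows MCP's HTTP transport, where
            # clients accept both plain JSON and event-stream responses.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


if __name__ == "__main__":
    body = jsonrpc_request("tools/list", request_id=1)
    request = build_http_request("YOUR_TOKEN", body)
    print(request.get_method(), request.full_url)
    print(json.dumps(body, indent=2))
```

In practice an MCP client such as Claude Desktop or Cursor handles this exchange for you; the sketch is only meant to show what travels over the wire.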

Why It Matters

  1. Combat AI Hallucination: Point your agents to real, verified data sources
  2. Save Research Time: Stop googling "where to find GDP data" and just ask your AI agent
  3. Structured Metadata: Every source has standardized JSON with descriptions, API URLs, update frequency, and coverage
  4. Bilingual: Full English and Chinese descriptions for every data source
  5. Open Source: MIT licensed, community-driven, continuously updated
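
To make the structured metadata and bilingual points concrete, here is a rough sketch of what checking one source record could look like. The key names (`description`, `api_url`, `update_frequency`, `coverage`) and the `en`/`zh` language codes are guesses based on the list above, not the actual FirstData schema, and the example record is a placeholder rather than a real entry.

```python
# Hypothetical key names; check the repository for the real schema.
REQUIRED_KEYS = ("description", "api_url", "update_frequency", "coverage")
LANGUAGES = ("en", "zh")


def find_problems(record: dict) -> list:
    """Return human-readable problems found in one source record."""
    problems = [f"missing field: {key}" for key in REQUIRED_KEYS if key not in record]
    description = record.get("description", {})
    if isinstance(description, dict):
        # Every source is expected to be described in both languages.
        problems += [
            f"missing description language: {lang}"
            for lang in LANGUAGES
            if not description.get(lang)
        ]
    else:
        problems.append("description must map language codes to text")
    return problems


if __name__ == "__main__":
    # Placeholder record for illustration only.
    example = {
        "description": {"en": "Example statistics office", "zh": "示例统计机构"},
        "api_url": "https://example.org/api",
        "update_frequency": "monthly",
        "coverage": "national",
    }
    print(find_problems(example))
    print(find_problems({"description": {"en": "English only"}}))
```

A check like this is the kind of thing a contributor could run before opening a PR with a new source.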

Use Cases

  • Market Research: Find authoritative industry data sources instantly
  • AI Agent Development: Give your agents trusted data discovery capabilities
  • Academic Research: Discover core datasets in specialized fields
  • Fact-Checking: Trace information back to its authoritative origin

Get Started

  1. ⭐ Star the repo: github.com/MLT-OSS/FirstData
  2. Apply for API access: firstdata.deepminer.com.cn
  3. Connect via MCP: Add to Claude Desktop, Cursor, or any MCP client
  4. Contribute: Add new data sources via PR β€” we welcome contributions!

If you find this useful, please ⭐ star the repo! Contributions welcome.

Open-sourced by Mininglamp Technology (2718.HK)
