Title:
The RAG tool that auto-generates Q&A pairs from your documents
Tags:
ai, docker, ollama, selfhosted
Body:
Most RAG tools split your documents into chunks and embed them. FastGPT does something smarter: it uses an LLM to read your documents and generate question-answer pairs automatically.
27K GitHub stars. Visual workflow builder. Almost no English integration content. Here's the guide.
What is FastGPT?
FastGPT is an LLM-based knowledge platform with two standout features:
1. QA-pair extraction
Instead of naive chunking, FastGPT reads your document with an LLM and extracts pairs like:
- Q: What is the return window? → A: 30 days from purchase with original receipt.
- Q: Which payment methods are accepted? → A: Visa, Mastercard, PayPal.
These pairs are what gets embedded and retrieved. At query time, question matches question — dramatically more accurate than matching a question against a random document chunk.
Enable it: Dataset → Upload → Processing Mode → QA Split (not Simple Split).
2. Visual workflow builder
FastGPT has a node editor for building branching RAG pipelines without code. Classify intent → route to FAQ or document search → format output. Each step is a configurable node.
⚠️ License Notice
FastGPT uses its own license that prohibits reselling it as a SaaS service to others.
✅ Self-hosted for your own team — OK
✅ As a backend component in your own product — OK
❌ Selling FastGPT as a service to customers — not permitted
If you need commercial freedom, use MaxKB (Apache 2.0) or WeKnora (MIT) instead.
Setup
Clone the repo, copy .env.example to .env, then docker compose up -d. Open localhost:3000 → root / 1234.
FastGPT needs MongoDB (conversation storage) and PostgreSQL with pgvector (vector search). Both are included in the docker-compose.
Full docker-compose: fastgpt-production-stack
Connecting Ollama
FastGPT uses the OpenAI-compatible API that Ollama provides at /v1.
Settings → AI Models → Add:
- Provider: OpenAI Compatible
- Base URL:
http://ollama:11434/v1 - API Key:
ollama(any non-empty string works) - Model name:
llama3
QA-Pair Extraction vs Simple Chunking
This is FastGPT's real advantage. A concrete example:
Your document says: "Returns are accepted within 30 days of the original purchase date, provided the item is in original condition with all packaging."
Simple chunking embeds that sentence as-is. When a user asks "Can I return something after 3 weeks?" the retrieval depends on semantic similarity between the question and that chunk.
QA-pair extraction creates: Q: "What is the return deadline?" → A: "30 days from purchase." Now the question directly matches a question — much higher retrieval confidence.
For knowledge bases where accuracy matters more than speed, this technique consistently outperforms naive chunking.
FastGPT vs the Alternatives
| FastGPT | WeKnora | MaxKB | RAGFlow | |
|---|---|---|---|---|
| QA-pair extraction | ✅ | ❌ | ❌ | ❌ |
| Visual workflow | ✅ | ❌ | ❌ | ❌ |
| License | Custom* | MIT | Apache 2.0 | Apache 2.0 |
| Setup complexity | Medium | Medium | Easy | Hard |
| PDF table parsing | Good | Basic | Basic | Excellent |
Pick FastGPT when: you want QA-pair extraction for maximum accuracy, or you need a visual pipeline builder for complex routing logic.
Production Deployment
FastGPT needs Nginx + SSL for a real domain. The production docker-compose with Nginx config and Let's Encrypt instructions:
Full Series
This is the last article in the Chinese AI tools series:
- 5 Chinese AI tools the West is ignoring — start here
- WeKnora — Tencent's RAG framework
- MaxKB — simplest self-hosted RAG
- DB-GPT — chat with your database
Meta repo with Docker Compose + Ollama + n8n for all five:
→ chinese-ai-tools-english-guide
QA-pair extraction is underrated. If you're building a customer support bot or internal knowledge base, try it once — going back to naive chunking feels wrong after.
Top comments (0)