DEV Community

retrovirusretro
retrovirusretro

Posted on

The RAG tool that auto-generates Q&A pairs from your documents

Title:

The RAG tool that auto-generates Q&A pairs from your documents
Tags:

ai, docker, ollama, selfhosted
Body:

Most RAG tools split your documents into chunks and embed them. FastGPT does something smarter: it uses an LLM to read your documents and generate question-answer pairs automatically.

27K GitHub stars. Visual workflow builder. Almost no English integration content. Here's the guide.


What is FastGPT?

FastGPT is an LLM-based knowledge platform with two standout features:

1. QA-pair extraction

Instead of naive chunking, FastGPT reads your document with an LLM and extracts pairs like:

  • Q: What is the return window? → A: 30 days from purchase with original receipt.
  • Q: Which payment methods are accepted? → A: Visa, Mastercard, PayPal.

These pairs are what gets embedded and retrieved. At query time, question matches question — dramatically more accurate than matching a question against a random document chunk.

Enable it: Dataset → Upload → Processing Mode → QA Split (not Simple Split).

2. Visual workflow builder

FastGPT has a node editor for building branching RAG pipelines without code. Classify intent → route to FAQ or document search → format output. Each step is a configurable node.


⚠️ License Notice

FastGPT uses its own license that prohibits reselling it as a SaaS service to others.

✅ Self-hosted for your own team — OK
✅ As a backend component in your own product — OK
❌ Selling FastGPT as a service to customers — not permitted

If you need commercial freedom, use MaxKB (Apache 2.0) or WeKnora (MIT) instead.


Setup

Clone the repo, copy .env.example to .env, then docker compose up -d. Open localhost:3000root / 1234.

FastGPT needs MongoDB (conversation storage) and PostgreSQL with pgvector (vector search). Both are included in the docker-compose.

Full docker-compose: fastgpt-production-stack


Connecting Ollama

FastGPT uses the OpenAI-compatible API that Ollama provides at /v1.

Settings → AI Models → Add:

  • Provider: OpenAI Compatible
  • Base URL: http://ollama:11434/v1
  • API Key: ollama (any non-empty string works)
  • Model name: llama3

QA-Pair Extraction vs Simple Chunking

This is FastGPT's real advantage. A concrete example:

Your document says: "Returns are accepted within 30 days of the original purchase date, provided the item is in original condition with all packaging."

Simple chunking embeds that sentence as-is. When a user asks "Can I return something after 3 weeks?" the retrieval depends on semantic similarity between the question and that chunk.

QA-pair extraction creates: Q: "What is the return deadline?" → A: "30 days from purchase." Now the question directly matches a question — much higher retrieval confidence.

For knowledge bases where accuracy matters more than speed, this technique consistently outperforms naive chunking.


FastGPT vs the Alternatives

FastGPT WeKnora MaxKB RAGFlow
QA-pair extraction
Visual workflow
License Custom* MIT Apache 2.0 Apache 2.0
Setup complexity Medium Medium Easy Hard
PDF table parsing Good Basic Basic Excellent

Pick FastGPT when: you want QA-pair extraction for maximum accuracy, or you need a visual pipeline builder for complex routing logic.


Production Deployment

FastGPT needs Nginx + SSL for a real domain. The production docker-compose with Nginx config and Let's Encrypt instructions:

fastgpt-production-stack


Full Series

This is the last article in the Chinese AI tools series:

Meta repo with Docker Compose + Ollama + n8n for all five:
chinese-ai-tools-english-guide


QA-pair extraction is underrated. If you're building a customer support bot or internal knowledge base, try it once — going back to naive chunking feels wrong after.

Top comments (0)