Hi everyone!
A month ago, I kept hearing the same complaint from my business partners: "Creating FAQs is such a pain!" They all had tons of content sitting around: docs, guides, policies, ... but turning that into useful FAQs was eating up way too much time. So I thought, why not build something that does this automatically?
That's how Fast Q&A was born. Upload your content, get smart FAQs, embed them anywhere. Simple concept, but the technical implementation? That's where it gets interesting.
What's Coming in This Series?
This isn't just another "I built a thing" post. I'm going to share my story of building Fast Q&A and what I learned along the way.
- Part 1 (this post) - Architecture & Domain design: Introducing the system, tech stack choices, domain boundaries, and why I chose monolithic-first but microservices-ready (or just services-ready 😂)
- Part 2 - Project Structure & User Flow: Code organization, folder structure and a simple flow for user onboarding
- Part 3 - RAG Implementation & Document Processing: The heart of this app. I'll share how documents become searchable knowledge, how to use PostgreSQL as a vector database, and how to make OpenAI play nice with your content.
- Part 4 - Widget Development & Real-time Chat: Building an embeddable widget that works anywhere and creating smooth chat experiences with RAG and an LLM.
- Part 5 - SaaS Integration & Going Live: Payment processing, analytics, deployment, and all the boring-but-essential stuff that turns your project into a real product.
Tech Stack & Architecture Overview
Here's a quick overview of the tech stack I used to build Fast Q&A. It reflects my personal preferences, so yours might look different. Here's the gist of it:
- Frontend:
  - React, TypeScript, ShadCN
  - Thanks to tweakcn for the theme customization
- Backend:
  - Golang with Echo v4
  - gRPC for service-to-service communication
- Database:
  - PostgreSQL for main storage
  - Redis for caching & queue
- RAG:
  - Open source RAG library rag_api
  - Another PostgreSQL instance with pgvector
- LLM:
  - OpenAI - I'm using gpt 4.1 mini for both embedding and chat completion
- Misc (you can use whatever you like):
  - Payment: Lemon Squeezy
  - Analytics: Plausible (open source, easy to self-host) for real-time reports
  - Mailer: Brevo
  - Self notification: Telegram
  - IP detection: Cloudflare or ipinfo.io
  - Error tracking: Sentry
- Deployment:
  - OVH (or any other cloud provider you like) with simple Docker Compose setup
  - I will not mention the CI/CD pipeline here since it's not relevant to this series
Quick Notes on the Stack:
- Backend focus: This series dives deep into the backend architecture and its implementation, so I'll skip the frontend part.
- Why Go?: I'm comfortable with Go, Node, and Python, but my system relies heavily on third-party services, so I want to keep the application server lightweight. Go fits that perfectly.
- Production & Development: You'll notice that I mention two PostgreSQL instances for different purposes. A single instance works fine, but only for development. In production, you'll want to split them so the system runs smoothly. The same goes for Redis (caching and queue).
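To make the "one instance in development, two in production" idea concrete, here's a minimal sketch of how the connection settings could be resolved. The env var names (`DATABASE_URL`, `VECTOR_DATABASE_URL`) and the `loadDBConfig` helper are hypothetical, made up for illustration, and not necessarily how Fast Q&A is configured:

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// DBConfig holds the connection strings for the two PostgreSQL roles.
type DBConfig struct {
	MainURL   string // main application storage
	VectorURL string // pgvector storage used by the RAG layer
}

// loadDBConfig resolves both URLs from the environment.
// The variable names are assumptions for this sketch.
// lookup is injected (os.LookupEnv in real use) so it is easy to test.
func loadDBConfig(lookup func(string) (string, bool)) (DBConfig, error) {
	mainURL, ok := lookup("DATABASE_URL")
	if !ok || mainURL == "" {
		return DBConfig{}, errors.New("DATABASE_URL is required")
	}

	// In development the vector URL can be left unset,
	// so both roles share a single instance.
	vectorURL, ok := lookup("VECTOR_DATABASE_URL")
	if !ok || vectorURL == "" {
		vectorURL = mainURL
	}

	return DBConfig{MainURL: mainURL, VectorURL: vectorURL}, nil
}

func main() {
	cfg, err := loadDBConfig(os.LookupEnv)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Println("shared instance:", cfg.MainURL == cfg.VectorURL)
}
```

With this shape, moving from one instance to two is a deployment change (set the second variable) rather than a code change. The same pattern would work for splitting Redis into a cache instance and a queue instance.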
Domain Design: Building for Tomorrow
To be honest, implementing domain design for this small, simple app might seem like overkill. But here's why I did it anyway: it gives me clear boundaries between different parts of the system, and more importantly, a roadmap for scaling without rewriting everything.
The Five Domains
I organized Fast Q&A into five distinct domains, each with a specific responsibility:
IAM (Identity & Access Management)
- Handle sign-in flow with email and OTP verification
- User account management and authentication
- Keep security concerns isolated from business logic
Project
- Project creation and management
- Widget settings and customization
- Everything related to organizing user content
Knowledge
- The heart of the application: document processing, Q&A generation, chat sessions
- Where all the AI magic happens with RAG and vector operations
- Content lifecycle management
Billing
- Subscription management and payment processing
- Usage tracking and quota enforcement
Common
- Widget public API endpoints
- The bridge between external websites and Fast Q&A's functionality
- Public-facing interfaces that don't require authentication
How They Communicate
Even though everything runs in a single monolithic application, domains communicate through well-defined interfaces:
- gRPC for synchronous calls: When Project needs to check if a user can upload more content, it makes a gRPC call to Billing. This keeps domains decoupled and makes future service extraction straightforward.
- Redis-based queues for async events: Using asynq, when Knowledge finishes processing a document, it queues an event for Project to update usage metrics. Clean, reliable, and scalable.
- Shared database, separate concerns: All domains share the same PostgreSQL instance for now, but each owns its data models completely. No cross-domain database dependencies.
What's Next?
That's the foundation - a clean, scalable architecture that grows with my needs and is easy to manage.
In Part 2, I'll show you the actual code structure and how users interact with Fast Q&A, so you can see how it all comes together in practice.
Want to see it in action? Try Fast Q&A yourself at fastqna.app - there's a free tier to play around with.
I'd love to hear about your architecture decisions in the comments below. See ya!