Med Marrouchi

Posted on Jun 8

The 0$ AI Achitecture Stack (2026)

#webdev #ai #programming #javascript

AI is getting expensive.

Not only because of model APIs, GPU bills, vector databases, cloud platforms, observability tools, and managed services, but also because we often start building with the most expensive architecture before we understand the problem.

But here is the good news: today, a software engineer can learn, prototype, and even launch serious AI systems with a $0 software stack.

Of course, “$0” does not mean magic.

You still pay for hardware, electricity, domains, bandwidth, production servers, or paid APIs when you scale. But for learning, prototypes, internal tools, demos, MVPs, and self-hosted experiments, there is now a powerful free-to-start ecosystem.

This is the AI architecture stack every AI/software engineer should know.

Think: AI architecture is not just “call an LLM”

The first mistake many teams make is thinking an AI product is just:

frontend → API → OpenAI call → response

That works for a demo.

It does not work for a real system.

A real AI system usually has multiple layers:

Frontend layer — where users interact with the system.
Backend/API layer — where business logic lives.
Agent/workflow/orchestration layer — where tasks are planned, routed, automated, and controlled.
LLM layer — where models are served, routed, or accessed.
AI coding agent layer — where developers accelerate development.
Data and RAG layer — where knowledge, context, memory, and embeddings live.
Deployment and operations layer — where the system runs, scales, and stays secure.

The best AI architecture is not the one with the most tools.

It is the one where each layer can be understood, replaced, self-hosted, monitored, and improved.

Feel: you do not need permission to start building AI systems

You do not need an enterprise license to understand agent architecture.

You do not need a huge cloud budget to experiment with RAG.

You do not need to wait for procurement to build an internal automation tool.

You can start with local models, open-source databases, free frameworks, self-hosted workflow engines, free deployment tiers, and your own machine.

That should make you feel three things:

In control — because you can inspect the stack.

Independent — because you are not locked into one vendor from day one.

Pragmatic — because you can move from $0 prototype to production only when the use case deserves it.

The point is not to avoid paying forever.

The point is to avoid paying before you understand what you are building.

Do: choose one tool per layer and build a vertical slice

Below is a practical map of free, open-source, source-available, fair-core, self-hostable, or free-tier tools that every AI engineer should know.

You do not need all of them.

Pick one per layer and build something end to end.

1. Frontend layer

This is the interface between your users and your AI system.

Technology	Use it for	Free angle
React	Component-based user interfaces	Open-source UI library
Next.js	Full-stack React apps, SSR, API routes, AI apps	Open-source framework with free hosting options
Vite	Fast frontend tooling and SPAs	Open-source build tool
Vue	Progressive frontend applications	Open-source framework
Nuxt	Full-stack Vue applications	Open-source framework
SvelteKit	Lightweight full-stack web apps	Open-source framework
Tailwind CSS	Fast UI styling	Open-source CSS framework
shadcn/ui	Copy-paste React components	Open-source component system

A simple default choice:

Next.js + Tailwind CSS + shadcn/ui

This gives you a modern UI, good developer experience, and a smooth path to AI chat interfaces, dashboards, admin panels, and workflow builders.

2. Backend/API layer

This layer exposes your business logic, user management, integrations, and internal services.

Technology	Use it for	Free angle
Node.js	JavaScript/TypeScript backend runtime	Open-source runtime
NestJS	Structured enterprise-grade Node.js APIs	Open-source framework
FastAPI	Python APIs for AI and ML systems	Open-source framework
Express	Minimal Node.js APIs	Open-source framework
Fastify	Fast Node.js APIs	Open-source framework
Hono	Lightweight APIs for edge/serverless runtimes	Open-source framework

A simple default choice:

NestJS if your team is TypeScript-heavy.
FastAPI if your AI logic is Python-heavy.

3. AI agent, workflow, and automation layer

This is where AI systems become more than chat.

This layer helps you connect tools, call APIs, automate workflows, add human approval, manage steps, and control agent behavior.

Technology	Use it for	Free angle
Hexabot	Self-hosted AI chatbot and workflow automation platform	Fair-core, self-hosted, free-to-start
n8n	Workflow automation with visual flows and integrations	Source-available / fair-code, self-hostable
LangGraph	Stateful, long-running AI agents	Open-source framework
CrewAI	Multi-agent orchestration	Open-source framework
Vercel AI SDK	TypeScript AI apps, chat, streaming, tool calls	Open-source SDK
Flowise	Visual AI agent and LLM workflow builder	Open-source / self-hostable
Dify	LLM app development, workflows, RAG, agents	Open-source / self-hostable
Haystack	RAG pipelines and agentic AI applications	Open-source framework

A simple default choice:

For visual AI workflow automation: Hexabot, n8n, Flowise, or Dify.
For code-first agents: LangGraph, CrewAI, Haystack, or Vercel AI SDK.

A practical architecture could be:

Hexabot for workflows and channels
LiteLLM for model routing
Ollama for local models
Postgres for state
Redis for queues/cache
Chroma or pgvector for embeddings

4. LLM layer

This layer is responsible for running, serving, routing, or accessing models.

Technology	Use it for	Free angle
Ollama	Running local models easily	Free local runtime
vLLM	High-throughput LLM serving	Open-source inference server
LiteLLM	LLM gateway and provider abstraction	Open-source proxy/SDK
llama.cpp	Running LLMs efficiently on local hardware	Open-source runtime
Hugging Face Transformers	Model loading, fine-tuning, inference	Open-source library
Open WebUI	Local/private chat UI for LLMs	Open-source UI
Text Generation Inference	Serving open LLMs in production	Open-source inference server

A simple default choice:

Ollama for local development.
LiteLLM when you want to switch between local models and paid providers.
vLLM when you need serious inference serving.

Important note:

Local models can make your software cost $0, but not your compute cost $0.

Your laptop, GPU, VPS, or server still matters.

5. AI coding agent layer

This is the layer that helps you build the stack faster.

Some tools can run with paid models, local models, or your own provider setup.

Technology	Use it for	Free angle
OpenCode	Terminal-based AI coding agent	Open-source / free models or bring your own
Aider	AI pair programming in the terminal	Open-source
Cline	AI coding agent inside editor/terminal workflows	Open-source
OpenHands	Autonomous software development agents	Open-source foundation
Continue	AI coding checks and coding assistance	Free-to-start / open-source roots

A realistic $0 coding setup:

OpenCode or Aider + Ollama + a local coding model

Will it replace a senior engineer?

No.

Can it help you scaffold actions, tests, docs, API routes, workflows, and refactors?

Absolutely.

6. Data and RAG layer

This is where AI systems become useful.

Without data, context, memory, retrieval, and grounding, your AI system is just guessing.

Technology	Use it for	Free angle
PostgreSQL	Main relational database	Open-source database
SQLite	Local/dev embedded database	Public-domain database engine
Redis	Cache, queues, real-time state, vector features	Open-source option available
pgvector	Vector search inside Postgres	Open-source extension
Chroma	Vector database for AI apps	Open-source / free cloud credits
Qdrant	Vector search engine	Open-source / free cloud tier
LlamaIndex	RAG and data framework for LLM apps	Open-source framework
MindsDB	AI over federated data sources	Open-source / self-hostable options
DuckDB	Local analytical database	Open-source database
MinIO	S3-compatible object storage	Open-source object storage

A simple default choice:

PostgreSQL + pgvector for production-like apps.
SQLite + Chroma for local prototypes.
Redis when you need queues, cache, sessions, or fast state.

For many MVPs, Postgres is enough.

You can store users, workflows, logs, documents, embeddings, and metadata in one place before introducing more specialized infrastructure.

7. Deployment and operations layer

This is where “it works on my machine” becomes “it works for users”.

Technology	Use it for	Free angle
Docker	Packaging apps and services	Free tooling for many use cases
Docker Compose	Local/self-hosted multi-service stacks	Free tooling
Kubernetes	Container orchestration	Open-source platform
K3s	Lightweight Kubernetes	Open-source distribution
NGINX	Reverse proxy, load balancing, static serving	Open-source
Caddy	Web server with automatic HTTPS	Open-source
Let’s Encrypt	Free TLS certificates	Free certificate authority
Certbot	Automating Let’s Encrypt certificates	Free/open-source tool
GitHub Actions	CI/CD pipelines	Free for public repos and self-hosted runners
GitHub Pages	Static website hosting	Free for public repositories
Cloudflare Pages	Static/frontend hosting	Free tier
Vercel	Frontend and Next.js deployment	Free Hobby plan
Netlify	Frontend/static deployment	Free plan
Prometheus	Metrics and monitoring	Open-source
Grafana	Dashboards and observability	Open-source edition

A simple default choice:

Docker Compose + NGINX + Let’s Encrypt for a small self-hosted deployment.
Vercel, Netlify, Cloudflare Pages, or GitHub Pages for frontend hosting.
Prometheus + Grafana when you need observability.

For production AI systems, do not expose workflow tools, model servers, databases, or local LLM runtimes directly to the public internet without authentication, network restrictions, and monitoring.

Free does not mean careless.

Example $0 architecture recipes

Here are a few realistic starting points.

Recipe 1: Local AI prototype

Use this when you want to build fast on your laptop.

Frontend: Next.js
AI SDK: Vercel AI SDK
LLM runtime: Ollama
Data: SQLite
Vector DB: Chroma
Deployment: Docker Compose

Good for:

Internal demos
Chat with documents
Personal agents
Learning RAG
Local-first AI apps

Recipe 2: Self-hosted AI workflow automation

Use this when you want business workflows, channels, actions, and control.

Workflow engine: Hexabot or n8n
Model gateway: LiteLLM
Local models: Ollama
Database: PostgreSQL
Cache/queue: Redis
Reverse proxy: NGINX
TLS: Let’s Encrypt

Good for:

Customer support automation
Lead qualification
Internal operations
Scheduled AI workflows
Human-in-the-loop automations

Recipe 3: RAG application stack

Use this when your app needs to answer based on your data.

Frontend: React or Next.js
API: FastAPI
RAG framework: LlamaIndex or Haystack
Database: PostgreSQL
Vector search: pgvector or Qdrant
Model runtime: vLLM or Ollama
Monitoring: Prometheus + Grafana

Good for:

Knowledge base assistants
Legal/document search
Technical support assistants
Internal documentation search
Product copilots

What is not really $0?

A few things will eventually cost money:

Production servers
GPUs
Domains
Storage
Bandwidth
Commercial LLM APIs
Email/SMS/WhatsApp providers
Advanced observability
Enterprise support
Security audits
Team collaboration features

And that is fine.

The goal is not to build a serious production company with no budget forever.

The goal is to start with a stack that teaches you the architecture before it charges you for the architecture.

My recommended default stack

If I had to recommend one practical $0 starting stack for an AI engineer today, I would choose:

Frontend: Next.js + Tailwind CSS + shadcn/ui
Backend: NestJS or FastAPI
Workflow/agent layer: Hexabot, LangGraph, or n8n
LLM gateway: LiteLLM
Local models: Ollama
Database: PostgreSQL
Vector search: pgvector or Chroma
Cache/queues: Redis
Deployment: Docker Compose + NGINX + Let’s Encrypt
CI/CD: GitHub Actions
Monitoring: Prometheus + Grafana

This gives you one important thing:

A full AI architecture you can understand from top to bottom.

Final thought

The AI ecosystem is moving fast.

Every week, a new agent framework, vector database, workflow tool, or model provider appears.

But the architecture remains surprisingly stable:

Interface
Logic
Orchestration
Models
Data
Deployment
Monitoring

If you understand those layers, you can swap tools without losing your mind.

The best AI engineers are not the ones who know every tool.

They are the ones who know where each tool belongs.

So pick one layer.

Pick one tool.

Build one vertical slice.

And prove that your AI system works before your cloud bill proves that it does not.

Top comments (2)

Marco Dev • Jun 8

I really like how clearly it breaks down the AI architecture stack layer by layer. I’ll definitely bookmark this. I’d also be curious to see a few additions around observability and evaluation tools, like Langfuse, OpenTelemetry, or Phoenix, since they’re becoming essential for production AI systems.

Erik Paris • Jun 8

Great post. I really like how clearly it maps the different layers of the AI architecture stack. I’ll definitely keep this as a reference for future projects.