Prototype and deploy multi-agent and RAG applications with a visual drag-and-drop interface - all running locally with your own models.
Langflow is an open-source visual framework for building AI applications. Connect it to Ollama for local inference, and you get a powerful environment for designing agent architectures, RAG pipelines, and chatbot workflows without writing code.
What You Need
- A GPU with 12GB+ VRAM (or CPU-only for prototyping)
- Docker or Python 3.10+
- About 15 minutes
Architecture
| Component | Role |
|---|---|
| Langflow | Visual drag-and-drop flow builder and API server |
| Ollama | Serves local LLM models |
| Qwen3 14B | Default model - fits 12GB at Q4 |
Setup
Option A: Docker (Recommended)
Save this as docker-compose.yml:
services:
ollama:
image: ollama/ollama:latest
container_name: ollama
volumes:
- ollama:/root/.ollama
ports:
- "11434:11434"
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: all
capabilities: [gpu]
restart: unless-stopped
langflow:
image: langflowai/langflow:latest
container_name: langflow
depends_on:
- ollama
ports:
- "7860:7860"
volumes:
- langflow_data:/app/langflow
environment:
- LANGFLOW_AUTO_LOGIN=true
restart: unless-stopped
volumes:
ollama:
langflow_data:
Launch it:
docker compose up -d
docker exec ollama ollama pull qwen3:14b
Open http://localhost:7860 to access Langflow.
Option B: pip Install
pip install langflow
langflow run
# In another terminal:
ollama pull qwen3:14b
Open http://localhost:7860.
Connect Langflow to Ollama
In the Langflow canvas, add:
- Ollama Chat Model component - Base URL: http://ollama:11434 (Docker) or http://localhost:11434 (pip)
- Select model: qwen3:14b
- Connect to a Prompt node and Chat Output for a basic chatbot
What You Can Build
RAG Chatbot
Drag in: File > Ollama Embeddings > Vector Store (Chroma) > Ollama Chat Model > Chat Output. Upload a PDF, ask questions - answers come from your documents.
Multi-Agent Research System
Add an Agent node with a Web Search Tool + Ollama, add a second Agent for summarization. One agent gathers info, the other condenses it.
Document Processing Pipeline
Combine File Loader > Splitter > Ollama Embeddings > Vector Store. Add Ollama Chat Model with custom prompts for Q&A over your documents.
Cost vs Cloud
| Local Langflow + Ollama | Langflow Cloud + OpenAI | |
|---|---|---|
| Monthly | $0 | $50-200+ |
| Hardware | ~$300-600 once | $0 |
| Data privacy | Stays on your machine | Sent to cloud |
| AI calls | Unlimited, free | Per-token billing |
Full guide with detailed troubleshooting and alternatives: https://everylocalai.com/stack/langflow-ollama-rag-agent
Top comments (0)