This article was originally published on aifoss.dev
---
title: 'Flowise Local Setup Guide: Build AI Workflows Without Python'
description: 'Step-by-step Flowise local setup: install via npm or Docker, connect to Ollama, and build a working RAG chatbot in under 20 minutes — no Python required.'
pubDate: 'May 23 2026'
tags: ["flowise", "ai", "nocode", "selfhosted", "llm"]
---
Flowise gives you a drag-and-drop interface for building LLM pipelines — RAG chatbots, multi-step agents, document Q&A — without writing a single line of Python. It's Node.js-based, runs locally on modest hardware, and connects to Ollama so your data never leaves your machine.
If you've looked at LangChain or LlamaIndex and thought "this is too much code for what I'm trying to do," Flowise is the answer. If you've looked at n8n and wanted something more AI-native, same answer.
This guide gets you from zero to a working RAG chatbot on localhost in under 20 minutes.
What Flowise actually is
Flowise is an open-source, self-hostable UI for building LLM applications using a node-based visual editor. Each "node" represents an LLM component — a model, a retriever, a memory store, a tool — and you wire them together on a canvas. The result is a chatflow or agentflow you can embed via iframe, call via API, or just use through the built-in chat interface.
License: Apache 2.0. The code is at FlowiseAI/Flowise on GitHub.
It's built on top of LangChain.js, which means you get LangChain's integrations (dozens of LLM providers, vector stores, document loaders) without writing any JavaScript yourself.
What you can build:
- RAG chatbots that answer questions about your documents
- Multi-agent systems with tool use (web search, code execution, APIs)
- Chatbot embeds for internal tools or websites
- Structured data extraction pipelines
- API-chaining workflows that combine multiple LLM calls
Prerequisites
| Requirement | Minimum | Notes |
|---|---|---|
| Node.js | 18.x or 20.x | v22+ also works; check the repo for current support |
| RAM | 4 GB | 8 GB recommended if running models locally alongside |
| Disk | 2 GB free | More if storing vector embeddings locally |
| OS | Windows, macOS, Linux | All supported; Docker is the easiest cross-platform path |
Flowise itself is lightweight. The hardware pressure comes from whatever models you run through it. For Ollama with Llama 3.2 3B: 4 GB RAM is enough. For anything 7B+: 8 GB RAM minimum, GPU optional but helpful.
Installation: npm
The npm method is the fastest way to get started.
npm install -g flowise
npx flowise start
Open http://localhost:3000. That's the entire install process.
No login required by default — Flowise assumes local single-user mode. To enable authentication:
npx flowise start --FLOWISE_USERNAME=admin --FLOWISE_PASSWORD=yourpassword
To update later: npm update -g flowise.
Node version note: If npx flowise start throws an error about unsupported Node versions, use nvm to switch to Node 20 LTS. This is the most common first-run issue.
Installation: Docker
For a persistent service — or a shared setup where multiple people need access — Docker is cleaner than npm:
docker run -d \
--name flowise \
-p 3000:3000 \
-v ~/.flowise:/root/.flowise \
flowiseai/flowise
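The -v flag mounts a host directory into the container so your chatflows and vector data survive when the container is removed or recreated, for example after pulling a newer image. Without it, that data lives only inside the container and is lost as soon as the container is deleted.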
The official repo includes Docker Compose examples under the docker/ directory. That's the right starting point if you're adding a database (PostgreSQL) or running Flowise alongside other services.
Running Flowise in Docker with Ollama on the host: Point the Ollama base URL to http://host.docker.internal:11434 on Mac or Windows. On Linux, use the host's actual network IP, or start the container with --network=host, in which case http://localhost:11434 works and the -p mapping is not needed.
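The Base URL rules above can be summarized as a small lookup. This is only a sketch: the `FlowiseRuntime` labels, the function name, and the sample IP address are hypothetical and exist purely for illustration. Port 11434 is the Ollama port used throughout this guide.

```typescript
// Where the Flowise process runs relative to Ollama (illustrative labels).
// A container started with --network=host behaves like "host".
type FlowiseRuntime = "host" | "docker-desktop" | "docker-linux";

// Returns the Base URL to enter in the ChatOllama node.
// hostIp is only needed for Docker on Linux without host networking.
function ollamaBaseUrl(runtime: FlowiseRuntime, hostIp?: string): string {
  const port = 11434; // Ollama port used in this guide
  if (runtime === "host") return `http://localhost:${port}`;
  if (runtime === "docker-desktop") return `http://host.docker.internal:${port}`;
  if (!hostIp) {
    throw new Error("Docker on Linux needs the host's network IP");
  }
  return `http://${hostIp}:${port}`;
}

console.log(ollamaBaseUrl("docker-desktop"));
// 192.168.1.50 is a made-up address; substitute your host's real IP.
console.log(ollamaBaseUrl("docker-linux", "192.168.1.50"));
```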
Connecting to Ollama
Once Flowise is running, connecting a local model takes about 30 seconds:
- Confirm Ollama is running: ollama serve (or check that the Ollama app is open)
- Pull a model if you haven't: ollama pull llama3.2 or ollama pull mistral
- In Flowise, open a new Chatflow
- Drag the ChatOllama node onto the canvas
- Set Base URL to http://localhost:11434 and select your model name from the dropdown
Wire the ChatOllama node to a Conversation Chain node, open the chat panel (bottom right), and test it. If you get a response — you're connected. No API key, no rate limits, fully offline.
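If the model dropdown stays empty, it can help to check what Ollama reports as installed. Ollama lists local models at GET /api/tags, and the sketch below checks such a response for a given model. The `OllamaTags` type covers only the field used here, `hasModel` is a hypothetical helper, and the sample response is made up for illustration.

```typescript
// Subset of the JSON returned by Ollama's GET /api/tags endpoint.
type OllamaTags = { models: { name: string }[] };

// True if the wanted model is installed, with or without a tag suffix.
function hasModel(tags: OllamaTags, wanted: string): boolean {
  // Ollama reports names with a tag, e.g. "llama3.2:latest".
  return tags.models.some(
    (m) => m.name === wanted || m.name.startsWith(`${wanted}:`)
  );
}

// Sample response for illustration only, not captured from a real machine.
const sampleTags: OllamaTags = {
  models: [{ name: "llama3.2:latest" }, { name: "nomic-embed-text:latest" }],
};

console.log(hasModel(sampleTags, "llama3.2")); // true
console.log(hasModel(sampleTags, "mistral")); // false
```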
Building your first RAG chatbot
This is the workflow Flowise is best known for, and it's genuinely fast to set up once you know which nodes to use.
The node chain:
- PDF File (or Text File) loader: upload your document
- Recursive Character Text Splitter: chunk size 1000, overlap 200 (sensible defaults)
- Ollama Embeddings: model nomic-embed-text (pull it first: ollama pull nomic-embed-text)
- In-Memory Vector Store: fine for testing
- Conversational Retrieval QA Chain: this ties together your retriever and LLM, and handles conversation history automatically
Wire the nodes left to right: the Text Splitter plugs into the PDF loader, and the loader and Ollama Embeddings both feed into the Vector Store. The Vector Store's retriever output feeds into the QA Chain, and ChatOllama feeds into the QA Chain as the LLM.
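To see what the chunk size and overlap settings do, here is a deliberately simplified chunker. It is not the Recursive Character Text Splitter itself, which first tries to split on separators such as paragraph breaks. This version cuts at fixed character offsets and only illustrates how a 200-character overlap repeats the tail of one chunk at the head of the next. The function name is hypothetical.

```typescript
// Fixed-size chunker with overlap (a simplification of what the
// Recursive Character Text Splitter does; it ignores separators).
function chunkText(text: string, chunkSize = 1000, overlap = 200): string[] {
  if (overlap >= chunkSize) {
    throw new Error("overlap must be smaller than chunkSize");
  }
  const step = chunkSize - overlap; // how far each new chunk advances
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // this chunk reached the end
  }
  return chunks;
}

// A 2500-character placeholder document.
const sampleDoc = "x".repeat(2500);
// Yields three chunks of 1000, 1000, and 900 characters.
console.log(chunkText(sampleDoc).map((c) => c.length));
```

With the defaults, each chunk starts 800 characters after the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk as long as it is shorter than the overlap.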
Hit the Upsert button on the vector store node to index your documents. It's the database icon — easy to miss the first time. After upsert completes, open the chat and ask a question about the document.
When it answers a question with content actually drawn from your file rather than hallucinated, you've got a working local RAG pipeline.
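Once the chatflow answers in the built-in chat, the same flow can be called over HTTP. The sketch below assumes Flowise's prediction endpoint, POST /api/v1/prediction/<chatflow id> with a JSON body containing a question field; confirm the exact URL in the API dialog of your chatflow. The `PostJson` type and both function names are hypothetical, and the response is left untyped because its shape is not covered in this guide.

```typescript
// Minimal shape of the fetch-like function this sketch needs.
type PostJson = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string }
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Builds the URL and request options for one question.
function buildPredictionRequest(baseUrl: string, chatflowId: string, question: string) {
  return {
    // Strip trailing slashes so the path is not doubled up.
    url: `${baseUrl.replace(/\/+$/, "")}/api/v1/prediction/${chatflowId}`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ question }),
    },
  };
}

// Sends the question and returns the parsed JSON response as-is.
async function askChatflow(
  post: PostJson,
  baseUrl: string,
  chatflowId: string,
  question: string
): Promise<unknown> {
  const { url, init } = buildPredictionRequest(baseUrl, chatflowId, question);
  const res = await post(url, init);
  if (!res.ok) throw new Error(`Flowise returned HTTP ${res.status}`);
  return res.json();
}

// "<your-chatflow-id>" is a placeholder; copy the real ID from the Flowise UI.
console.log(buildPredictionRequest("http://localhost:3000", "<your-chatflow-id>", "Hello").url);
```

On Node 18 or later, the global fetch can be passed in as the first argument: askChatflow(fetch, "http://localhost:3000", "<your-chatflow-id>", "What does the document say?"). Taking the HTTP function as a parameter keeps the request-building logic easy to check without a running server.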
Persistent vector storage with Chroma
In-memory storage resets every time Flowise restarts. For anything beyond a one-off demo, add Chroma:
docker run -d \
--name chromadb \
-p 8000:8000 \
chromadb/chroma
In Flowise, replace the In-Memory Vector Store node with Chroma, point it to http://localhost:8000, and set a collection name. Your embeddings now persist between sessions — upsert once, query indefinitely.
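If Flowise itself runs in Docker, localhost inside that container refers to the Flowise container, not to Chroma. Either put both containers on the same user-defined Docker network and point Flowise at http://chromadb:8000 (the container name), or use http://host.docker.internal:8000 on Mac or Windows.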
Building an agent (tool use beyond RAG)
Flowise has a separate Agentflow canvas for multi-tool agents. The difference from chatflows: agents can decide which tools to call, not just retrieve and answer.
Useful built-in tools:
- Calculator — LLM math is unreliable; offload it
- Web Browser — Puppeteer-based live browsing
- Custom Tool — point it at any HTTP API you want the agent to call
An agentflow with ChatOllama + a Calculator tool + a web search tool gives you something close to a local ReAct agent. Mistral and Llama 3 handle tool use reasonably well; smaller models (under 7B) tend to struggle with multi-tool decisions.
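The core idea behind tool use is that the model names a tool and an input, and the runtime executes it and hands the result back. The sketch below shows that dispatch step with a toy calculator. It is not Flowise's internal implementation: the `Tool` type, the `dispatch` function, and the "a op b" input format are all assumptions made for illustration.

```typescript
// A tool is a named function the agent runtime can call for the model.
type Tool = { name: string; description: string; run: (input: string) => string };

// Toy calculator: handles a single binary operation such as "12 * 7".
const calculator: Tool = {
  name: "calculator",
  description: "Evaluates 'a op b' where op is one of + - * /",
  run: (input) => {
    const m = input.match(/^\s*(-?\d+(?:\.\d+)?)\s*([-+*\/])\s*(-?\d+(?:\.\d+)?)\s*$/);
    if (!m) return "error: expected 'a op b'";
    const a = Number(m[1]);
    const b = Number(m[3]);
    if (m[2] === "+") return String(a + b);
    if (m[2] === "-") return String(a - b);
    if (m[2] === "*") return String(a * b);
    return b === 0 ? "error: division by zero" : String(a / b);
  },
};

// Looks up the tool the model asked for and runs it with the given input.
function dispatch(tools: Tool[], call: { tool: string; input: string }): string {
  const tool = tools.find((t) => t.name === call.tool);
  return tool ? tool.run(call.input) : `error: unknown tool '${call.tool}'`;
}

// The model would emit this call; the runtime computes the exact answer.
console.log(dispatch([calculator], { tool: "calculator", input: "1234 * 5678" })); // 7006652
```

Returning errors as strings rather than throwing lets the result go back to the model as an observation, so it can retry with a corrected call.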
Flowise vs. the alternatives
| Tool | Interface | Language | Local model support | Best for |
|---|---|---|---|---|
| Flowise | Visual (nodes) | Node.js | Excellent (Ollama native) | RAG + agent prototypes |
| Langflow | Visual (nodes) | Python | Good | Python-first teams |
| n8n | Visual (workflow) | Node.js | Via HTTP nodes | General automation + some AI |
| Dify | Visual + hosted | Python | Good | Teams wanting a managed option |
| LangChain (code) | Code | Python/JS | Full control | Custom production pipelines |
Flowise wins when you want Ollama integration without glue code and you're prototyping quickly. Langflow is the closest competitor — essentially the same concept but Python-native and with a slightly different node model. n8n is better at non-AI automation; its AI nodes feel bolted on compared to Flowise's purpose-built design.
For a deeper comparison of Flowise, n8n, and LangGraph in production workflow scenarios, see Flowise vs n8n vs LangGraph 2026.