Jovan Chan

Posted on Jun 2 • Originally published at aifoss.dev

#opensource #ai #selfhosted #linux

Flowise Local Setup Guide: Build AI Workflows Without Python

Step-by-step Flowise local setup: install via npm or Docker, connect to Ollama, and build a working RAG chatbot in under 20 minutes, no Python required.

Published: May 23 2026
Tags: flowise, ai, nocode, selfhosted, llm

Flowise gives you a drag-and-drop interface for building LLM pipelines — RAG chatbots, multi-step agents, document Q&A — without writing a single line of Python. It's Node.js-based, runs locally on modest hardware, and connects to Ollama so your data never leaves your machine.

If you've looked at LangChain or LlamaIndex and thought "this is too much code for what I'm trying to do," Flowise is the answer. If you've looked at n8n and wanted something more AI-native, same answer.

This guide gets you from zero to a working RAG chatbot on localhost in under 20 minutes.

What Flowise actually is

Flowise is an open-source, self-hostable UI for building LLM applications using a node-based visual editor. Each "node" represents an LLM component — a model, a retriever, a memory store, a tool — and you wire them together on a canvas. The result is a chatflow or agentflow you can embed via iframe, call via API, or just use through the built-in chat interface.

License: Apache 2.0. The code is at FlowiseAI/Flowise on GitHub.

It's built on top of LangChain.js, which means you get LangChain's integrations (dozens of LLM providers, vector stores, document loaders) without writing any JavaScript yourself.

What you can build:

RAG chatbots that answer questions about your documents
Multi-agent systems with tool use (web search, code execution, APIs)
Chatbot embeds for internal tools or websites
Structured data extraction pipelines
API-chaining workflows that combine multiple LLM calls

Prerequisites

Requirement	Minimum	Notes
Node.js	18.x or 20.x	v22+ also works; check the repo for current support
RAM	4 GB	8 GB recommended if running models locally alongside
Disk	2 GB free	More if storing vector embeddings locally
OS	Windows, macOS, Linux	All supported; Docker is the easiest cross-platform path

Flowise itself is lightweight. The hardware pressure comes from whatever models you run through it. For Ollama with Llama 3.2 3B: 4 GB RAM is enough. For anything 7B+: 8 GB RAM minimum, GPU optional but helpful.

Installation: npm

The npm method is the fastest way to get started.

npm install -g flowise
npx flowise start

Open http://localhost:3000. That's the entire install process.

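No login is required by default; Flowise assumes local single-user mode. To enable username/password authentication (newer releases may handle authentication differently, so check the documentation for your version):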

npx flowise start --FLOWISE_USERNAME=admin --FLOWISE_PASSWORD=yourpassword

To update later: npm update -g flowise.

Node version note: If npx flowise start throws an error about unsupported Node versions, use nvm to switch to Node 20 LTS. This is the most common first-run issue.
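
A quick way to catch this before installing is to compare the major version that node --version prints against the minimum from the prerequisites table. The sketch below is one way to do that in a POSIX shell; node_major_ok is a hypothetical helper name made up for this example, not part of Node or Flowise.

```shell
# Succeeds if a version string such as "v20.11.1" has a major version of 18 or higher.
# node_major_ok is a hypothetical helper name, not part of Node or Flowise.
node_major_ok() {
  version="${1#v}"        # drop the leading "v"
  major="${version%%.*}"  # keep the digits before the first dot
  [ "$major" -ge 18 ]
}

# Typical use:
#   node_major_ok "$(node --version)" || echo "Node is too old; try: nvm install 20 && nvm use 20"
```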

Installation: Docker

For a persistent service — or a shared setup where multiple people need access — Docker is cleaner than npm:

docker run -d \
  --name flowise \
  -p 3000:3000 \
  -v ~/.flowise:/root/.flowise \
  flowiseai/flowise

The -v flag mounts ~/.flowise from the host into the container so your chatflows and vector data survive when the container is removed or recreated. Without it, that data lives only inside the container and is lost as soon as the container is deleted.

The official repo includes Docker Compose examples under the docker/ directory. That's the right starting point if you're adding a database (PostgreSQL) or running Flowise alongside other services.

Running Flowise in Docker with Ollama on the host: Point the Ollama base URL to http://host.docker.internal:11434 on Mac or Windows. On Linux, use the host's actual network IP or configure --network=host.
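
On Linux, one commonly used option is Docker's host-gateway mapping, which makes host.docker.internal resolve inside the container much as it does on Mac and Windows. The commands below are a sketch under two assumptions: Docker Engine 20.10 or newer, and an Ollama that you start by hand (a systemd-managed Ollama needs OLLAMA_HOST set in its service configuration instead). Ollama listens only on 127.0.0.1 by default, so it has to be told to accept connections arriving from the Docker bridge. Binding to 0.0.0.0 also exposes it to your local network, so firewall accordingly.

```shell
# On the host: let Ollama accept connections from outside loopback.
OLLAMA_HOST=0.0.0.0 ollama serve &

# Start Flowise with host.docker.internal mapped to the host's gateway address.
docker run -d \
  --name flowise \
  -p 3000:3000 \
  --add-host=host.docker.internal:host-gateway \
  -v ~/.flowise:/root/.flowise \
  flowiseai/flowise
```

With this in place, http://host.docker.internal:11434 should work as the Ollama base URL on Linux too.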

Connecting to Ollama

Once Flowise is running, connecting a local model takes about 30 seconds:

Start Ollama if it isn't already running: ollama serve (or open the Ollama app)
Pull a model if you haven't: ollama pull llama3.2 or ollama pull mistral
In Flowise, open a new Chatflow
Drag the ChatOllama node onto the canvas
Set Base URL to http://localhost:11434 and set the model name to the one you pulled (for example, llama3.2)

Wire the ChatOllama node to a Conversation Chain node, open the chat panel (bottom right), and test it. If the chain reports a missing memory input, attach a Buffer Memory node as well. If you get a response, you're connected: no API key, no rate limits, fully offline.
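
If the chat panel stays silent, it helps to rule out the connection first. Ollama's HTTP API lists pulled models at /api/tags, so a small check like the one below tells you whether the Base URL is reachable from where you run it. This is a sketch: ollama_reachable is a hypothetical helper name, and curl is assumed to be installed.

```shell
# Succeeds if an Ollama server answers at the given base URL.
# ollama_reachable is a hypothetical helper; /api/tags is Ollama's "list local models" endpoint.
ollama_reachable() {
  # -f: treat HTTP errors as failure, -s: no progress output, --max-time: give up after 5 seconds
  curl -fs --max-time 5 "$1/api/tags" >/dev/null
}

# Typical use:
#   ollama_reachable http://localhost:11434 && echo "Ollama is up" || echo "Ollama is not reachable"
```

Run it on the same machine (or inside the same container) as Flowise, since that is the network view that matters.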

Building your first RAG chatbot

This is the workflow Flowise is best known for, and it's genuinely fast to set up once you know which nodes to use.

The node chain:

PDF File (or Text File) loader — upload your document
Recursive Character Text Splitter — chunk size 1000, overlap 200 (sensible defaults)
Ollama Embeddings — model: nomic-embed-text (pull it first: ollama pull nomic-embed-text)
In-Memory Vector Store — fine for testing
Conversational Retrieval QA Chain — this ties together your retriever and LLM, and handles conversation history automatically
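
To get a feel for what chunk size 1000 and overlap 200 mean: each new chunk starts 800 characters after the previous one (size minus overlap). The sketch below estimates the chunk count for a document of a given length. It assumes fixed-width windows, whereas the recursive splitter prefers to break on paragraph and sentence boundaries, so real counts will differ somewhat. estimate_chunks is a hypothetical helper name.

```shell
# Rough chunk count for a text of N characters, split into windows of SIZE with OVERLAP.
# Assumes fixed-width windows; the real splitter breaks on separators, so treat this as an estimate.
estimate_chunks() {
  n=$1; size=${2:-1000}; overlap=${3:-200}
  step=$((size - overlap))
  if [ "$n" -le "$size" ]; then
    echo 1
  else
    # one full first chunk, then ceil((n - size) / step) further chunks
    echo $((1 + (n - size + step - 1) / step))
  fi
}

estimate_chunks 10000   # prints 13
```

A 10,000-character document therefore means roughly 13 chunks to embed at upsert time.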

Connect left to right: the Text Splitter plugs into the PDF loader, and the loader's document output and Ollama Embeddings both feed into the Vector Store. The Vector Store's retriever output feeds into the QA Chain, and ChatOllama feeds into the QA Chain as the LLM.

Hit the Upsert button on the vector store node to index your documents. It's the database icon — easy to miss the first time. After upsert completes, open the chat and ask a question about the document.

The first time it answers a question with content actually drawn from your file, not hallucinated, you know you have a working local RAG pipeline.
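
Once the flow answers in the chat panel, it can also be called over HTTP. Flowise exposes chatflows through a prediction endpoint. The sketch below assumes Flowise on localhost:3000 with no API key configured, and the chatflow ID is a placeholder you would copy from the flow's API dialog. The helper names prediction_body and flowise_ask are made up for this example.

```shell
# Build the JSON body for a prediction request, escaping backslashes and double quotes.
# (Newlines and other control characters are not handled; this is a sketch.)
prediction_body() {
  escaped=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"question": "%s"}' "$escaped"
}

# POST a question to a chatflow: flowise_ask BASE_URL CHATFLOW_ID QUESTION
flowise_ask() {
  curl -s "$1/api/v1/prediction/$2" \
    -X POST \
    -H "Content-Type: application/json" \
    -d "$(prediction_body "$3")"
}

# Typical use (replace the placeholder with your own chatflow ID):
#   flowise_ask http://localhost:3000 <your-chatflow-id> "What does the document say about pricing?"
```

The response comes back as JSON, so piping it through a tool such as jq is a convenient way to pull out the answer text.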

Persistent vector storage with Chroma

In-memory storage resets every time Flowise restarts. For anything beyond a one-off demo, add Chroma:

docker run -d \
  --name chromadb \
  -p 8000:8000 \
  chromadb/chroma

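In Flowise, replace the In-Memory Vector Store node with Chroma, point it to http://localhost:8000, and set a collection name. Your embeddings now persist across Flowise restarts: upsert once, query indefinitely. They live inside the Chroma container, so mount a volume on it if they also need to survive the container being removed.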

If you're running Chroma in Docker alongside Flowise in Docker, make sure both containers are on the same Docker network or use host.docker.internal.
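
For the both-in-Docker case, a user-defined network is one way to do it, since containers on the same network can reach each other by container name. The commands below are a sketch: they assume neither container exists yet (remove earlier ones first with docker rm -f flowise chromadb), and flowise-net is an arbitrary network name chosen for this example.

```shell
# Create a shared network, then start both containers attached to it.
docker network create flowise-net

docker run -d \
  --name chromadb \
  --network flowise-net \
  chromadb/chroma

docker run -d \
  --name flowise \
  --network flowise-net \
  -p 3000:3000 \
  -v ~/.flowise:/root/.flowise \
  flowiseai/flowise
```

In the Chroma node, the URL then becomes http://chromadb:8000. The Chroma container needs no -p flag unless you also want to reach it from the host.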

Building an agent (tool use beyond RAG)

Flowise has a separate Agentflow canvas for multi-tool agents. The difference from chatflows: agents can decide which tools to call, not just retrieve and answer.

Useful built-in tools:

Calculator: LLM math is unreliable, so offload it
Web Browser: lets the agent visit a web page and extract information from it
Custom Tool: point it at any HTTP API you want the agent to call

An agentflow with ChatOllama + a Calculator tool + a web search tool gives you something close to a local ReAct agent. Mistral and Llama 3 handle tool use reasonably well; smaller models (under 7B) tend to struggle with multi-tool decisions.

Flowise vs. the alternatives

Tool	Interface	Language	Local model support	Best for
Flowise	Visual (nodes)	Node.js	Excellent (Ollama native)	RAG + agent prototypes
Langflow	Visual (nodes)	Python	Good	Python-first teams
n8n	Visual (workflow)	Node.js	Via HTTP nodes	General automation + some AI
Dify	Visual + hosted	Python	Good	Teams wanting a managed option
LangChain (code)	Code	Python/JS	Full control	Custom production pipelines

Flowise wins when you want Ollama integration without glue code and you're prototyping quickly. Langflow is the closest competitor — essentially the same concept but Python-native and with a slightly different node model. n8n is better at non-AI automation; its AI nodes feel bolted on compared to Flowise's purpose-built design.

For a deeper comparison in production workflow scenarios, see Flowise vs n8n vs LangGraph 2026.
