If you're already running local models with Ollama, you've probably hit a few friction points:
- CLI isn’t always ergonomic: as your model list grows, switching models and remembering parameters becomes tedious.
- Conversations get messy: different tasks end up in one thread and context becomes hard to reuse.
- Advanced capabilities feel fragmented: vision, file context, and reasoning (Thinking/Chain-of-Thought) often require extra setup.
OllaMan is built to remove that friction: a desktop chat client for Ollama that makes "connect a model and start chatting" feel effortless, stable, and fast.
What is OllaMan?
OllaMan is a desktop client made specifically for Ollama users. It provides a clean GUI to manage local models, chat with them in real time, and connect to multiple Ollama servers (macOS / Windows / Linux).
Chat is a first-class feature:
- Multi-agent (roles): create agents for different workflows
- Multi-session: keep context organized per agent
- Attachments: send files/images as context
- Thinking Mode: show collapsible reasoning for supported models
- Message operations: edit messages, regenerate AI responses, copy with one click
- Performance stats: live tokens/s, duration, total tokens, and shareable cards
1) Connect to Ollama: local or multi-server
OllaMan can connect to multiple Ollama instances, which is useful when:
- You run small models locally (offline-first)
- You host larger models on a stronger machine (LAN/remote)
- You want to separate environments (work vs personal)
Recommended flow:
- Make sure your Ollama service is running (local or remote).
- Open OllaMan and go to Settings → Servers.
- Add server details (name, URL; optionally username/password).
- Run a connection test to verify latency and health.
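The connection test boils down to a call against the Ollama HTTP API. As a minimal sketch, the `GET /api/tags` endpoint (which lists installed models) doubles as a health check: if it answers, the server is up and you can see what's available. The default port 11434 is Ollama's standard; adjust the base URL for remote servers.

```python
import json
import urllib.request

def tags_url(base_url: str) -> str:
    """Build the model-list endpoint from an Ollama server base URL."""
    return base_url.rstrip("/") + "/api/tags"

def check_server(base_url: str, timeout: float = 3.0) -> list:
    """Return the names of models installed on the server; raises on failure."""
    with urllib.request.urlopen(tags_url(base_url), timeout=timeout) as resp:
        data = json.load(resp)
    return [m["name"] for m in data.get("models", [])]

# Example (requires a running Ollama instance):
# print(check_server("http://localhost:11434"))
```

A client like OllaMan can additionally time this request to report latency alongside health.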
2) Pick a model and start chatting
On the Chat page, you can switch models from the top toolbar:
- Open the model dropdown
- Choose from your locally installed models
- The selection applies immediately to subsequent messages
OllaMan also detects capabilities automatically:
- Vision: shows the image attachment button
- Thinking: shows the Think toggle
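Capability detection can piggyback on Ollama's `POST /api/show` endpoint, which returns metadata for an installed model; recent Ollama releases include a `capabilities` list there (values such as `"vision"` and `"thinking"`). A rough sketch of mapping that to UI affordances, assuming that response shape:

```python
import json
import urllib.request

def show_model(base_url: str, model: str) -> dict:
    """POST /api/show returns metadata for an installed model."""
    req = urllib.request.Request(
        base_url.rstrip("/") + "/api/show",
        data=json.dumps({"model": model}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.load(resp)

def ui_features(show_response: dict) -> dict:
    """Map reported capabilities to UI affordances (image button, Think toggle)."""
    caps = set(show_response.get("capabilities", []))
    return {"image_button": "vision" in caps, "think_toggle": "thinking" in caps}

# Example (requires a running server):
# print(ui_features(show_model("http://localhost:11434", "llava")))
```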
3) Use Agents to turn workflows into one-click presets
An Agent is a pre-configured assistant role. Think of it as a reusable card with its own default model, system prompt, and generation parameters.
Built-in agents include:
- OllaMan: pinned default agent (cannot be removed)
- Frontend Dev: a pre-tuned agent for frontend development
To create your own agent:
- Click “+” in the left sidebar
- Set name, icon, and description
- Configure default model, system prompt, and parameters (Temperature / Top P / Top K)
- Drag to reorder
Tip: Create a small set of high-quality agents for your most common workflows, and keep names consistent so they’re easy to maintain.
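Conceptually, an agent is just a named bundle of defaults. The sketch below is illustrative only (the field names are not OllaMan's actual schema), but it shows the shape of the preset each chat inherits:

```python
from dataclasses import dataclass

@dataclass
class Agent:
    # Field names are illustrative, not OllaMan's internal schema.
    name: str
    system_prompt: str
    model: str
    temperature: float = 0.7
    top_p: float = 0.9
    top_k: int = 40

coder = Agent(
    name="Frontend Dev",
    system_prompt="You are a senior frontend engineer. Prefer concise answers.",
    model="qwen2.5-coder",
)
```

Starting a new session under an agent means copying these defaults in as the session's initial settings.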
4) Sessions: keep context clean and searchable
Each agent can have multiple independent sessions. Sessions are grouped by time:
- Today
- This Week
- Earlier
Common actions:
- New session: click "New Chat" or press Cmd+N / Ctrl+N
- Switch: click a session in the list
- Delete: hover over a session and click remove
Session titles are generated from the first message to help you quickly recognize topics.
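A title-from-first-message heuristic can be as simple as collapsing whitespace and truncating; this is a hypothetical sketch of the idea, not OllaMan's actual implementation:

```python
def session_title(first_message: str, max_len: int = 40) -> str:
    """Derive a short session title from the first user message (illustrative heuristic)."""
    title = " ".join(first_message.split())  # collapse newlines and repeated spaces
    if len(title) > max_len:
        title = title[: max_len - 1].rstrip() + "…"
    return title
```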
5) Attachments: put files and images directly into context
This is one of the most practical everyday features.
File attachments (text)
Send code, docs, logs, or configs as context:
- Supports TXT / MD / JSON / JS / TS / Python / HTML / CSS and other text formats
- Click the file card to preview full content with syntax highlighting
Great for:
- Code review
- Document understanding
- Debugging configurations
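Under the hood, text attachments generally become part of the prompt context. A hypothetical helper (the wrapping format is an assumption, not OllaMan's actual behavior) might inline the file as a fenced block so the model sees both the question and the content:

```python
def attach_text(question: str, filename: str, content: str) -> str:
    """Append file content as a fenced context block to the user's question (illustrative)."""
    lang = filename.rsplit(".", 1)[-1] if "." in filename else ""
    fence = "`" * 3
    return (
        f"{question}\n\n"
        f"Attached file `{filename}`:\n"
        f"{fence}{lang}\n{content}\n{fence}"
    )
```

This is also why attachments beat copy-paste: the content arrives intact, labeled, and clearly delimited.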
Image attachments (vision models)
When using vision-capable models (e.g., LLaVA, Llama 3.2 Vision), you can attach images:
- Formats: PNG / JPG / JPEG / GIF / WebP
- Thumbnail preview before sending, with removal support
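On the wire, Ollama's `/api/chat` accepts images as base64 strings in the message's `images` array. A minimal sketch of building such a message:

```python
import base64

def vision_message(prompt: str, image_bytes: bytes) -> dict:
    """Build an /api/chat message carrying a base64-encoded image for a vision model."""
    return {
        "role": "user",
        "content": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
    }
```

A GUI client reads the attached file, encodes it, and sends this message body transparently.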
6) HTML Code Preview: instant visual feedback for HTML snippets
When the model generates HTML code, OllaMan provides instant preview capability:
- For HTML code blocks, a Preview button appears in the top-right corner of the code block
- Click to open a preview window that renders the HTML in real-time
- Great for testing UI snippets, learning HTML/CSS, or validating generated markup
This makes it easy to visualize and iterate on generated HTML without leaving the chat interface.
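Detecting previewable blocks amounts to pulling `html`-tagged fences out of the model's markdown reply. A sketch of that extraction (a plausible approach, not necessarily OllaMan's exact parser):

```python
import re

FENCE = "`" * 3  # triple-backtick fence marker
# Match the body of each fenced block tagged `html`.
HTML_BLOCK = re.compile(FENCE + r"html\s*\n(.*?)" + FENCE, re.DOTALL)

def html_blocks(markdown_text: str) -> list:
    """Return the bodies of all fenced html blocks, ready to render in a preview pane."""
    return [m.strip() for m in HTML_BLOCK.findall(markdown_text)]
```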
7) Thinking Mode: collapsible reasoning, separated from the final answer
For models that support reasoning/chain-of-thought (e.g., DeepSeek R1, QwQ), enable Think:
- Reasoning is separated from the final output
- The reasoning block is collapsible
- Useful for complex problem solving and structured planning
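At the API level, recent Ollama releases accept a `think` flag on `/api/chat` for reasoning models, and return the chain of thought in a separate field from the final answer, which is what makes the collapsible UI possible. A sketch of the request body, assuming that flag is available in your Ollama version:

```python
def chat_request(model: str, prompt: str, think: bool) -> dict:
    """Build an /api/chat request body; `think` asks reasoning models to emit
    their chain of thought separately (supported in recent Ollama releases)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "think": think,
        "stream": False,
    }
```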
8) Session settings: tweak per chat, then optionally "save to agent"
The top-right settings panel lets you adjust session-level parameters:
- System Prompt: session-specific system prompt
- Temperature (0-2): higher is more creative
- Top P (0-1): lower is more focused
- Top K (1-100): limits candidate tokens
You can also:
- Save to Agent: persist current session settings as the agent default
- Reset to Agent Defaults: revert to the agent baseline
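These three sliders map directly onto Ollama's `options` payload, where the parameter names are `temperature`, `top_p`, and `top_k`. A minimal sketch that clamps values to the ranges shown in the panel before sending:

```python
def generation_options(temperature: float, top_p: float, top_k: int) -> dict:
    """Clamp sliders to the settings-panel ranges and emit Ollama's `options` dict
    (option names match the REST API)."""
    return {
        "temperature": min(max(temperature, 0.0), 2.0),
        "top_p": min(max(top_p, 0.0), 1.0),
        "top_k": min(max(top_k, 1), 100),
    }
```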
9) Performance stats and share cards
During generation, OllaMan shows:
- Tokens/s
- Total Tokens
- Duration
Click the metrics area to open a share card and save it as an image—handy for comparing models, quantization levels, or different machines.
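These numbers come straight from Ollama's response metadata: the final chunk reports `eval_count` (tokens generated) and `eval_duration` (in nanoseconds), so tokens/s is a simple ratio:

```python
def tokens_per_second(resp: dict) -> float:
    """Compute generation speed from Ollama's final response chunk:
    eval_count tokens over eval_duration nanoseconds."""
    return resp["eval_count"] / (resp["eval_duration"] / 1e9)

# e.g. 128 tokens generated in 2 seconds of eval time -> 64.0 tokens/s
```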
Recommended workflows (best practices)
- Create agents per task: writing, coding, translation, learning
- Keep related questions in the same session for consistent context
- Raise Temperature for creative work (copywriting, brainstorming)
- Lower Temperature for precision (debugging, factual Q&A)
- Use file attachments instead of copy-paste for stability
Closing: make Ollama truly usable in your daily workflow
Ollama makes local LLMs accessible—and OllaMan makes them practical:
- Faster model switching with capability detection
- Cleaner multi-agent / multi-session organization
- Attachments and Thinking Mode that actually fit daily use
- Visible performance metrics you can measure and share
If you're looking for a better Ollama chat client, OllaMan is worth a try.
OllaMan: https://ollaman.com/