Introduction
Part 5 of the Zero Dollar AI Assistant series: Your Personal AI Assistant via Telegram, Always On. Part 1 covers the architecture. Part 2 covers Oracle Cloud setup. Part 3 covers Ollama and model selection. Part 4 covers installing OpenClaw on Linux.
If you've followed the series to this point, you have:
- An Oracle ARM server running 24/7 at zero cost - Part 2
- Ollama with llama3.2:3b as your local model and Gemini 2.5 Flash as a free fallback - Part 3
- OpenClaw gateway running as a systemd service and a Telegram bot configured and waiting - Part 4
This final article covers the last mile — pairing your Telegram account, testing the end-to-end flow, understanding what you've actually built, and setting expectations for daily use.
Pairing Your Telegram Account
OpenClaw uses a pairing system to control who can talk to your agent. Since your bot token is effectively public once the bot exists, pairing ensures only approved accounts get responses.
Start the pairing flow:
- Open Telegram on your phone
- Search for your bot by username (e.g., @myassistant_bot)
- Tap Start
- Send any message; /start works
Your bot will reply with a pairing instruction containing an ID. On your server, run:
openclaw pairing approve telegram <id_from_telegram>
Send another message — this time the agent responds.
Hello World!
If you get no response after sending a message:
Check if Telegram polling is active:
journalctl --user -u openclaw-gateway.service -n 30 --no-pager | grep -i "telegram\|polling\|channel"
Look for [telegram] [default] starting provider in the logs. If it's not there, the bot token isn't being picked up — verify it's in your config:
openclaw config get channels.telegram.botToken
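The log check above can be wrapped in a small helper that gives a clear pass/fail result. This is a minimal sketch: the function name `telegram_provider_started` is hypothetical, and it only looks for the `starting provider` line quoted above, so adjust the pattern if your OpenClaw version logs something different.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads gateway log text on stdin and succeeds only if
# the Telegram provider start-up line mentioned above is present.
telegram_provider_started() {
    # -F matches the string literally, so the square brackets are not
    # interpreted as a regex character class.
    grep -qF '[telegram] [default] starting provider'
}

# Usage (same journalctl invocation as above):
#   journalctl --user -u openclaw-gateway.service -n 30 --no-pager \
#     | telegram_provider_started && echo "polling active" || echo "provider not started"
```

If the helper reports that the provider has not started, the bot token check above is the next thing to try.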
Testing the End-to-End Flow
Before relying on this for anything real, verify every layer is working.
Test 1 — Local model responding:
Send a simple message via Telegram:
Just reply! hello!
Expected: Response within 30-60 seconds from the local Llama model.
Test 2 — Gemini fallback working:
Temporarily set a very short timeout to force a fallback:
openclaw config set models.providers.ollama.timeoutSeconds 5
openclaw gateway restart
Send a message — it should fall through to Gemini and respond in 2-3 seconds. Then restore the timeout:
openclaw config set models.providers.ollama.timeoutSeconds 120
openclaw gateway restart
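It is easy to forget the restore step and leave the gateway with a 5-second timeout. The sketch below bundles the four commands above into one function so the restore always runs after the test message. The function name `force_fallback_test` and its two arguments are hypothetical; it uses only the `openclaw` commands already shown in this section.

```shell
#!/usr/bin/env bash
# Sketch: force the Gemini fallback for one manual test, then restore the timeout.
# Arguments (both optional, hypothetical): short timeout, normal timeout, in seconds.
force_fallback_test() {
    local short="${1:-5}" normal="${2:-120}"

    # Shorten the Ollama timeout so the local model cannot answer in time.
    openclaw config set models.providers.ollama.timeoutSeconds "$short" || return 1
    openclaw gateway restart

    echo "Timeout is ${short}s. Send a Telegram message now, then press Enter."
    read -r reply || true   # continue even if stdin is closed

    # Restore the normal value whether or not a reply arrived.
    openclaw config set models.providers.ollama.timeoutSeconds "$normal"
    openclaw gateway restart
}

# Usage:
#   force_fallback_test          # 5s during the test, 120s afterwards
```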
Test 3 — Memory persisting:
Send two messages in separate Telegram sessions:
- Session 1: My name is [your name]. Remember this.
- Close Telegram, reopen, start a new chat
- Session 2: What's my name?
The agent should recall your name from the previous session.
Understanding What You've Built
It's worth being precise about what this stack is and isn't.
What it is:
A persistent, always-on AI agent running on infrastructure you control. It has a consistent identity, maintains memory across conversations, can take actions (web search, file operations, future integrations), and is accessible from your phone via Telegram 24 hours a day without you having to open any app or start any session.
What it isn't:
A replacement for Claude.ai or ChatGPT for heavyweight tasks. The local 3B model has real limitations: on complex reasoning, long-document analysis, and nuanced writing, a frontier model is noticeably better. The Gemini fallback helps, but Gemini 2.5 Flash on the free tier has rate limits and is itself not the most capable model available.
Think of it as a capable personal assistant for everyday tasks, not a research tool for demanding work.
Real-World Performance Expectations
After running this setup daily, here's what to expect:
| Task | Model used | Response time | Quality |
|---|---|---|---|
| Quick Q&A | Llama 3.2:3b | 20-40 seconds | Good |
| Short drafts | Llama 3.2:3b | 40-90 seconds | Good |
| Web search query | Gemini fallback + Tavily | 5-10 seconds | Very good |
| Complex reasoning | Gemini fallback | 3-5 seconds | Very good |
| Long document | Gemini fallback | 5-10 seconds | Very good |
| Code review | Gemini fallback | 3-8 seconds | Very good |
The mental model shift that makes this work day-to-day: stop expecting real-time responses. Send a message, put your phone down, and come back to a response. It's asynchronous communication with a capable assistant, not a chat UI. Once you internalise that, the response times stop feeling slow.
Cost Breakdown
The whole point of this series was zero dollars. Here's the actual accounting:
| Component | Cost |
|---|---|
| Oracle ARM instance (4 OCPU / 24GB) | $0 — Always Free |
| Oracle boot volume (100GB) | $0 — within 200GB Always Free |
| Ollama + Llama 3.2:3b | $0 — open source, runs locally |
| Gemini 2.5 Flash (free tier) | $0 — 1,000 requests/day |
| Tavily search (free tier) | $0 — 1,000 searches/month |
| OpenClaw | $0 — open source |
| Total | $0/month |
The only scenario where cost appears is if you add an Anthropic API key for Claude access and don't disable the background startup context (covered in Part 4). With Ollama as the primary and Gemini as the fallback, there's no risk of unexpected charges.
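The two quotas in the table use different periods: Gemini's is per day, while Tavily's is per month, which makes Tavily the tighter limit in daily use. A rough sketch of the arithmetic follows. The function name `daily_budget` is hypothetical and the 30-day month is an assumption.

```shell
#!/usr/bin/env bash
# Sketch: turn a monthly free-tier quota into a rough per-day budget.
# The quota figure comes from the table above; the 30-day default is an assumption.
daily_budget() {
    local monthly="$1" days="${2:-30}"
    echo $(( monthly / days ))   # integer division, rounds down
}

daily_budget 1000   # Tavily: 1,000 searches/month, prints 33
```

So roughly 33 web searches per day keep you inside the Tavily free tier, well below the 1,000 Gemini requests allowed each day.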
What to Build Next
With the foundation running, a few directions worth exploring:
- Custom subagents for specific tasks
OpenClaw supports creating specialized agents — a code reviewer, a research assistant, a writing editor. Each gets its own system prompt and tool access. Once your base setup is stable, this is the most impactful thing to add.
- GitHub integration
Install the GitHub CLI properly and enable the GitHub skill:
sudo apt install gh -y
gh auth login
openclaw skills install github
This lets you query issues, review PRs, and get repository summaries via Telegram.
- Scheduling and reminders
OpenClaw supports cron-style scheduled tasks. You can set the agent to send you a daily briefing, remind you of recurring tasks, or summarise news on a schedule — all triggered automatically without you initiating anything.
- Upgrade the local model
As your comfort with the setup grows, consider pulling llama3.1:8b for better quality responses. The tradeoff is slower response times on complex queries — but with Gemini catching timeouts, the experience degrades gracefully rather than failing hard.
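For the model upgrade, a small guard avoids downloading a multi-gigabyte model that is already on disk. This is a sketch, assuming `ollama list` prints a header row followed by one model per line with the tag in the first column. The function name `has_model` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `ollama list` output on stdin and succeeds if the
# given model tag appears in the first column (the header row is skipped).
has_model() {
    awk -v model="$1" 'NR > 1 && $1 == model { found = 1 } END { exit !found }'
}

# Usage: pull the larger model only if it is not already present.
#   ollama list | has_model llama3.1:8b || ollama pull llama3.1:8b
```

Pointing OpenClaw at the new model is a separate step that depends on how you configured the provider in Part 4.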
The Bigger Picture
This series started with a simple question: Do you actually need to pay for an AI assistant?
The honest answer in 2026 is no — but with caveats. The zero-cost stack works. It's not as fast or as capable as a paid frontier model for every task. But for an always-on personal assistant that lives in your messaging app, remembers you, and handles most everyday AI tasks, the free tier is genuinely sufficient.
More importantly, you own it. The infrastructure, the configuration, the memory — it's on your server. You're not subject to pricing changes, feature removals, or terms of service updates from an AI provider. For a personal tool you'll use every day, that control has real value beyond the cost savings.
The subscription isn't always wrong. But it's no longer the only option.
This article is the fifth and final part in a five-part series:
- $0 Personal Agentic AI Assistant - Architecture
- Setting Up Free Cloud Server — VCN, ARM instances, static IPs, the gotchas
- Running Ollama on ARM — model selection, disk management, CPU inference, reality
- Installing OpenClaw on Linux — avoiding every trap
- The Complete Setup — Telegram, end-to-end testing ← you did it!
Final Thoughts
What's remarkable about this stack isn't that it works today. It's that it gets better without you doing anything.
Local models improve with every release. Oracle's free tier isn't going anywhere. Gemini's free allowance has only grown. OpenClaw adds capabilities with every update.
You've built something that appreciates over time — not a locked-in subscription that charges more as it improves, but infrastructure you control that gets more capable as the ecosystem around it matures.
That's a different relationship with AI than most people have. And it started with a free Oracle account and an afternoon.
If you have read this far, I have done a reasonable job of keeping your attention. Please leave a comment or share any corrections.