AK DevCraft

Posted on Jun 8

Telegram Integration - $0 Personal Agentic AI Assistant - Part 5

#openclaw #llm #machinelearning #ai

Introduction

This is Part 5 of the Zero Dollar AI Assistant series: your personal AI assistant via Telegram, always on. Part 1 covers the architecture, Part 2 the Oracle Cloud setup, Part 3 Ollama and model selection, and Part 4 installing OpenClaw on Linux.

If you've followed the series to this point, you have:

An Oracle ARM server running 24/7 at zero cost - Part 2
Ollama with llama3.2:3b as your local model & Gemini 2.5 Flash as a free fallback - Part 3
OpenClaw gateway running as a systemd service & a Telegram bot configured and waiting - Part 4

This final article covers the last mile — pairing your Telegram account, testing the end-to-end flow, understanding what you've actually built, and setting expectations for daily use.

Pairing Your Telegram Account

OpenClaw uses a pairing system to control who can talk to your agent. Anyone can find your bot by its username and message it (the bot token itself should stay secret), so pairing ensures only approved accounts get responses.

Start the pairing flow:

Open Telegram on your phone
Search for your bot by username (e.g., @myassistant_bot)
Tap Start
Send any message — /start works

Your bot will reply with a pairing instruction containing an ID. On your server, run:

openclaw pairing approve telegram <id_from_telegram>

Send another message — this time the agent responds.

If you get no response after sending a message:

Check if Telegram polling is active:

journalctl --user -u openclaw-gateway.service -n 30 --no-pager | grep -i "telegram\|polling\|channel"

Look for [telegram] [default] starting provider in the logs. If it's not there, the bot token isn't being picked up — verify it's in your config:

openclaw config get channels.telegram.botToken
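The log check above can be wrapped in a small helper that gives a yes/no answer. This is a minimal sketch: the function names are hypothetical, and it assumes the provider start line looks like the `[telegram] [default] starting provider` line quoted above.

```shell
# Reads gateway log text on stdin; succeeds if the Telegram
# provider start line is present (case-insensitive match).
telegram_provider_started() {
  grep -qi 'telegram.*starting provider'
}

# Scans the last 200 gateway log lines and prints a one-line diagnosis.
check_telegram_channel() {
  if journalctl --user -u openclaw-gateway.service -n 200 --no-pager \
      | telegram_provider_started; then
    echo "Telegram provider started"
  else
    echo "No provider start line found; check channels.telegram.botToken"
    return 1
  fi
}
```

Run `check_telegram_channel` on the server after a gateway restart. A non-zero exit status points back at the bot token check.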

Testing the End-to-End Flow

Before relying on this for anything real, verify every layer is working.

Test 1 — Local model responding:

Send a simple message via Telegram:

Just reply! hello!

Expected: Response within 30-60 seconds from the local Llama model.
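If the Telegram reply is slow or missing, it can help to take OpenClaw out of the loop and time the local model directly. The sketch below assumes Ollama is listening on its default port (11434) on the same machine. The helper names are illustrative and not part of Ollama or OpenClaw.

```shell
# Builds the JSON body for Ollama's /api/generate endpoint (non-streaming).
# Assumption: the prompt contains no double quotes or backslashes,
# because this sketch does no JSON escaping.
ollama_payload() {
  local model="$1" prompt="$2"
  printf '{"model":"%s","prompt":"%s","stream":false}' "$model" "$prompt"
}

# Sends the prompt straight to Ollama and reports wall-clock seconds.
time_local_model() {
  local start end
  start=$(date +%s)
  curl -fsS http://localhost:11434/api/generate \
    -d "$(ollama_payload "llama3.2:3b" "$1")" >/dev/null || return 1
  end=$(date +%s)
  echo "local model answered in $((end - start))s"
}
```

Calling `time_local_model "hello"` on the server gives a baseline. If that number is far below what you see in Telegram, the delay is likely in the gateway rather than the model.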

Test 2 — Gemini fallback working:

Temporarily set a very short timeout to force a fallback:

openclaw config set models.providers.ollama.timeoutSeconds 5
openclaw gateway restart

Send a message. Once the 5-second timeout expires, it should fall through to Gemini, which typically responds within 2-3 seconds. Then restore the timeout:

openclaw config set models.providers.ollama.timeoutSeconds 120
openclaw gateway restart
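It is easy to forget the restore step and leave the local model effectively disabled. One way to guard against that is to wrap both steps in a single function. This is a sketch that uses only the two `openclaw` commands shown above; the function name and its arguments are hypothetical.

```shell
# Forces a fallback by shortening the Ollama timeout, waits for you to
# send a test message, then restores the normal timeout.
# Usage: force_fallback_test [short_seconds] [normal_seconds]
force_fallback_test() {
  local short="${1:-5}" normal="${2:-120}" reply
  openclaw config set models.providers.ollama.timeoutSeconds "$short" || return 1
  openclaw gateway restart
  echo "Timeout is now ${short}s. Send a Telegram message, then press Enter."
  read -r reply || true   # continue even if stdin is closed
  openclaw config set models.providers.ollama.timeoutSeconds "$normal"
  openclaw gateway restart
}
```

Running `force_fallback_test` with no arguments reproduces the 5-second and 120-second values used in this test.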

Test 3 — Memory persisting:

Send two messages in separate Telegram sessions:

Session 1: My name is [your name]. Remember this.
Close Telegram, reopen, start a new chat
Session 2: What's my name?

The agent should recall your name from the previous session.

Understanding What You've Built

It's worth being precise about what this stack is and isn't.

What it is:

A persistent, always-on AI agent running on infrastructure you control. It has a consistent identity, maintains memory across conversations, can take actions (web search, file operations, future integrations), and is accessible from your phone via Telegram 24 hours a day without you having to open any app or start any session.

What it isn't:

A replacement for Claude.ai or ChatGPT for heavyweight tasks. The local 3B model has real limitations: on complex reasoning, long-document analysis, and nuanced writing, a frontier model is noticeably better. The Gemini fallback helps, but Gemini 2.5 Flash on the free tier has rate limits and is itself not the most capable model available.

Think of it as a capable personal assistant for everyday tasks, not a research tool for demanding work.

Real-World Performance Expectations

After running this setup daily, here's what to expect:

Task	Model used	Response time	Quality
Quick Q&A	Llama 3.2:3b	20-40 seconds	Good
Short drafts	Llama 3.2:3b	40-90 seconds	Good
Web search query	Gemini fallback + Tavily	5-10 seconds	Very good
Complex reasoning	Gemini fallback	3-5 seconds	Very good
Long document	Gemini fallback	5-10 seconds	Very good
Code review	Gemini fallback	3-8 seconds	Very good

The mental model shift that makes this work day-to-day: stop expecting real-time responses. Send a message, put your phone down, and come back to a response. It's asynchronous communication with a capable assistant, not a chat UI. Once you internalise that, the response times stop feeling slow.

Cost Breakdown

The whole point of this series was zero dollars. Here's the actual accounting:

Component	Cost
Oracle ARM instance (4 OCPU / 24GB)	$0 — Always Free
Oracle boot volume (100GB)	$0 — within 200GB Always Free
Ollama + Llama 3.2:3b	$0 — open source, runs locally
Gemini 2.5 Flash (free tier)	$0 — 1,000 requests/day
Tavily search (free tier)	$0 — 1,000 searches/month
OpenClaw	$0 — open source
Total	$0/month

The only scenario where cost appears is if you add an Anthropic API key for Claude access and don't disable the background startup context (covered in Part 4). With Ollama as the primary and Gemini as the fallback, there's no risk of unexpected charges.

What to Build Next

With the foundation running, a few directions worth exploring:

Custom subagents for specific tasks

OpenClaw supports creating specialized agents — a code reviewer, a research assistant, a writing editor. Each gets its own system prompt and tool access. Once your base setup is stable, this is the most impactful thing to add.

GitHub integration

Install the GitHub CLI properly and enable the GitHub skill:

sudo apt install gh -y
gh auth login
openclaw skills install github

This lets you query issues, review PRs, and get repository summaries via Telegram.

Scheduling and reminders

OpenClaw supports cron-style scheduled tasks. You can set the agent to send you a daily briefing, remind you of recurring tasks, or summarise news on a schedule — all triggered automatically without you initiating anything.

Upgrade the local model

As your comfort with the setup grows, consider pulling llama3.1:8b for better quality responses. The tradeoff is slower response times on complex queries — but with Gemini catching timeouts, the experience degrades gracefully rather than failing hard.
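Before pulling a larger model, it is worth confirming there is disk headroom, since the 8B download is several gigabytes. The sketch below is one way to do that. The 10 GB threshold and the assumption that models are stored on the root filesystem are mine, so adjust both to your setup. `ollama pull` and `ollama list` are standard Ollama commands; the helper names are hypothetical.

```shell
# Succeeds if the filesystem holding the given path has at least MIN_GB free.
has_free_gb() {
  local path="$1" min_gb="$2" free_kb
  # -P forces one line per filesystem; column 4 is available space in KB.
  free_kb=$(df -Pk "$path" | awk 'NR==2 {print $4}')
  [ "$free_kb" -ge $((min_gb * 1024 * 1024)) ]
}

# Pulls the 8B model only when there is headroom, then lists installed models.
pull_larger_model() {
  local store="${1:-/}"   # assumption: models live on the root filesystem
  if ! has_free_gb "$store" 10; then
    echo "Less than 10 GB free on $store; free up space first" >&2
    return 1
  fi
  ollama pull llama3.1:8b && ollama list
}
```

After the pull, the new model still has to be selected in your OpenClaw configuration, using the same setting where llama3.2:3b was chosen earlier in the series.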

The Bigger Picture

This series started with a simple question: Do you actually need to pay for an AI assistant?

The honest answer in 2026 is no — but with caveats. The zero-cost stack works. It's not as fast or as capable as a paid frontier model for every task. But for an always-on personal assistant that lives in your messaging app, remembers you, and handles most everyday AI tasks, the free tier is genuinely sufficient.

More importantly, you own it. The infrastructure, the configuration, the memory — it's on your server. You're not subject to pricing changes, feature removals, or terms of service updates from an AI provider. For a personal tool you'll use every day, that control has real value beyond the cost savings.

The subscription isn't always wrong. But it's no longer the only option.

This article is the fifth and final part in a five-part series:

$0 Personal Agentic AI Assistant - Architecture
Setting Up Free Cloud Server — VCN, ARM instances, static IPs, the gotchas
Running Ollama on ARM — model selection, disk management, CPU inference, reality
Installing OpenClaw on Linux — avoiding every trap
The Complete Setup — Telegram, end-to-end testing ← you did it!

Final Thoughts

What's remarkable about this stack isn't that it works today. It's that it gets better without you doing anything.
Local models improve with every release. Oracle's free tier isn't going anywhere. Gemini's free allowance has only grown. OpenClaw adds capabilities with every update.
You've built something that appreciates over time — not a locked-in subscription that charges more as it improves, but infrastructure you control that gets more capable as the ecosystem around it matures.
That's a different relationship with AI than most people have. And it started with a free Oracle account and an afternoon.

If you have reached this point, I have made a satisfactory effort to keep you reading. Please be kind enough to leave any comments or share any corrections.

DEV Community