DEV Community

TechLatest
Originally published at faun.pub

Your AI on WhatsApp — Fully Local, Powered by Gemma

Build a personal AI assistant that answers on Telegram/WhatsApp/CLI using Gemma 4 E2B and delegates research-heavy questions to your local Agentic RAG API.

What you end up with

  1. OpenClaw Gateway — always-on control plane (daemon)
  2. gemma4:e2b — conversational model with tools + optional vision
  3. agentic-rag skill — shells out to rag_query.sh → POST /predict on LitServe
  4. qwen-agentic-rag — CrewAI Researcher + Writer + Qdrant (and optional Firecrawl)

This integration uses one Ollama model everywhere: gemma4:e2b for OpenClaw chat and for the CrewAI RAG agents.

Deploy OpenClaw Without the Setup Hassle

Want to skip the installation and configuration process? We provide a fully managed OpenClaw AI Agent Automation Stack on AWS, Azure, and Google Cloud, complete with OpenClaw, Ollama, dependencies, and optional GPU acceleration already configured. Simply launch the VM and start building AI agents, automation workflows, and local LLM applications immediately. The environment is optimized for performance, securely isolated from your local machine, and designed to get you from deployment to productivity in minutes.

Prerequisites

| Requirement | Check |
|-------------|--------|
| Node **22.12+** or **24** (OpenClaw will not run on Node 20) | `node -v` |
| Ollama | `ollama -v` |
| Python 3.10+ | `python3 --version` |
| curl + jq | `curl --version && jq --version` |
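
The table can also be checked in one pass. The following is a minimal sketch, not part of the guide's repository: it only confirms that each tool is on PATH and does not compare versions. The tool list mirrors the table, and the function name `missing_tools` is hypothetical.

```python
import shutil

# Tools named in the prerequisites table; adjust the list if your setup differs.
REQUIRED_TOOLS = ["node", "ollama", "python3", "curl", "jq"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All prerequisite tools found on PATH.")
```

A tool being present does not mean its version is new enough, so the individual version commands in the table are still worth running.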

Part 1 — Agentic RAG API

If you already finished the Qwen Agentic RAG tutorial, start the server only:

ollama pull gemma4:e2b
cd guides/qwen-agentic-rag
source .venv/bin/activate
cp ../openclaw-gemma-rag/env.rag.example .env # sets OLLAMA_MODEL=ollama/gemma4:e2b
# First time only:
# pip install -r requirements.txt && python setup_vectordb.py
python server.py

The server listens on http://127.0.0.1:8001 by default; the port comes from PORT in .env.

Verify:

python client.py --query "What is cross-validation?"
# or
curl -sS -X POST http://127.0.0.1:8001/predict \
  -H 'Content-Type: application/json' \
  -d '{"query":"What is cross-validation?"}' | jq -r .output

Keep this terminal open. The first crew run may take several minutes.
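
If you would rather call the API from Python than from curl, the same request could look roughly like this. It is a sketch using only the standard library, not the guide's client.py. It assumes the request and response shapes implied by the curl command above (a JSON body with `query`, a JSON reply with `output`). The function names and the 600-second timeout are illustrative choices, the latter because the first crew run can take minutes.

```python
import json
import urllib.request


def build_predict_request(query, base_url="http://127.0.0.1:8001"):
    """Build the POST /predict request that the curl command sends."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_output(raw_body):
    """Return the `output` field, the same field `jq -r .output` selects."""
    return json.loads(raw_body)["output"]


def ask(query, base_url="http://127.0.0.1:8001", timeout=600):
    """Send one question to the RAG API and return the answer text."""
    request = build_predict_request(query, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_output(response.read())
```

With the server running, `ask("What is cross-validation?")` should return the same text the curl pipeline prints.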

Part 2 — Pull Gemma 4 E2B

ollama pull gemma4:e2b
ollama run gemma4:e2b "Reply in one sentence: what is Gemma 4?"

Recommended sampling (Ollama may already apply defaults): temperature=1, top_p=0.95, top_k=64.
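
To set these values explicitly rather than rely on defaults, you can pass them in the `options` field of Ollama's `/api/generate` endpoint. The sketch below assumes Ollama is listening on its default address, http://127.0.0.1:11434. The helper names and the 300-second timeout are illustrative and not taken from the guide.

```python
import json
import urllib.request

# Sampling values recommended in the text above.
GEMMA_SAMPLING = {"temperature": 1.0, "top_p": 0.95, "top_k": 64}


def build_generate_payload(prompt, model="gemma4:e2b", options=GEMMA_SAMPLING):
    """Build a non-streaming /api/generate payload with explicit sampling options."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": dict(options),  # copy so callers cannot mutate the defaults
    }


def generate(prompt, base_url="http://127.0.0.1:11434", timeout=300):
    """Send the prompt to a local Ollama server and return the generated text."""
    request = urllib.request.Request(
        base_url.rstrip("/") + "/api/generate",
        data=json.dumps(build_generate_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]
```

This only affects direct calls to Ollama; whether OpenClaw or the CrewAI agents pick up the same values depends on their own configuration.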

Part 3 — Install OpenClaw

Node version (required)

OpenClaw needs Node >= 22.12. If node -v shows v20, switch with nvm (you may already have 22 installed):

cd guides/openclaw-gemma-rag
source ./use-node22.sh # uses .nvmrc → 22.22.3
node -v # must be v22.12.0 or higher
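
The version requirement can be expressed as a small check. This is a sketch under one assumption: it applies a plain "at least 22.12.0" comparison to the string printed by `node -v`, so it also accepts releases the prerequisites table does not list explicitly. The function names are hypothetical.

```python
import re

# Minimum Node version stated in this guide.
MIN_NODE = (22, 12, 0)


def parse_node_version(text):
    """Turn a `node -v` string such as 'v22.12.0' into a (major, minor, patch) tuple."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized Node version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def node_is_supported(text, minimum=MIN_NODE):
    """Return True when the version in `text` is at least `minimum`."""
    return parse_node_version(text) >= minimum
```

Tuples compare element by element, so `(22, 11, 9)` sorts below `(22, 12, 0)` without any string tricks. To check the installed version, pass in the output of `subprocess.run(["node", "-v"], capture_output=True, text=True).stdout`.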

Optional — make Node 22 the default in new terminals:

nvm alias default 22

Install OpenClaw globally, then run onboarding with the daemon option:

npm install -g openclaw@latest
openclaw onboard --install-daemon

Follow the prompts for workspace, auth, and optional channels. See the OpenClaw Getting started guide for details.

Set the primary model:

export OLLAMA_API_KEY="ollama-local"
openclaw models list --provider ollama
openclaw models set ollama/gemma4:e2b

Config snippet

Copy fields from config/openclaw.snippet.json5 in this guide into ~/.openclaw/openclaw.json.

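Critical points: the Ollama provider entry must use api: "ollama", and its baseUrl must not end in /v1. The Troubleshooting table lists both as the fix for tool calling errors.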

Restart the gateway and confirm it is running:

openclaw gateway restart
openclaw gateway status

Part 4 — Install the agentic-rag skill

From this guide directory:

cd guides/openclaw-gemma-rag
chmod +x install-skill.sh skills/agentic-rag/scripts/*.sh
./install-skill.sh

This copies the skill to ~/.openclaw/workspace/skills/agentic-rag/.

Alternative (if your CLI supports it):

openclaw skills install ./guides/openclaw-gemma-rag/skills/agentic-rag --global

Enable the skill in ~/.openclaw/openclaw.json:

{
  skills: {
    entries: {
      "agentic-rag": {
        enabled: true,
        env: { RAG_API_URL: "http://127.0.0.1:8001" },
      },
    },
  },
}

Optional allowlist so only this skill is injected:

{
  agents: {
    defaults: {
      skills: ["agentic-rag"],
    },
  },
}

Restart the gateway after skill or config changes.

Skill behavior

The skill teaches OpenClaw to run:

~/.openclaw/workspace/skills/agentic-rag/scripts/rag_query.sh "user question"

The script POSTs the question to the LitServe /predict endpoint and prints the crew's answer. The Gemma model decides when to call the skill; the RAG crew uses the same model, set as OLLAMA_MODEL=ollama/gemma4:e2b in guides/qwen-agentic-rag/.env (see env.rag.example).

Part 5 — End-to-end test

CLI (no channel)

openclaw agent --message "Using the agentic RAG knowledge base: explain cross-validation in 3 bullets." --thinking low

Watch the gateway logs — you should see an exec invoking rag_query.sh.

Manual script test (run from guides/openclaw-gemma-rag)

export RAG_API_URL=http://127.0.0.1:8001
./skills/agentic-rag/scripts/rag_query.sh "What is regularization?"

Health check

./skills/agentic-rag/scripts/rag_health.sh
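
The contents of rag_health.sh are not shown in this guide, so the following is not a port of it. It is a hedged sketch of the simplest useful health signal: whether anything accepts TCP connections at the RAG API address. The function names are hypothetical, and an open port does not prove the crew itself is working.

```python
import socket
from urllib.parse import urlparse


def host_and_port(url):
    """Extract (host, port) from a URL, falling back to the scheme's default port."""
    parsed = urlparse(url)
    host = parsed.hostname or "127.0.0.1"
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    return host, port


def rag_api_reachable(url="http://127.0.0.1:8001", timeout=2.0):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    try:
        with socket.create_connection(host_and_port(url), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False
```

A False result here corresponds to the `connection refused` row in the troubleshooting table: start `python server.py` and try again.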

Part 6 — Connect a channel (optional)

Example: Telegram

  1. Create a bot via @BotFather
  2. During openclaw onboard or openclaw configure, add the Telegram channel token
  3. Keep DM pairing enabled (dmPolicy: "pairing") until you trust exposure
  4. Approve yourself: openclaw pairing approve telegram

Send: “Search the ML FAQ: what is gradient descent?”

Flow: Telegram → Gateway → Gemma → agentic-rag skill → RAG API → reply on Telegram.

Channel docs: OpenClaw Channels.

Security checklist

  • Treat inbound DMs as untrusted — keep pairing on for production-adjacent setups
  • exec (used by the RAG skill) is powerful. Do not expose the gateway to the public internet without following the Security and Exposure runbook
  • Run openclaw doctor after config changes
  • RAG API binds to localhost by default — keep it that way

Troubleshooting

| Symptom | Fix |
|---------|-----|
| `connection refused` on :8001 | Start `python server.py` in qwen-agentic-rag |
| RAG very slow | Normal on laptop; reduce parallel Ollama loads |
| OpenClaw ignores RAG | Confirm skill installed, `enabled: true`, gateway restarted; ask explicitly to "use agentic RAG" |
| `ollama/gemma4:e2b` not found | `ollama pull gemma4:e2b`; check `openclaw models list` |
| Tool calling errors | Ensure `api: "ollama"` and no `/v1` on baseUrl |
| `openclaw requires Node >=22.12.0` | Run `source guides/openclaw-gemma-rag/use-node22.sh` or `nvm use 22` before any `openclaw` command |
| OOM on 16GB Mac | Only run `gemma4:e2b`; quit other Ollama models (`ollama ps`) |
| Skill `curl` fails | `brew install jq` or `apt install jq` |
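
The tool calling row in the table can be checked mechanically. The sketch below assumes you have already extracted the Ollama provider settings into a plain dict with `api` and `baseUrl` keys. How that entry is nested inside openclaw.json is not shown in this guide, so the lookup is left to you and the function name is hypothetical.

```python
def provider_config_problems(provider):
    """Return a list of problems with an Ollama provider entry (empty if none)."""
    problems = []
    if provider.get("api") != "ollama":
        problems.append('api should be "ollama"')
    # Ignore trailing slashes so ".../v1/" is caught as well as ".../v1".
    base_url = str(provider.get("baseUrl", "")).rstrip("/")
    if base_url.endswith("/v1"):
        problems.append("baseUrl should not end in /v1")
    return problems
```

For example, an entry with `api: "ollama"` and `baseUrl: "http://127.0.0.1:11434"` (Ollama's default address) yields an empty list, while a baseUrl ending in `/v1` is flagged.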

What’s next

  • Add your own documents in guides/qwen-agentic-rag/rag_code.py and re-run setup_vectordb.py
  • Publish a second OpenClaw skill for Gradio (ui.py) health checks
  • Route work vs personal agents with multi-agent routing

Summary

| Component | You run |
|-----------|---------|
| Ollama | `gemma4:e2b` (chat + RAG) |
| RAG | `guides/qwen-agentic-rag/server.py` |
| OpenClaw | `openclaw gateway` (daemon) |
| Skill | `agentic-rag` → `rag_query.sh` → `/predict` |

You now have a local-first assistant: Gemma for conversation, CrewAI RAG for grounded ML research — no cloud LLM required for either layer.
