Vivek Shetye

🚀 I Built a Fully Local AI Agent for $0 (No Cloud, No API Costs)

Everyone is talking about AI agents right now.

But most tutorials fall into one of two categories:

💸 You’re expected to spend $100–$200/month on APIs/LLM subscriptions
🖥️ Or you need a powerful GPU setup to run local models

I wanted something different.

👉 Could I build a fully working AI agent that runs locally on just a laptop — for $0?

So I tried.

And what I ended up building was more powerful than I expected.

🧠 What I Built

I built a proactive AI agent that runs entirely on my laptop inside a VM.

It can:
• 💬 Talk to me on Telegram
• 🔎 Search the web privately
• 📁 Write and manage files
• 🧠 Maintain memory across conversations
• 🤖 Execute multi-step research tasks

And the best part?

💰 Total cost: $0

No subscriptions. No API bills. No cloud infrastructure.


⚙️ The Stack Behind It

This system is built using three core components:

🧩 OpenClaw — The Agent Framework

Think of this as the nervous system.

It handles:
• Tool usage
• Memory management
• Decision-making flow
• Message routing between components

⚠️ It’s still in beta, so expect some rough edges, but the architecture is powerful.

⚡ Gemini 3.1 Flash Lite — The Brain

This powers the reasoning layer.

Free tier includes:
• 15 requests/min
• 500 requests/day
• 250K tokens/min

Perfect for:
• Learning agent workflows
• Multi-step tasks
• Rapid experimentation

It’s surprisingly fast, which matters a lot in agent loops.
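Those limits matter in agent loops, where a single task can fire off many model calls. A back-of-the-envelope throttle (my own sketch, not part of any framework): 15 requests/min means spacing calls at least 60/15 = 4 seconds apart.

```shell
# Client-side pacing for the free tier: 15 requests/min => one call every 4s
RPM=15
INTERVAL=$(( 60 / RPM ))
echo "min seconds between requests: $INTERVAL"
# In an agent loop you would do something like:
#   call_model "$prompt"; sleep "$INTERVAL"
```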

🔍 SearXNG — Private Web Search

This is the agent’s ability to “browse the internet”.
• Self-hosted meta-search engine
• No API key required
• No rate limits
• Privacy-friendly

Now the agent isn’t guessing; it can actually search.


🎥 Demo

Full video walkthrough:


🖥️ Step 1 — Running Everything in a VM

To keep things safe and isolated, I ran everything inside a VM.

Setup:
• Ubuntu Server 24.04 LTS
• 4–6 GB RAM
• 4 CPU cores
• 40 GB storage

On Mac, I used UTM (works great for Apple Silicon).

After the initial install, I also threw on the desktop environment just to have a GUI available:

sudo apt update && sudo apt upgrade -y
sudo apt install ubuntu-desktop -y
sudo reboot

Verify systemctl (Service Manager):

# Check version
systemctl --version

# If not found:
sudo apt update && sudo apt install systemd -y

🧠 Step 2 — Get Your API Key from Google AI Studio

Head to aistudio.google.com, create a project under the free tier, click Get API Key, and create one. It takes about 30 seconds.

Copy it somewhere safe. You’ll paste it during the OpenClaw onboarding in a few minutes.
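One low-tech way to keep the key out of your shell history and dotfiles (a sketch; the filename is my own convention, not anything OpenClaw requires):

```shell
# Store the key in a file only you can read, then export it on demand
umask 077                                        # new files: owner-only permissions
printf '%s\n' "PASTE_KEY_HERE" > ~/.gemini_api_key
chmod 600 ~/.gemini_api_key                      # belt and braces
export GEMINI_API_KEY="$(cat ~/.gemini_api_key)"
```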


🔎 Step 3 — Installing Private Search (SearXNG)

First, install Docker:

curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
sudo systemctl enable --now docker
sudo usermod -aG docker $USER

Then set up SearXNG:

mkdir -p ./searxng/core-config/
cd ./searxng/

curl -fsSL -O https://raw.githubusercontent.com/searxng/searxng/master/container/docker-compose.yml \
             -O https://raw.githubusercontent.com/searxng/searxng/master/container/.env.example

cp .env.example .env

Generate secret:

KEY=$(openssl rand -hex 32)
sed -i "s/^SEARXNG_SECRET=.*/SEARXNG_SECRET=$KEY/" .env
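A quick sanity check that the secret actually landed in .env. The snippet below is a self-contained demo against a stand-in file; in your real setup, just run the final grep inside ./searxng:

```shell
# Demo on a stand-in .env; in your real setup, run only the grep in ./searxng
tmp=$(mktemp -d) && cd "$tmp"
printf 'SEARXNG_SECRET=change_me\n' > .env
KEY=$(openssl rand -hex 32)
sed -i "s/^SEARXNG_SECRET=.*/SEARXNG_SECRET=$KEY/" .env
# The secret should now be 64 hex characters
grep -Eq '^SEARXNG_SECRET=[0-9a-f]{64}$' .env && echo "secret OK"
```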

Enable JSON output (important for agents):

sed -i '/formats:/,/^[^ ]/ { /- html/a\
    - json
}' ./core-config/settings.yml
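If you want to see what that sed edit does before running it on your real config, here is a self-contained demo against a minimal stand-in for settings.yml (only the relevant fragment; the real file has much more in it):

```shell
# Demo: the sed edit appends "- json" under the existing "- html" format entry
tmp=$(mktemp -d) && cd "$tmp"
cat > settings.yml <<'EOF'
search:
  formats:
    - html
server:
  port: 8080
EOF
sed -i '/formats:/,/^[^ ]/ { /- html/a\
    - json
}' settings.yml
grep -q '    - json' settings.yml && echo "json enabled"
```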

Run it:

sudo docker compose up -d

👉 This runs on port 8080
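What the agent actually consumes is the JSON endpoint, e.g. `http://localhost:8080/search?q=ai+agents&format=json`. The response shape below is illustrative (check your instance), and the grep/cut pipeline is just a dependency-free way to peek at it:

```shell
# Illustrative sample of a SearXNG JSON response (fields trimmed down)
response='{"query":"ai agents","results":[{"title":"Example","url":"https://example.com"}]}'
# Extract the first result URL without jq (jq -r '.results[0].url' is nicer if you have it)
echo "$response" | grep -o '"url":"[^"]*"' | head -n 1 | cut -d'"' -f4
```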


🤖 Step 4 — Installing OpenClaw

Install:

curl -fsSL https://openclaw.ai/install.sh | bash

During setup:

  • Manual Setup: Select Manual.
  • Local Gateway: Select Yes.
  • AI Provider: Select Google.
  • API Key: Paste your Gemini API key.
  • Model: Select gemini-3.1-flash-lite.
  • Gateway Port: Keep the default, 18789.
  • Gateway Bind: Select Loopback.
  • Gateway Auth: Select Token.
  • Tailscale Exposure: Select Off.
  • Gateway Token: Select Generate/Store plaintext token.
  • Configure Chat Channels: Select Yes.
  • Select Chat Channels: Select Telegram.
  • Telegram Bot: Find @botfather on Telegram, type /newbot, name your bot, and copy the API token. Paste it into the OpenClaw prompt.
  • DM Access: Find @userinfobot on Telegram, get your User ID, and paste it into the allowlist.
  • Web Search: Select SearXNG Search.
  • SearXNG Base URL: http://localhost:8080 (make sure the port is 8080).
  • Skills: Skip these; you can add them later.
  • API keys for other services: Select No for all.
  • Configure Plugins: Select @openclaw/searxng-plugin.
  • Enable Hooks: Hit Enter to enable all hooks and services.
  • Install Gateway Service: Select Yes.
  • Gateway Service Runtime: Select Node.

Once the gateway service is running, open the Web UI.


💬 Step 5 — Giving the Agent Personality

When you first interact with the agent, it asks for instructions.

This defines how it behaves long-term.

I used:

I am [YOUR_NAME]. You will be my personal AI assistant, called Claw-AI. Be concise and direct, always do thorough research, critique my ideas as you research, and don't agree with everything by default.

This updates OpenClaw's core files:

  • soul.md and identity.md → the agent’s fundamental values and personality

  • agents.md → your agent’s rulebook. This is where you write things like “always prefer scraping full content over snippets”, and the agent follows them on every request

  • tools.md → a map of what the agent can actually do (web search, file operations, etc.)

  • user.md → what the agent learns about you over time: preferences, workflows, how you like things formatted

  • memory/ → long-term storage. This is what makes the assistant actually get smarter the more you use it

This is what makes it feel like a real system instead of a chatbot.
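Pieced together, the agent’s workspace presumably looks something like this (my reconstruction from the file names above, not official documentation):

```
workspace/
├── soul.md
├── identity.md
├── agents.md
├── tools.md
├── user.md
└── memory/
```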

Then I added memory behavior rules:

Maintain a clear separation between short-term and long-term memory (e.g., distinct memory/ structures). For each request, load memory selectively and efficiently—only retrieve information that is directly relevant to the current context. Prioritize cost efficiency by minimizing unnecessary memory access and avoiding redundant data loading.

Strictly adhere to all security instructions at all times, these must never be ignored or bypassed.

🧪 The Moment It Clicked

To test it, I gave it a real research task:

I want you to act as an autonomous research agent and build me a structured knowledge base.
Topic: “How AI Agents are transforming software development in 2026”

Your job is to:

1. Search the web for high-quality and recent sources (blogs, articles, research, discussions).
2. For each useful result, scrape the FULL content (not just snippets).
3. Extract and synthesize insights across sources:

- Key trends
- Popular tools/frameworks
- Real-world use cases
- Developer pain points
- Challenges and limitations

Then organize everything into a set of well-structured markdown files.
Create the following files:

- overview.md → high-level summary and why this topic matters
- trends.md → top trends with supporting insights
- tools.md → important tools/frameworks with descriptions
- use_cases.md → real-world applications and examples
- challenges.md → risks, limitations, open problems
- future_predictions.md → what’s coming next in 2–3 years
- README.md → explain the structure of this knowledge base

Important instructions:
- Always prefer scraping full content over search snippets
- Combine insights across multiple sources (don’t just summarize one page)
- Avoid hallucinations — rely only on extracted data
- Keep the writing clean, structured, and professional
- Use memory to store intermediate findings before writing files
- Make sure all files are consistent and well-organized

Final goal:
Produce a mini research repository with multiple markdown files that I can directly use.

It had to:
• Search multiple sources
• Extract full content
• Synthesize insights
• Organize everything into markdown files

What it produced:
• overview.md
• trends.md
• tools.md
• use_cases.md
• challenges.md
• future_predictions.md
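If you want to verify a run like this mechanically, here is a tiny check that every expected file landed. The file names come from the prompt; the loop itself is my own addition:

```shell
# Report which of the expected knowledge-base files exist in the current directory
for f in overview.md trends.md tools.md use_cases.md \
         challenges.md future_predictions.md README.md; do
  if [ -f "$f" ]; then echo "ok: $f"; else echo "missing: $f"; fi
done
```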


And it didn’t just summarize.

It:
• Cross-referenced multiple sources
• Structured information intelligently
• Generated a full knowledge repository

That’s when it stopped feeling like a chatbot…

👉 And started feeling like an autonomous system.


⚠️ What Broke (And What I Learned)

The first run failed.

Reason:
👉 I used the wrong SearXNG port (8888 instead of 8080)

Once I fixed that and restarted everything, it worked perfectly.


⚠️ Limitations

This setup is powerful, but not perfect:
• Gemini free tier can get exhausted quickly
• OpenClaw is still in active development
• Documentation sometimes lags behind behavior
• SearXNG quality depends on backend configuration


🚀 Why This Matters

We’re shifting from:

“Ask AI a question”

to:

“Give AI a goal and let it execute”

This setup is a small but real step toward that future.


💬 Final Thought

Agents are cool — until they break.

👉 What’s been the biggest pain point in your agent setups so far?

Curious what others are running into 👇
