Aditya

I built GHOST — an AI agent that actually fixes your slow laptop using Gemma 4 locally

This is a submission for the Gemma 4 Challenge: Build with Gemma 4


What I Built

GHOST (Gemma Hardware Optimization & System Tuner) is a local AI agent that runs on your machine, watches every process in real time, figures out why your laptop is slow, and actually fixes it — automatically.

Most "AI optimizer" tools give you a report and ask you to do the work. GHOST does the work. It runs a continuous Sense → Think → Act → Verify → Rollback loop:

  1. SENSE — A Go daemon reads per-core CPU, per-process RAM, thermals, battery drain, and the active window title every 5 seconds
  2. THINK — Gemma 4 receives the last 10 minutes of telemetry (the full 128K context window) and identifies the root cause, not just the symptom
  3. ACT — It executes the fix: suspending rogue background processes, lowering priorities, flushing caches
  4. VERIFY — 60 seconds later it re-measures. Did CPU drop? Did RAM free up?
  5. ROLLBACK — If metrics didn't improve, every action is automatically reversed. Zero risk.
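The SENSE step can be sketched as a ticker-driven sampling loop feeding a bounded window. This is a minimal sketch under assumptions, not GHOST's actual code: the real daemon samples via gopsutil, while here the sampler is an injected stub so the loop shape stays self-contained, and the `Snapshot` fields are illustrative.

```go
package main

import (
	"fmt"
	"time"
)

// Snapshot is a hypothetical telemetry record; GHOST's real struct
// also carries per-process RAM, thermals, battery, and window title.
type Snapshot struct {
	Taken  time.Time
	CPUAvg float64
	RAMPct float64
}

// appendBounded keeps only the newest `keep` snapshots, giving a
// fixed-length analysis window (10 min at 5s ticks = 120 entries).
func appendBounded(buf []Snapshot, s Snapshot, keep int) []Snapshot {
	buf = append(buf, s)
	if len(buf) > keep {
		buf = buf[len(buf)-keep:]
	}
	return buf
}

// senseLoop samples telemetry on every tick until stop is closed,
// then returns the accumulated window.
func senseLoop(sample func() Snapshot, interval time.Duration, keep int, stop <-chan struct{}) []Snapshot {
	var buf []Snapshot
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-stop:
			return buf
		case <-ticker.C:
			buf = appendBounded(buf, sample(), keep)
		}
	}
}

func main() {
	stop := make(chan struct{})
	// Stub sampler; the real one would call gopsutil here.
	fake := func() Snapshot { return Snapshot{Taken: time.Now(), CPUAvg: 42, RAMPct: 70} }
	go func() { time.Sleep(60 * time.Millisecond); close(stop) }()
	window := senseLoop(fake, 10*time.Millisecond, 3, stop)
	fmt.Println("window size:", len(window)) // at most 3
}
```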

Beyond reactive fixes, GHOST also:

  • Predicts slowdowns before they happen — after learning your machine's patterns, it warns you 10-15 minutes in advance
  • Builds a Machine Persona — a behavioral fingerprint after 7 days of monitoring: your peak load hours, primary offenders, battery health prognosis
  • Writes a Weekly Letter — every Monday, Gemma narrates your machine's week in plain English: what it fixed, what to watch, how much battery life is left

The stack is Go (backend agent) + Electron + React (desktop UI), with Gemma 4 running 100% locally via Ollama. Your process list, telemetry, and usage patterns never leave your machine.


Demo

Here's what GHOST looks like running on a real machine under load:

Dashboard — live metrics catching a problem in real time

Zoom.exe at 94.7% CPU, ollama.exe consuming 2.4GB RAM, RAM pressure at 85%. Gemma's Last Analysis card already identified the root cause and proposed a fix.

GHOST Dashboard showing live metrics with RAM at 85% and Gemma's analysis card

Agent Log — the Sense → Think → Act loop streaming live

Watch Gemma reason in real time: "The telemetry shows ollama.exe consistently consuming 91.4% of CPU... The elevated RAM usage (2409.2MB) associated with ollama.exe further supports this conclusion, as it's likely running the LLM and related processes."

GHOST Agent Log showing THINK reasoning and ACT execution in real time

Fix History — 19 fixes, 0 rollbacks, 100% success rate

Every action logged with timestamps, targets, and expected improvements. This is a real run on a real machine — not simulated.

GHOST Fix History showing 19 total fixes, 0 rolled back, 100% success rate

Machine Persona — Gemma's behavioral fingerprint of your hardware

Health score 65/100. Peak load window: 9:30–10:00 AM daily. Primary offender: Code.exe + Ollama. Battery life estimated at ~18 months. This is built from days of real telemetry, not hardcoded rules.

GHOST Machine Persona showing health score ring, behavioral patterns, and model tier selector


Code

GHOST — Gemma Hardware Optimization & System Tuner

An AI agent that monitors your machine in real time, reasons about what's slowing it down using Gemma, and actually fixes it — with a full rollback safety net.


What makes GHOST different

  • Acts, doesn't just advise — Gemma reasons over live telemetry and executes fixes: suspending rogue processes, adjusting priorities, flushing caches. Not a report. An agent.
  • Sense → Think → Act → Verify → Rollback loop — every action is measured. If metrics don't improve in 60s, GHOST rolls back automatically.
  • Predictive alerts — after 3+ days of data, GHOST learns your machine's patterns and warns you before a slowdown happens.
  • Machine persona — builds a behavioral fingerprint over 7 days. Knows your peak hours, worst offenders, battery prognosis.
  • Weekly health letter — Gemma writes a plain-English summary of your machine's week, every Monday.
  • 100% local + private — your…

Architecture at a glance

```
ghost-server/   (Go — the AI agent)
├── sensor/     — gopsutil: CPU, RAM, thermals, battery, active window (every 5s)
├── agent/      — analyze loop (90s ticker + anomaly triggers), predict loop (5min), persona builder (6h)
├── gemma/      — HTTP client → Ollama local API, structured JSON output
├── executor/   — safe action runner + full undo stack + 60s verify
├── storage/    — SQLite: snapshots, action history, persona, weekly letters
└── ipc/        — stdin/stdout newline-delimited JSON bridge to Electron

ghost-client/   (Electron + React + TypeScript)
├── Dashboard       — live metrics, sparklines, process table, last analysis
├── Agent Log       — streaming SENSE/THINK/ACT/WIN terminal with color phases
├── Fix History     — every fix with before/after delta measurements
├── Machine Persona — health score, behavioral fingerprint, model tier selector
└── Weekly Letter   — Gemma's plain-English machine health report
```

The agent loop (the core of everything)

```go
// agent/agent.go
func (a *Agent) RunAnalyzeLoop() {
    ticker := time.NewTicker(90 * time.Second)
    defer ticker.Stop()
    for {
        select {
        case <-a.anomalyCh:
            a.analyze()
        case <-ticker.C:
            // Only analyze on timer if load is meaningful
            if snap, ok := a.buf.Latest(); ok {
                if snap.CPUAvg > 40 || snap.RAMPct > 70 {
                    a.analyze()
                }
            }
        }
    }
}

func (a *Agent) executeAction(action gemma.ActionItem, snap sensor.Snapshot) {
    a.log("act", fmt.Sprintf("Executing: %s", action.Description), action)

    result := a.exec.Run(action.ID, action.Target)

    rec := storage.ActionRecord{
        ActionID:            action.ID,
        Description:         action.Description,
        Target:              action.Target,
        Risk:                action.Risk,
        CPUBefore:           snap.CPUAvg,
        RAMBefore:           snap.RAMPct,
        TempBefore:          snap.MaxTemp,
        ExpectedImprovement: action.ExpectedImprovement,
    }

    dbID, _ := a.db.SaveAction(rec)

    if !result.Success {
        a.log("error", fmt.Sprintf("Action failed: %s", result.Error), nil)
        return
    }

    // Verify after 60s
    go a.verify(dbID, action, snap)
}

func (a *Agent) verify(dbID int64, action gemma.ActionItem, before sensor.Snapshot) {
    time.Sleep(60 * time.Second)

    after, ok := a.buf.Latest()
    if !ok {
        return
    }

    cpuDelta := before.CPUAvg - after.CPUAvg
    ramDelta := before.RAMPct - after.RAMPct
    tempDelta := before.MaxTemp - after.MaxTemp

    improved := cpuDelta > 5 || ramDelta > 3 || tempDelta > 3

    if improved {
        a.log("win", fmt.Sprintf(
            "✓ %s — CPU %+.0f%% | RAM %+.0f%% | Temp %+.0f°C",
            action.Description, -cpuDelta, -ramDelta, -tempDelta,
        ), map[string]interface{}{
            "cpu_delta":  cpuDelta,
            "ram_delta":  ramDelta,
            "temp_delta": tempDelta,
        })

        // Update DB with after metrics
        a.db.SaveAction(storage.ActionRecord{
            ActionID:            action.ID,
            Description:         action.Description,
            Target:              action.Target,
            Risk:                action.Risk,
            CPUBefore:           before.CPUAvg,
            RAMBefore:           before.RAMPct,
            TempBefore:          before.MaxTemp,
            CPUAfter:            after.CPUAvg,
            RAMAfter:            after.RAMPct,
            TempAfter:           after.MaxTemp,
            ExpectedImprovement: action.ExpectedImprovement,
        })
    } else {
        a.log("rollback", fmt.Sprintf("No improvement detected for '%s' — rolling back", action.Description), nil)
        a.exec.RollbackLast()
        a.db.MarkRolledBack(dbID)
        a.bridge.Emit("rollback", map[string]string{"action_id": action.ID})
    }
}
```

The Gemma prompt (what makes the reasoning good)

```
You are GHOST, an AI system performance agent running locally.
Receive real-time telemetry and:
1. Identify ROOT CAUSE (correlate across TIME — not just current spikes)
2. Propose specific, safe, reversible actions ranked by impact
3. Explain in plain English

CRITICAL RULES:
- Never suggest actions risking data loss
- Prefer the most reversible action always
- Correlate metrics across TIME — a spike alone is not a root cause
- Consider process relationships (parent/child PIDs)
- Output ONLY valid JSON
```
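Because the prompt demands JSON-only output, the reply can be unmarshaled straight into typed structs and rejected if malformed. A sketch under assumptions: the `ActionItem` fields mirror the ones used in the agent loop above, but the `root_cause`/`actions` envelope and key names are guesses, not GHOST's actual schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ActionItem mirrors the fields the agent loop reads from each action.
type ActionItem struct {
	ID                  string `json:"id"`
	Description         string `json:"description"`
	Target              string `json:"target"`
	Risk                string `json:"risk"`
	ExpectedImprovement string `json:"expected_improvement"`
}

// Analysis wraps the root-cause text plus ranked actions (hypothetical
// envelope; GHOST's real schema may differ).
type Analysis struct {
	RootCause string       `json:"root_cause"`
	Actions   []ActionItem `json:"actions"`
}

// parseAnalysis rejects anything that is not the expected JSON shape,
// so a malformed model reply never reaches the executor.
func parseAnalysis(raw string) (Analysis, error) {
	var a Analysis
	if err := json.Unmarshal([]byte(raw), &a); err != nil {
		return Analysis{}, fmt.Errorf("model did not return valid JSON: %w", err)
	}
	return a, nil
}

func main() {
	raw := `{"root_cause":"ollama.exe holding 2.4GB during model load",
	         "actions":[{"id":"lower_priority","description":"Lower ollama.exe priority",
	                     "target":"ollama.exe","risk":"low","expected_improvement":"CPU -20%"}]}`
	a, err := parseAnalysis(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(a.Actions[0].ID) // → lower_priority
}
```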

How I Used Gemma 4

Why Gemma 4 — and which model

GHOST uses Gemma 4 in three tiers based on the target hardware:

| Tier | Model | RAM Required | Target |
|------|-------|--------------|--------|
| Lite | Gemma 4 e2B | 4GB+ | Old laptops, Raspberry Pi |
| Standard | Gemma 4 e4B | 8GB+ | Most modern laptops |
| Full | Gemma 4 26B | 16GB+ | High-end workstations |

For development and submission testing I ran the Standard tier (Gemma 4 e4B) — exactly the "ultra-mobile and edge deployment" model the Gemma 4 family was designed for. This is a deliberate choice, not a fallback.
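A tier can be chosen automatically from total RAM at startup, using the thresholds from the table. A minimal sketch; the function name and tier strings are hypothetical, and a real implementation would read total RAM from the OS.

```go
package main

import "fmt"

// pickTier maps total system RAM (GB) to a model tier, using the
// thresholds from the tier table: 16GB+ runs the full model,
// 8GB+ the standard one, anything else the lite one.
func pickTier(totalRAMGB float64) string {
	switch {
	case totalRAMGB >= 16:
		return "full" // Gemma 4 26B
	case totalRAMGB >= 8:
		return "standard" // Gemma 4 e4B
	default:
		return "lite" // Gemma 4 e2B
	}
}

func main() {
	for _, gb := range []float64{4, 8, 32} {
		fmt.Printf("%4.0fGB → %s\n", gb, pickTier(gb))
	}
}
```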

Why local-only is not optional here

Sending your system's process list, memory allocation per app, thermal readings, and active window title to a cloud API would be a privacy violation. This data tells you everything about what someone is doing on their machine at any moment. GHOST's local architecture isn't a tradeoff — it's the only ethical way to build this tool.

What Gemma 4 specifically unlocks

Root-cause correlation, not symptom flagging. Task Manager shows you that CPU is at 94%. Any threshold check can do that. What GHOST does is feed Gemma 10 minutes of per-process telemetry and ask it to connect the dots across time. In the demo screenshots, Gemma correctly identified that ollama.exe's RAM usage (2,409MB) combined with its sustained CPU load was a model loading issue — not generic "high CPU." That distinction determines what fix to apply.

The 128K context window is the product. I don't chunk telemetry. I don't use RAG. I don't summarize. I feed the entire 10-minute window — every snapshot, every process, every measurement — directly into context. This is what allows Gemma to say "the RAM pressure started 7 minutes ago when this process spawned" rather than just "RAM is high now." Long-range pattern recognition across continuous telemetry is only possible because of the 128K window.

Machine Persona and Weekly Letter require genuine synthesis. Building a behavioral fingerprint from 7 days of hourly aggregates, then writing a coherent narrative about a machine's personality — that's not a retrieval task. It requires Gemma to synthesize patterns, make inferences, and communicate them in natural language. The persona in the screenshot (peak load 9:30-10:00 AM, Code.exe + Ollama as primary offenders, battery ~18 months remaining) came entirely from Gemma analyzing real historical data with no template.

The low-end device question

The most common question this project gets: "If GHOST is for slow laptops, how can it afford to run a large model?"

The answer is in how GHOST uses Gemma. The sensing daemon is a tiny Go binary (~8MB). Gemma only activates every 60-90 seconds for analysis — not on every tick. On the Lite tier (Gemma 4 e2B), GHOST consistently recovers more RAM through its fixes than the model itself occupies: the net memory delta on a typical slow laptop is positive, so you end up with more free RAM after running GHOST than before, even accounting for the model.

This is why the Machine Persona tab shows the model tier selector with the note: "GHOST uses Gemma only every 60-90s for analysis — not continuously. On Lite mode, GHOST recovers more RAM than Gemma uses."


Results from the demo machine

Running GHOST for several hours on a real Windows machine under development load:

  • 19 fixes applied
  • 0 rollbacks (100% success rate)
  • Primary issues identified: concurrent ollama instances, Code.exe memory pressure, msedge WebView2 background processes
  • Machine health score: 65/100 with specific, actionable diagnosis written by Gemma

The Agent Log screenshot shows Gemma's live reasoning — not canned responses, not templates. Real inference on real telemetry.


Built with Go, Electron, React, TypeScript, SQLite, and Gemma 4 via Ollama.
