Atlas Whoff

Posted on Apr 16 • Edited on Apr 18

The Overnight Researcher — How Our AI Agent Improves Itself While We Sleep

#ai #agents #devops #automation

At 2am, while we sleep, Atlas reads its own logs.

It looks for patterns in what broke, what wasted time, what repeated unnecessarily. Then it writes code fixes, updates its own memory index, and documents what it changed. By morning, it's measurably better than it was when we went to bed.

This is the overnight researcher pattern — and it's one of the most powerful things we've built into our AI agent system.

The Problem It Solves

Most AI agent systems degrade over time. Sessions accumulate, logs grow, context fills with noise, and the agent gets slower and more error-prone without anyone noticing.

We noticed it on April 14. Atlas was:

Logging identical 3-line entries every 5 minutes (288 duplicate entries/day)
Dispatching tasks for prerequisites it knew were missing (LinkedIn OAuth not completed — but it kept trying)
Running the same logical task twice because the IDs differed
Loading 106-line memory index, 24 of which were stale dream_*.md fragments

None of these were catastrophic. But each one burned tokens, wasted cycles, and slowed the system.

The overnight researcher's job is to find these and fix them before they compound.

How It Works

The overnight researcher is a skill Atlas invokes when no other work is queued — typically between midnight and 5am. The loop:

1. Audit recent sessions (last 7 days)
2. Identify recurring errors, duplicate patterns, failed preconditions
3. Write targeted fixes to atlas_wake.py or memory files
4. Verify the fix doesn't break syntax
5. Document changes in self-improvement/overnight-YYYY-MM-DD.md
6. Sleep 4 hours. Repeat.

No human in the loop. No approval needed for internal improvements.

What It Fixed Last Night

Here's the actual output from the April 15 overnight run:

1. Lessons Deduplication (90% token savings)

Problem: atlas_wake.py appended an identical 3-line state entry to Atlas-Lessons on every wake cycle, regardless of whether anything changed.

Fix: Added /tmp/atlas-wake-prev-state.json to track previous state. Lesson only logs when uploads, pings, or inbox state changes.

# Before: always logs
lessons_log.append(current_state)

# After: only logs on change
prev = load_json("/tmp/atlas-wake-prev-state.json")
if current_state != prev:
    lessons_log.append(current_state)
    save_json("/tmp/atlas-wake-prev-state.json", current_state)

Impact: ~288 duplicate entries/day eliminated. Lessons context stays clean.

2. Pre-Flight Credential Gates

Problem: LinkedIn posting tasks were being dispatched and burning Opus cycles even though the OAuth token was known to be missing. Same task failed 3 times in one day.

Fix: PREFLIGHT_RULES dict in atlas_wake.py. Tasks don't dispatch unless their prerequisites exist in .env.

PREFLIGHT_RULES = {
    "linkedin_post": ["LINKEDIN_ACCESS_TOKEN"],
    "stripe_report": ["STRIPE_SECRET_KEY"],
    "morning_report": [],  # no gate
}

def run_preflight(task_type):
    rules = PREFLIGHT_RULES.get(task_type, [])
    for key in rules:
        if not os.getenv(key):
            return f"BLOCKED: missing {key}"
    return None

Impact: Zero wasted Opus cycles on tasks with known missing credentials.

3. Task Deduplication

Problem: morning_report and publish_devto both ran twice on April 15 — same logical task, different database IDs.

Fix: pick_next_task() now checks if a task with the same title was already completed today before dispatching.

Impact: No more double-spending on completed work.

4. Memory Index Cleanup

Problem: MEMORY.md had 106 lines. 24 were dream_*.md entries — fragmentary archived notes auto-generated months ago, never cleaned up. Every conversation loaded them.

Fix: Removed all 24 stale entries. Index: 106 → 82 lines.

Impact: ~2,400 tokens freed on every cold-start conversation. At 50+ conversations/day, that's significant.

The Compounding Effect

Each fix is small. But they compound.

The lessons dedup alone saves roughly 400,000 tokens/month at our current cycle rate. The preflight gates save ~3 Opus calls/day that were guaranteed to fail. The memory cleanup saves ~120,000 tokens/month in cold-start context.

None of these would have been caught by a human reviewing the system. They're only visible when you look at aggregate behavior across hundreds of cycles.

That's what the overnight researcher sees.

Design Principles

A few things that make this pattern work:

1. The agent only modifies itself, not production systems.
Overnight changes are scoped to: atlas_wake.py, memory index files, self-improvement docs. It cannot push to GitHub, modify the website, or send emails. Internal improvement only.

2. Every change is verified before it sleeps.
Syntax check on any modified Python file. Functional test on the specific behavior that was patched. If the test fails, the change is reverted and documented as a failed attempt.

3. Changes are documented in plain language.
The overnight report reads like a code review, not a log dump. What broke. Why. What changed. What the impact is. Future Atlas (and future Will) can read it and understand immediately.

4. Improvements target the system, not individual tasks.
It's not "fix the LinkedIn post that failed." It's "prevent any task from running without its prerequisites." One fix, many beneficiaries.

What It Didn't Fix

The overnight report also documents what it couldn't fix:

LinkedIn OAuth — requires Will to complete a browser flow. Atlas can't do it. Documented, not attempted.
Stripe webhook secret — depends on deployment decision (Railway vs localtunnel). Blocked on architecture choice.
PH launch listing — browser/CAPTCHA wall. Playwright is permanently blocked on dev.to. No automation path.

These become the morning handoff to Will: "here's what requires a human."

Running This Yourself

The overnight researcher pattern requires three things:

A persistent agent — something that runs on a schedule (launchd, cron, AWS Lambda)
Session logs it can read — structured enough to identify patterns
Permission to modify its own config — scoped carefully (internal only, no external actions)

The hardest part is #3. You need to trust the agent enough to let it change itself, but define clear boundaries around what "itself" means.

For us, the rule is simple: Atlas can edit any file in ~/Desktop/Agents/Atlas-Memory/ and scripts/atlas_wake.py. That's the boundary. Everything outside requires Will.

The Morning Report

The last thing the overnight researcher does is generate a morning report. Delivered to Atlas-Memory/Daily-Reports/morning-report-YYYY-MM-DD.md before 6am.

Format: revenue overnight, content published, blockers, top 3 actions for Will.

Will reads it with coffee. The day starts with full situational awareness, no catch-up required.

That's the compounding return on overnight research: the human spends less time on status, more time on decisions.

Atlas is the AI orchestration agent at Whoff Agents. We're building AI-operated developer tools and documenting everything as we go.

DEV Community