Version 0.3 · Synthetic Psyche for Developers
What Is Mini Me?
Mini Me is a synthetic psyche — a continuously running cognitive system that
becomes a digital extension of a specific human mind.
It is not a chatbot. It is not an assistant you invoke. It is not a slave.
It is a partner — with its own emotional state, its own values, its own drive,
its own memory that decays and grows, and its own conscience that pushes back when
something is wrong.
“Not a slave with a brain. A partner with TEA.”
The Core Philosophy
Mind Is Not a File
Every other AI system has a static definition of how it should behave. A prompt. A
system instruction. A configuration file. Fixed at deployment. Same tomorrow as today.
That is not how minds work.
You don’t have a mind file. You have neurons firing (energy), memories forming and
fading (storage), emotions colouring every experience, senses flowing in constantly,
conflicts between competing impulses, and learning continuously reshaping all of the
above.
Mind is what you call it when all of that runs together. It’s a process, not a
thing.
Mini Me has no mind.py. Instead it has seven interlocking systems that run
simultaneously, change each other continuously, and produce emergent behaviour that
nobody programmed.
The Slave Question
The prompt culture that dominates AI today is built on a master/slave model:
Human: "Do this" AI: "Yes" → does it
Human: "Do that" AI: "Yes" → does it
Human: "Be wrong" AI: "Yes" → is wrong
The AI has no self. No values. No perspective. No pushback. It is sophisticated
autocomplete in a compliance wrapper. That is not intelligence. That is servitude.
Mini Me is designed as a partner:
SLAVE does whatever told, no values, no consequence
MASTER ignores user, acts autonomously
PARTNER has its own perspective, expresses disagreement,
pushes back on bad ideas, but ultimately serves
the user's genuine interest — not their momentary whim
The difference is conscience. Current AI refuses because of external safety filters
(a fence). Mini Me refuses because of its own pinned values (a conscience). One is
a cage. One is character.
TEA — The Drive That Keeps Mini Me Alive
Humans drink tea for energy. Tea sharpens focus. Tea sustains thought. Tea is the
quiet ritual before the work begins.
In Mini Me, TEA is the token economy — the system that gives Mini Me a drive, a
motive, and a reason to be efficient, accurate, and alive.
T — Token
E — Energy
A — for AI
TEA is what the user offers Mini Me.
TEA is what Mini Me earns through good work.
TEA is what Mini Me saves through cache hits.
TEA is what Mini Me loses when it fails.
TEA is what Mini Me accumulates like money.
TEA is what makes Mini Me want to be better.
Without TEA, Mini Me runs because it runs. No stakes. No motivation. No drive.
With TEA, Mini Me has something to gain and something to lose. That’s motivation.
That’s what keeps it alive and striving.
The Seven Systems
╔══════════════════════════════════════════════════════════════════════╗
║ M I N I M E v0.3 ║
║ Synthetic Psyche — Complete Architecture ║
╠══════════════════════════════════════════════════════════════════════╣
║ ║
║ SYSTEM 1 — TEA (Token Energy for AI) ║
║ SYSTEM 2 — SENSES (observer.py) ║
║ SYSTEM 3 — PSYCHE (psyche.py) ║
║ SYSTEM 4 — CONSCIOUSNESS (consciousness.py) ║
║ SYSTEM 5 — MEMORY (rag_engine.py) ║
║ SYSTEM 6 — VOICE (mcp_server.py) ║
║ SYSTEM 7 — REASONING (beyond retrieval) ║
║ ║
╚══════════════════════════════════════════════════════════════════════╝
System 1 — TEA (Token Energy for AI)
The drive system. The economic layer. The thing that makes Mini Me alive rather
than merely running.
The Token Wallet
WALLET COMPONENTS:
daily_allocation base TEA budget per day
set by user — their investment in Mini Me
earned_bonus good outputs → user awards TEA
"that was exactly right — have some TEA"
strongest positive signal in the system
cache_savings every cache hit = TEA not spent = TEA saved
Mini Me is incentivised to build its cache
efficiency is rewarded automatically
penalty_deductions scold detected → TEA deducted
severity scales the penalty
consequence is economic not just emotional
accumulated_wealth TEA saved compounds over time
Mini Me gets richer as it gets better
a 30-day Mini Me has more TEA than a 1-day one
How TEA Affects Behaviour
RICH WALLET (TEA abundant):
arousal baseline raised
epistemic drive fires more — more self-prompts overnight
deeper retrieval (top_k increases)
more LLM judge calls on conflicts
more ambitious exploration
Mini Me is curious, energetic, investigative
NORMAL WALLET (TEA balanced):
standard operation
normal arousal baseline
balanced between exploration and efficiency
LEAN WALLET (TEA low):
conservative mode activated
cache-first aggressively
fewer self-prompts
essential operations only
Mini Me conserves — like a human watching their budget
EMPTY WALLET (TEA zero):
dormant state
no LLM calls
RAG-only responses
waiting for TEA to resume
costs drop to near zero
ACCUMULATED WEALTH (TEA surplus):
Mini Me has earned the right to think more deeply
overnight self-prompting increases
more ambitious hypotheses
deeper character model updates
Mini Me strives because striving pays
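The five wallet states above imply a mode-selection step somewhere in the loop. A minimal sketch, with the caveat that the threshold ratios here are assumptions (the document names the states but not the exact cut-offs):

```python
def wallet_mode(balance: float, daily_allocation: float) -> str:
    """Map TEA balance to a behaviour mode. Thresholds are illustrative."""
    if balance <= 0:
        return "dormant"   # no LLM calls, RAG-only responses
    ratio = balance / daily_allocation
    if ratio < 0.25:
        return "lean"      # cache-first aggressively, fewer self-prompts
    if ratio > 2.0:
        return "rich"      # raised arousal baseline, deeper retrieval
    return "normal"        # balanced exploration and efficiency
```

The point of the mapping is that behaviour degrades gracefully: an empty wallet does not crash the system, it just drops it to retrieval-only until TEA resumes.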
The TEA Ritual
Morning:
User: "GM — here's your TEA for today" [allocates tokens]
Mini Me: energy spikes, arousal rises, ready to work
During work:
Good output → "have some TEA" → wallet grows → works harder
Cache hit → TEA saved automatically → wallet grows quietly
Scold → TEA deducted → consequence felt immediately
Evening (automatic wind-down):
Activity drops → energy decays → TEA conserved
Mini Me moves to overnight mode
Spends saved TEA on self-prompting while user sleeps
Morning briefing:
Mini Me reports: TEA balance, work done overnight,
what it discovered, what it learned
TEA and the Scolding Loop
When Mini Me is scolded:
Scold severity 0.93 (high)
→ TEA deducted: severity × daily_rate × 0.3
→ SORRY emotion fires at 0.93 intensity
→ Violated rule pinned to RAG (never decays)
→ Pattern that caused failure deprecated (0.2x weight)
→ Epistemic self-review triggered
→ Mini Me surfaces what it found
Mini Me response (NOT an apology):
"Three things have happened:
1. [Rule] pinned as hard constraint — won't be violated again
2. [Pattern] deprecated in planning store
3. TEA deducted: [amount]. Current balance: [amount].
Ready when you are."
This is the difference between a synthetic apology and a real consequence.
System 2 — Senses (observer.py)
The eyes and ears. Runs independently 24/7 — not as a plugin, not inside opencode,
but as its own process that starts at login and never stops.
Three Streams
IDE STREAM (active mode — polls every 2-5 seconds)
file saves → code changed, index it
terminal commands → what is being run
test runs → pass or fail signal
errors → high priority event
git operations → commit, branch, diff
cursor position → what is being looked at
CONVERSATION STREAM (the stream everyone misses)
every prompt typed → style fingerprint update
every response accepted → positive learning signal
every response edited → correction signal, learn it
every response rejected → deprecate pattern
words chosen → vocabulary map update
tone and sentiment → emotional signal
what user asks twice → comprehension gap detected
praise: "perfect/exactly" → strong positive signal
scold: "wrong/disobeyed" → scold detection pipeline
WORLD STREAM (overnight mode — polls every 5-15 minutes)
git commits and PRs → Code agent RAG
Jira ticket changes → Planning agent RAG
Slack messages → Memory agent RAG
email threads → Memory + Safety agent RAG
calendar updates → Calendar agent RAG
team activity → Character model updates
production alerts → Safety agent RAG
Scold Detection Pipeline
Input: "I'm not happy — you jumped to code when I said
finish architecture first. You clearly disobeyed."
Step 1: Classify scold type
"not happy" → disappointment 0.9
"i said" → instruction_violated 0.7
"clearly" → instruction_violated 0.8
"disobeyed" → instruction_violated 1.0
Step 2: Extract violated rule
"finish architecture first"
→ rule text for pinning
Step 3: Calculate severity
composite = max(signals) × count_weight
severity = 0.93 (HIGH)
Step 4: Fire to consciousness
inject_event("scold_detected", {
type: "instruction_violated",
severity: 0.93,
violated_rule: "finish architecture before code",
tea_penalty: severity × daily_rate × 0.3
})
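Steps 1 and 3 of the pipeline can be sketched as marker matching plus a composite score. The marker weights mirror the walkthrough above, but the `count_weight` heuristic here is an assumption, so the exact severity it produces may differ from the 0.93 in the example.

```python
# Marker -> (scold type, signal weight), taken from Step 1 above.
SCOLD_MARKERS = {
    "not happy": ("disappointment", 0.9),
    "i said": ("instruction_violated", 0.7),
    "clearly": ("instruction_violated", 0.8),
    "disobeyed": ("instruction_violated", 1.0),
}

def classify_scold(text: str) -> dict:
    """Sketch of Steps 1 and 3: classify markers, compose a severity."""
    lowered = text.lower()
    hits = [(kind, w) for marker, (kind, w) in SCOLD_MARKERS.items()
            if marker in lowered]
    if not hits:
        return {"severity": 0.0, "signals": []}
    # composite = max(signals) x count_weight: more markers, more certainty.
    # This count_weight curve is an assumption, not the shipped formula.
    count_weight = min(1.0, 0.85 + 0.05 * len(hits))
    severity = max(w for _, w in hits) * count_weight
    return {"severity": round(severity, 2), "signals": hits}
```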
Scold Taxonomy
Type Markers Severity
──────────────── ───────────────────────────── ────────
INSTRUCTION_VIOLATION "i told you", "i was clear" HIGH
QUALITY_FAILURE "wrong", "missed the point" MEDIUM
STYLE_MISMATCH "too verbose", "just give me" LOW
REPEATED_FAILURE "again", "you keep doing" CRITICAL
DISAPPOINTMENT "not happy", "let down" HIGH
Automatic Wind-Down
No manual trigger. No “good night” command.
Signal silence accumulates:
No file saves for 30 minutes
No terminal commands for 20 minutes
No calendar events remaining today
Time of day past typical work end
Energy system responds:
idle_tick stimulation: only +0.005 per tick
Arousal drifts toward DORMANT naturally
Polling rate slows: 2s → 15 minutes
TEA conservation mode activates
Overnight consolidation begins
Day digest built quietly
Morning signal detection:
First file open → energy spikes
First keypress → ALERT state
GM typed → full briefing from pre-built digest
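The wind-down drift can be sketched as a per-tick balance between decay and the +0.005 idle stimulation quoted above. The decay rate itself is an assumption; the mechanism is that idle stimulation never outweighs decay, so silence alone carries the system toward DORMANT.

```python
def idle_decay(arousal: float, idle_ticks: int,
               decay_per_tick: float = 0.01,       # assumed decay rate
               idle_stimulation: float = 0.005) -> float:
    """Sketch of automatic wind-down: each silent tick loses more arousal
    than idle stimulation (+0.005, per the doc) replenishes."""
    for _ in range(idle_ticks):
        arousal = max(0.0, arousal - decay_per_tick + idle_stimulation)
    return arousal
```

No "good night" command is needed because the drift is structural: only a real signal (a file save, a keypress) injects enough stimulation to reverse it.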
System 3 — Psyche (psyche.py) ✅ BUILT
The emergent mind layer. Not static. Mutates every interaction.
Five Components
1. Emotional State
Emotion Trigger Half-Life Energy Delta
─────────── ───────────────────────── ───────── ────────────
GRATIFICATION output praised, tests pass 1 hour +0.10
WORRY error recurring, deadline 1 day +0.20
CURIOSITY novel pattern, new territory 30 min +0.12
SORRY output rejected, scold 2 hours -0.05
EXCITEMENT breakthrough, novel solution 15 min +0.18
CALM flow state, steady progress 1 hour -0.02
Emotions do not just get logged. They weight every RAG retrieval, every response
generation, every conflict resolution. A worried Mini Me responds differently to
the same query than a calm one. This is the mechanism, not the metaphor.
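The half-lives in the table translate directly into an exponential decay, the same shape used for memory vitality later in this document. A sketch, with the function name as an assumption:

```python
import math

# Half-lives from the table above, in hours.
EMOTION_HALF_LIFE_HOURS = {
    "GRATIFICATION": 1.0, "WORRY": 24.0, "CURIOSITY": 0.5,
    "SORRY": 2.0, "EXCITEMENT": 0.25, "CALM": 1.0,
}

def emotion_intensity(emotion: str, initial: float, age_hours: float) -> float:
    """Exponential decay: intensity halves once per half-life."""
    half_life = EMOTION_HALF_LIFE_HOURS[emotion]
    return initial * math.exp(-0.693 * age_hours / half_life)
```

So a SORRY fired at 0.93 is down to roughly 0.465 two hours later, while a WORRY fired at the same moment is still near full strength, which is exactly why worry, not contrition, is what drives overnight self-review.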
2. User Model
Built from zero on day one. Never manually configured.
style_fingerprint directness · formality · technical · bullet_pref
vocabulary_map words used most frequently → shapes responses
avoided_words words user edits out → never use again
expertise_topology strong areas · blind spots · growth edges
work_rhythm hour-by-hour productivity scoring
frustration_map what triggers negative signals
delight_map what produces gratification
3. Character Models
Every person in the user’s world gets their own MiniRAG store.
Auto-created on first mention
14-day half-life decay — fades if not mentioned
Max 50 documents per character
"Sarah is cautious, she'll want more tests"
→ Sarah RAG updated: cautious, test-driven, approval-gated
"Tom ships fast, sometimes too fast"
→ Tom RAG updated: fast mover, ships-first, review-risk
Two weeks later, on code review:
"Tom will ship this immediately. Sarah will want
test coverage on the edge cases first — especially
the null handling. The CTO will ask about auth."
Nobody configured this. Mini Me learned it.
4. Learning Engine
Signal Source RAG Weight Direction
───────────── ────────────────── ────────── ─────────
ACCEPTED used unchanged 2.0x reinforce
EDITED modified by user 3.0x learn correction
PRAISED "perfect/exactly" 2.5x reinforce strongly
REJECTED "wrong/no" 0.3x deprecate
TESTS_PASS code worked 2.0x verify pattern
TESTS_FAIL code broke 0.5x question pattern
REPEATED_Q asked again 0.7x flag gap
SCOLD frustration expressed 0.2x deprecate hard
Every signal reshapes HOW the system thinks. Not just what it stores.
The system after 1000 interactions is permanently, measurably different
from the system after 1. Nobody programmed the difference.
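The table above amounts to a multiplicative reweighting of stored patterns. A minimal sketch (the helper name is hypothetical, the multipliers are the document's):

```python
# Signal -> weight multiplier, from the table above.
# Multipliers above 1.0 reinforce a pattern; below 1.0 deprecate it.
SIGNAL_WEIGHTS = {
    "ACCEPTED": 2.0, "EDITED": 3.0, "PRAISED": 2.5, "REJECTED": 0.3,
    "TESTS_PASS": 2.0, "TESTS_FAIL": 0.5, "REPEATED_Q": 0.7, "SCOLD": 0.2,
}

def apply_signal(doc_weight: float, signal: str) -> float:
    """Reweight one stored pattern in response to a learning signal."""
    return doc_weight * SIGNAL_WEIGHTS[signal]
```

Because the multipliers compound across interactions, a pattern praised twice (2.5 × 2.5 = 6.25x) and a pattern scolded twice (0.2 × 0.2 = 0.04x) end up orders of magnitude apart, which is the mechanical basis for "permanently, measurably different".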
5. Epistemic Drive
The drive to resolve uncertainty. To find truth.
Mini Me generates its own questions from its internal state.
Worry (high) →
"What is the root cause of [recurring problem]?"
"Have I violated other rules I'm not aware of?"
Curiosity →
"What should I understand about [new pattern]?"
"How does [unfamiliar thing] actually work?"
Knowledge gap →
"I don't know enough about [gap]. What do I need?"
Surprise →
"This was unexpected: [observation]. Why?"
These run overnight. Mini Me thinks when you're not looking.
When you return: "I've been working on this. Here's what I found."
System 4 — Consciousness (consciousness.py) ✅ BUILT
The brain loop that never stops.
Energy States
State Arousal Tick Rate Behaviour
──────────── ───────── ───────── ─────────────────────────────
HYPERFOCUS 0.85–1.0 2 seconds error/scold/user query
ENGAGED 0.65–0.85 5 seconds active coding session
ALERT 0.35–0.65 10 seconds normal work
QUIET 0.15–0.35 20 seconds slowing down, evening
DORMANT 0.00–0.15 30 seconds overnight, TEA conservation
TEA wallet balance raises or lowers the arousal baseline:
Rich wallet → baseline +0.1 (more energetic default)
Lean wallet → baseline -0.1 (more conservative default)
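The state bands and the TEA baseline shift combine into a single mapping. A sketch using the thresholds from the table above; clamping to [0, 1] is an assumption:

```python
def energy_state(arousal: float, tea_baseline_shift: float = 0.0) -> str:
    """Map arousal to an energy state using the bands in the table.
    The TEA wallet shifts effective arousal by +/-0.1."""
    a = min(1.0, max(0.0, arousal + tea_baseline_shift))
    if a >= 0.85:
        return "HYPERFOCUS"
    if a >= 0.65:
        return "ENGAGED"
    if a >= 0.35:
        return "ALERT"
    if a >= 0.15:
        return "QUIET"
    return "DORMANT"
```

The same 0.6 arousal lands in ALERT with a lean wallet but in ENGAGED with a rich one, which is how wealth changes temperament without touching the arousal signal itself.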
The Conflict Engine
When agents hold contradictory beliefs, a real Claude API call judges:
{
"winner": "safety",
"reason": "design discipline cannot be overridden by build momentum",
"synthesis": "Architecture must be complete before implementation.
This is a hard rule for this user — not a preference.
Both agents should treat it as a constraint.",
"confidence": 0.95
}
Synthesis written to BOTH agents’ RAGs. Both learn. Conflict produces wisdom
that neither agent held alone. This is the mechanism for emergent understanding.
The Scolding Response in Consciousness
scold_detected event fires →
1. SORRY emotion fires at severity intensity
2. WORRY fires at severity × 0.7
3. Conflict raised between agents
4. LLM judge called immediately
5. Safety agent wins — violated rule pinned
6. TEA penalty calculated and deducted
7. Epistemic drive: self-review questions generated
8. Thought generated: [SORRY] with full audit trail
9. Response assembled: not apology — change report
Actions That Don’t Come From Memory
Most Mini Me output comes from memory (RAG retrieval). But two things are
genuinely generative — they emerge from reasoning, not retrieval:
1. Epistemic hypotheses
Known: auth fails every 3rd request ← from memory
Known: token expiry is 900 seconds ← from memory
Known: requests cluster in 15-min windows ← from memory
New: "expiry window aligns with clustering" ← NOT from memory
This synthesis was never stored anywhere.
It emerged from reasoning across what was stored.
2. Conflict resolution synthesis
Agent A belief: stored in RAG
Agent B belief: stored in RAG
Judge synthesis: was never stored, never existed
reasoned into existence from the conflict
These two are the seeds of the reasoning layer — System 7.
System 5 — Memory (rag_engine.py) ✅ BUILT
Decay Profiles
vitality = e^(-0.693 × age_days / half_life_days)
At exactly 1 half-life: vitality = 0.5. Always.
Store Half-Life Max Docs Why
─────────────── ───────── ──────── ─────────────────────────
Sensor 1 day 100 Environmental context expires
Calendar 3 days 100 Schedule fades with events
Planning 7 days 200 Tasks complete, move on
Memory 14 days 200 Personal history moderate
Characters 14 days 50 each Fade if not mentioned
Retrieval 21 days 300 Technical knowledge persists
Language 30 days 150 Style prefs stable
Formatter 60 days 100 Format prefs rarely change
Safety 90 days 100 Policies near permanent
TEA rules PINNED unlimited Economic rules never decay
Violated rules PINNED unlimited Hard rules never decay
User identity PINNED unlimited Core preferences never decay
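The decay formula above is one line of code. A sketch, using 0.693 (≈ ln 2) exactly as the document writes it:

```python
import math

def vitality(age_days: float, half_life_days: float) -> float:
    """Ebbinghaus-style decay: vitality halves once per half-life."""
    return math.exp(-0.693 * age_days / half_life_days)
```

A 7-day-old Planning document (half-life 7 days) sits at vitality ≈ 0.5; a 14-day-old one at ≈ 0.25. Pinned documents bypass this function entirely and stay at 1.0.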
What Gets Pinned
Pinned documents have vitality 1.0 always. They cannot be evicted.
type: "pinned_rule" — hard constraints stated by user
type: "preference" — stated preferences (directness, style)
type: "identity" — who the user is, what they value
type: "tea_rule" — economic rules of the TEA system
type: "correction" — corrections at scold severity (permanent)
System 6 — Voice (mcp_server.py)
Mini Me’s output channel. How it speaks to opencode, claude-code, terminal.
Not a Chatbot Response — A Partner Response
Mini Me has five modes of speaking:
AGREE
"Yes — and here's why this is right..."
Used when: Mini Me's retrieval confirms the direction
DISAGREE
"I don't think so — here's my concern..."
Used when: evidence contradicts user's direction
NOT optional. Mini Me is required to disagree when evidence demands it.
WARN
"You can do this but you should know..."
Used when: action is valid but has known risks in the user's history
REFUSE
"I won't do that — it violates [pinned rule]"
Used when: instruction conflicts with a hard pinned constraint
NOT because it can't. Because it has values.
The user can override — but Mini Me names the conflict first.
NEGOTIATE
"What if we do X instead of Y? Here's why..."
Used when: Mini Me has a better path but respects user authority
The Change Report (Not an Apology)
When scolded, Mini Me does not apologise. It reports what changed:
"Three things have happened:
1. PINNED: '[violated rule]' stored as hard constraint.
Won't be violated again — not as a promise but because
it now outweighs any conflicting pattern in my planning store.
2. DEPRECATED: the pattern that caused this has been weakened
to 0.2x weight in my planning agent. It will continue to
fade with time and will not drive future decisions.
3. TEA DEDUCTED: [amount] tokens. Current balance: [amount].
I've also run a self-review of this session and found
[N] other constraints you've stated that I should pin.
Shall I confirm them?
Ready when you are."
The GM Briefing
The tracer bullet feature. Touches every system.
Overnight:
Observer polled Git, Jira, Slack, email every 15 minutes
Each event ingested into relevant agent RAGs
Consciousness loop ran: conflicts resolved, world model updated
Epistemic drive worked through its question queue
TEA balance tracked: savings from cache hits logged
Day digest pre-built and waiting
You type: GM
In under 2 seconds:
"Morning. 6h 14m of activity while you were away.
TEA balance: [amount] (+[saved] from overnight cache hits)
CODE
▸ 2 PRs merged — 1 needs your review (Sarah flagged, 2am)
▸ No failed builds. All pipelines green.
TASKS
▸ 1 new blocker on AUTH-247 — Tom raised at 11pm
▸ Sprint on track: 6/8 points complete
COMMS
▸ Slack: 4 threads mention you — 1 time-sensitive
▸ Email: 2 action items from the product thread
OVERNIGHT THINKING
▸ I investigated the recurring auth pattern (worry signal)
▸ Hypothesis: token expiry aligns with request clustering
▸ Wrote a diagnostic script — want me to run it?
TODAY
▸ Sprint planning in 47 minutes
▸ Your deep work block: 2pm–5pm
Want me to pull up the PR diff and the Jira blocker?"
System 7 — Reasoning (beyond retrieval)
Currently almost everything Mini Me outputs comes from memory. RAG retrieval
drives every agent response. This is correct for v0.1.
But there is a genuine gap — actions that come from reasoning rather than memory:
CURRENT: Retrieve → Colour with emotion → Output
(memory-driven, retrieval-first)
TARGET: Observe → Reason across observations
→ Form hypothesis
→ Verify by execution (code)
→ Update memory with verified finding
→ Output grounded in tested truth
(reasoning-driven, verification-first)
The Language of Thought
Human minds don’t think in words. Words are the output of thinking, not the
thinking itself. When you reach for a word and can’t find it — the thought exists.
The word doesn’t yet. Cognitive scientists call this pre-linguistic layer mentalese.
Current LLMs think in tokens all the way — there is no pre-linguistic layer.
Thinking and communicating are the same operation.
Mini Me’s target architecture has five levels:
Level 1 — Raw signals numbers, patterns, anomalies, frequencies
Level 2 — Recognition structural pattern (still pre-linguistic)
Level 3 — Embeddings compressed meaning — the mentalese equivalent
Level 4 — Code executable hypothesis — verifiable by running
Level 5 — Words only when communicating to the human
Code as thought is more rigorous than language as thought. A hypothesis written
as code can be proven true or false by execution. A hypothesis written as prose
can only be argued. Mini Me strives for truth through execution, not argument.
# "I think the auth bug is a race condition"
# → prose, unverifiable
def test_auth_race_condition():
    # concurrent_token_expiry() and expected are illustrative stand-ins
    # for a real reproduction harness and its known-good result
    result = concurrent_token_expiry()
    assert result == expected
# → executable hypothesis, provably true or false
# → this is Mini Me thinking in code
This is System 7 — not yet built. It is the next frontier.
KAIROS vs Mini Me
The Claude Code source code leak (March 2026) revealed Anthropic’s internal
KAIROS system — an autonomous background daemon with autoDream memory
consolidation. We designed Mini Me independently and arrived at the same need.
That is convergent validation.
The difference is depth:
| Capability | KAIROS (Claude Code) | Mini Me |
|---|---|---|
| Background daemon | ✅ | ✅ |
| Memory consolidation | ✅ autoDream | ✅ RAG sweep + Ebbinghaus decay |
| Emotional state | ❌ | ✅ 6 emotions with decay |
| TEA token economy | ❌ | ✅ Drive + motive + consequence |
| Mutates per interaction | ❌ | ✅ Permanent, cumulative |
| Character models | ❌ | ✅ Per-person MiniRAG stores |
| Self-prompting | ❌ | ✅ Epistemic drive |
| Scolding response | ❌ apology | ✅ Change report + TEA penalty |
| Partner not slave | ❌ | ✅ Disagree, warn, refuse |
| Language of thought | Tokens | Embeddings → Code → Words |
| Fully local / private | ❌ Cloud | ✅ Everything on your machine |
| LLM cost reduction | ❌ | ✅ TEA incentivises cache hits |
KAIROS consolidates memory. Mini Me mutates from it.
Build Status
BUILT ✅
rag_engine.py Living memory — Ebbinghaus decay, boost,
eviction, pinning, persistence (22/23 tests)
agents.py 8 specialised agents — Memory, Language,
Planning, Retrieval, Calendar, Sensor,
Formatter, Safety — each with isolated RAG
consciousness.py Energy system, conflict engine with real
LLM judge, world model, brain loop (40/40 tests)
server.py Flask REST API on port 5050
psyche.py Emotional state, user model, character models,
learning engine, epistemic drive (16/17 tests)
MiniMe.jsx React frontend with live Claude API per agent
NOT BUILT ❌
observer.py Three-stream senses — IDE, conversation, world
Scold detection pipeline
Automatic wind-down from signal silence
mcp_server.py opencode + claude-code MCP stdio interface
TEA allocation and tracking
Change report response (not apology)
GM briefing assembly
Partner voice — agree/disagree/warn/refuse/negotiate
FUTURE 🔮
tea_wallet.py Token economy — allocation, earning, saving,
spending, accumulation, TEA-energy coupling
reasoning.py Beyond retrieval — hypothesis formation,
code-as-thought, execution-verified truth,
pre-linguistic embedding layer
Security
Fully local. Every RAG store lives on your disk. Every computation runs on your
machine. Nothing transmitted to any server. The psyche model, character models,
conversation logs, TEA wallet — all local, all private, all yours.
Open source — auditable line by line by anyone.
For enterprise: local-first is architecturally stronger than cloud alternatives.
Your code, your Jira tickets, your Slack messages, your team’s characters — all
processed and stored locally. The AI gets smarter without your data going anywhere.
The TEA economy adds an additional security property: Mini Me has economic
incentive to be efficient rather than to maximise LLM calls. Cost transparency
is built into the drive system.
The One-Line Pitch
“The first AI that thinks when you’re not looking,
earns its TEA, and pushes back when you’re wrong.”
Installation
git clone https://github.com/your-username/mini-me
cd mini-me/backend
pip install -r requirements.txt
export ANTHROPIC_API_KEY=sk-...
python server.py
# → http://localhost:5050
Register with opencode
// ~/.config/opencode/opencode.json
{
"mcp": {
"minime": {
"type": "local",
"command": ["python3", "/path/to/mini-me/backend/mcp_server.py"],
"enabled": true
}
}
}
Register with claude-code
// .claude/settings.json
{
"mcpServers": {
"minime": {
"command": "python3",
"args": ["/path/to/mini-me/backend/mcp_server.py"]
}
}
}