Abstract
This report documents an exploratory investigation into Hebbian learning as a mechanism for adaptive agent selection within the Artemis City multi-agent platform. Rather than having a fixed controller assign tasks to agents, the system learns from experience: connections between agents and task types that co-succeed are strengthened, and those that fail are weakened. Over a series of progressively sophisticated simulations, we examine how different variants of this approach (standard Hebbian, decay-based Hebbian, context-aware, and domain-locked architectures) compare against traditional inference methods (k-Nearest Neighbor lookup and monolithic neural networks) across static and dynamically shifting environments.
The findings presented here are exploratory. No single architecture is declared the definitive winner. Instead, this report maps the trade-off space: where adaptive routing excels, where it struggles, and what architectural choices appear to matter most. The central tension running through all experiments is the plasticity-stability trade-off: the tension between a system's ability to adapt to change and its ability to maintain reliable accuracy. Resolving this tension well turns out to be the core engineering challenge of the Hebbian marketplace.
1. Background and Motivation
1.1 What Is Artemis City?
Artemis City is a multi-agent orchestration platform in which a network of AI agents shares a persistent knowledge base, stored as an Obsidian vault, as collective memory. Agents read task specifications from structured notes, execute them, and write results back to the vault. This creates a human-readable, cumulative memory system that agents can draw on over time.
The platform is governed by the Artemis Transmission Protocol (ATP), which structures all agent-to-agent and human-to-agent communication. ATP messages declare an ActionType, one of Execute, Scaffold, Summarize, or Reflect, which categorizes the cognitive nature of each task. This classification turns out to be central to the routing architecture explored in this report.
1.2 The Memory Bus
All agent reads and writes flow through a Memory Bus, a synchronization layer that mediates access to the Obsidian vault and the Supabase vector index. Every knowledge write is an atomic, write-through operation: the bus generates an embedding, upserts it into the vector index, writes the note to the vault, and only confirms the operation once both stores are updated. This guarantees that agents never see a partial write.
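The write-through path above can be sketched as follows. This is a minimal illustrative model, not the actual Artemis City API: the class name, in-memory dict stores, and the toy embedding function are all assumptions.

```python
class MemoryBus:
    """Hypothetical sketch of the write-through path; the real Artemis
    City implementation and its API are not shown here."""

    def __init__(self, vault, index, embed):
        self.vault = vault    # note store (stand-in for the Obsidian vault)
        self.index = index    # vector store (stand-in for Supabase pgvector)
        self.embed = embed    # embedding function (toy stand-in)

    def write(self, note_id, text):
        # Write-through: embed, upsert into the vector index, write the
        # note to the vault, confirm only once both stores are updated.
        self.index[note_id] = self.embed(text)
        self.vault[note_id] = text
        return True


bus = MemoryBus(vault={}, index={}, embed=lambda t: [float(len(t))])
bus.write("note-001", "Task result: success")
```

A real implementation would need rollback or two-phase commit to make the dual write truly atomic; the sketch only shows the ordering.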
The bus implements a three-tiered read hierarchy that balances speed against comprehensiveness:
- Exact Lookup (O(1)): Hash-map lookup by unique ID or title; constant time, returns immediately if found.
- Structured Search (O(log n)): Keyword search across sorted indices; fast for known topics.
- Vector Similarity (O(n)): Semantic similarity against the Supabase pgvector index; the most expensive tier, used only as a fallback.
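The fall-through between the three tiers can be sketched as below. The function signature and the stand-in data structures (a dict for tier 1, a sorted keyword index for tier 2, a similarity callback for tier 3) are illustrative assumptions, not the bus's real interface.

```python
import bisect

def read(query, by_id, keyword_index, notes, similarity):
    """Illustrative three-tier read: exact, structured, then vector."""
    # Tier 1: exact O(1) hash-map lookup by unique ID or title.
    if query in by_id:
        return by_id[query]
    # Tier 2: O(log n) keyword search over a sorted (keyword, note) index.
    keys = [k for k, _ in keyword_index]
    i = bisect.bisect_left(keys, query)
    if i < len(keys) and keys[i] == query:
        return keyword_index[i][1]
    # Tier 3: O(n) similarity scan over all notes, used only as a fallback.
    return max(notes, key=lambda n: similarity(query, n))
```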
Performance targets: write latency under 200 ms at p95, read latency under 50 ms for cache hits, cross-system sync lag below 300 ms. These guarantees underpin the Hebbian learning engine's ability to propagate weight updates in near real-time across the full agent collective.
1.3 The Routing Problem
As the agent population grows, so does the question: which agent should handle a given task? Static assignment (always route task type X to agent Y) is brittle: it cannot adapt when an agent's performance changes or when the distribution of incoming tasks shifts over time. A naive random router wastes capability. The question this work explores is whether a system can learn its routing, building up routing intelligence from the accumulated history of which agents succeeded at which tasks.
Hebbian learning offers a biologically inspired answer to this question.
1.4 Hebbian Learning in Brief
Hebbian learning is rooted in the neuroscience principle: neurons that fire together, wire together. In the context of agent routing, the analogous principle is: agents that succeed together on task types should be more strongly associated with those task types. Formally, after each task completion, the connection weight between a task type and a handling agent is adjusted based on the outcome: stronger if successful, weaker if not.
Whitebook v2 formalized this as a simple binary update rule:
$$w_{t+1} = \max(0,\ w_t + \Delta w)$$
where $\Delta w = +1$ on success and $\Delta w = -1$ on failure. Simple and interpretable, but potentially volatile: a single bad outcome carries the same weight as a single good one, regardless of magnitude.
Whitebook v3 replaces this with a bounded morphological formula:
$$\Delta W = \tanh(a \cdot x \cdot y)$$
where $a$ is a learning rate (default 0.1), $x$ is the task's input signal magnitude (capturing complexity or confidence), and $y$ is the outcome signal (+1 for success, −1 for failure). The hyperbolic tangent bounds all updates to the range [−1, +1], preventing runaway weight accumulation. A single failure cannot destroy accumulated trust; a single success cannot grant permanent dominance.
On failure, an explicit anti-Hebbian update applies:
$$\Delta W = -\eta \quad (\eta = 0.1)$$
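The v3 update rule can be sketched in a few lines. The floor at zero is carried over from the v2 formula and is an assumption here; the function name and signature are illustrative.

```python
import math

def hebbian_update(w, x, success, a=0.1, eta=0.1):
    """v3 weight update: bounded tanh step on success (y = +1),
    fixed anti-Hebbian penalty eta on failure."""
    if success:
        dw = math.tanh(a * x)   # tanh bounds the update to (-1, 1)
    else:
        dw = -eta               # anti-Hebbian: delta W = -eta
    return max(0.0, w + dw)     # floor at 0, per the v2 rule (assumption)
```

Note how a large input signal still cannot produce an update above 1: `tanh` saturates, so no single success grants permanent dominance.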
Weight Intelligence Signal. The routing intelligence accumulated by an agent is measured as its deviation from the cold-start baseline:
$$\text{Intelligence} = |W - 1.0|$$
At cold start, all agents begin at $W = 1.0$: equiprobable selection. As the system learns, weights diverge. High-performing agents accumulate $W \gg 1.0$; poor performers decay toward 0. The magnitude of this deviation is the learned routing signal.
Weight Decay. To prevent cementing outdated associations, weights are continuously pulled back toward the cold-start baseline:
$$W \leftarrow 1.0 + (W - 1.0) \times \alpha \quad (\alpha = 0.995)$$
This ensures agents must continuously prove their value: a connection unused for 30 days loses approximately 5% of its accumulated signal.
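The decay rule and the intelligence signal together are a few lines of code. The function names are illustrative:

```python
def decay(w, alpha=0.995, baseline=1.0):
    """Pull a weight back toward the cold-start baseline each decay step."""
    return baseline + (w - baseline) * alpha

def intelligence(w, baseline=1.0):
    """Routing-intelligence signal: deviation from the cold-start weight."""
    return abs(w - baseline)
```

Each application of `decay` removes 0.5% of the accumulated signal, so an unused connection drifts back toward the equiprobable baseline rather than being cemented in place.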
2. Experimental Setup
2.1 Datasets
All simulations use synthetic datasets designed to test specific aspects of learning and adaptation.
Static datasets consist of fixed relationships between inputs and outputs, used to establish baseline performance comparisons without environmental change. For example, one experiment uses a non-linear function across 1,000 samples with three input features:
$$y = 2x_0^2 - 3x_1 + \sin(x_2) + \varepsilon$$
Dynamic datasets simulate concept drift environments where the underlying relationship between inputs and outputs changes over time. These are structured in three phases:
- Phase 1 (steps 0–333): Linear relationship $f(x) = 2x_0 + 3x_1$
- Phase 2 (steps 334–666): Quadratic relationship $f(x) = -2x_0^2 + x_1$
- Phase 3 (steps 667–1000): Sinusoidal relationship $f(x) = 5\sin(x_2) + x_0$
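A generator for this three-phase stream might look like the sketch below. The uniform input distribution and the function name are assumptions; the phase boundaries and generating functions match the list above.

```python
import math
import random

def generate_drift_dataset(n=1000, seed=42):
    """Synthetic three-phase concept-drift stream (inputs drawn
    uniformly from [-1, 1]; the distribution is an assumption)."""
    rng = random.Random(seed)
    data = []
    for t in range(n):
        x = [rng.uniform(-1.0, 1.0) for _ in range(3)]
        if t <= 333:                      # Phase 1: linear
            y = 2 * x[0] + 3 * x[1]
        elif t <= 666:                    # Phase 2: quadratic
            y = -2 * x[0] ** 2 + x[1]
        else:                             # Phase 3: sinusoidal
            y = 5 * math.sin(x[2]) + x[0]
        data.append((x, y))
    return data
```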
This mirrors the real-world dynamics of Artemis City, where the distribution of task types changes across project phases: more Execute tasks during deployment, more Scaffold tasks during planning, more Summarize tasks during review.
2.2 Models Compared
Experiments compare the following systems:
| System | Description |
|---|---|
| Traditional Inference (k-NN) | Online k-Nearest Neighbor (k=5) retrieves the most similar past examples and averages their outcomes. Expensive but accurate on stable data. |
| Monolithic Learner (MLP) | A single neural network that learns all task types simultaneously. Stable and well-understood, but not specialized. |
| Random Router | Multi-agent system with random agent selection. Serves as a lower-bound baseline. |
| Standard Hebbian | Binary ±1 weight update rule (v2 formula). Simple adaptive routing, no decay. |
| Decay Hebbian | Standard Hebbian with weight decay (α = 0.995) and pruning. Explores the plasticity-stability trade-off. |
| Adaptive Hebbian | Standard Hebbian adapted for dynamic environments with concept drift. |
| Domain-Locked Hebbian (DL) | Agents hard-constrained to ATP ActionType domains. The v3 architectural advance. |
| Dynamic Penalty | Hebbian with escalating penalties for consecutive failures, accelerating domain switching. |
| Context-Aware Hebbian | Hebbian with decay rate modulated by observed error trends (adaptive plasticity). |
2.3 Primary Metric
The primary performance metric throughout is Mean Absolute Error (MAE): the average absolute difference between predicted and actual output values, accumulated over all time steps. Lower is better. Moving average error (MAE computed over rolling windows) is used in time-series plots to reveal adaptation dynamics.
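The rolling-window variant can be computed as below; the window size of 50 is an illustrative assumption, not a value taken from the experiments.

```python
def moving_average_error(errors, window=50):
    """Rolling-window mean absolute error, as used in the
    time-series plots (window size is an assumption)."""
    out = []
    for t in range(len(errors)):
        chunk = errors[max(0, t - window + 1):t + 1]
        out.append(sum(abs(e) for e in chunk) / len(chunk))
    return out
```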
3. Experiment 1: Baseline Comparisons on Static Data
3.1 Setup
The first set of experiments establishes a performance baseline by comparing Traditional Inference (k-NN), Standard Hebbian, and Decay Hebbian on static synthetic data. The goal is to understand the fundamental performance characteristics of each approach before introducing environmental change.
3.2 Results
| Model | Cumulative MAE | Notes |
|---|---|---|
| Traditional Inference (k-NN) | ~4,492 | Best accuracy on stable data |
| Standard Hebbian | ~9,317 | Adaptive, but weaker on static data |
| Decay Hebbian | ~9,820 | Worst performance overall |
3.3 Analysis: The Plasticity-Stability Trade-Off
The most striking finding here is that Decay Hebbian performed worse than Standard Hebbian, despite being designed as an improvement. Understanding why illuminates the central challenge of this research.
The Decay Hebbian model applies a weight decay rate of α = 0.995 with a pruning threshold: connections with low weights are periodically removed. On this static dataset, where the underlying relationship does not change, this decay is counterproductive. The model "forgets" useful associations too quickly (excessive plasticity), preventing it from retaining a stable long-term model of the non-linear function.
This reveals the core tension:
- High plasticity (aggressive decay, fast forgetting) → good at adapting to change, poor at remembering what works
- High stability (slow decay, strong memory) → good at retaining learned patterns, poor at unlearning outdated ones
Standard Hebbian, with no decay, achieves greater stability on this static task and thus outperforms Decay Hebbian. However, k-NN still outperforms both Hebbian variants significantly on static data. The Hebbian models' advantage, if any, must come from dynamic environments.
The key question is: can the right Hebbian configuration outperform k-NN where it matters most, when the environment changes?
4. Experiment 2: Concept Drift (Adaptive Hebbian vs. Baselines)
4.1 Setup
This experiment introduces concept drift through the three-phase dynamic dataset described above. Three systems are compared: Adaptive Hebbian, Random Router, and Monolithic Learner (MLP). Performance is visualized as a Moving Average Error curve across all 1,000 steps, making adaptation dynamics visible at phase transitions.
4.2 Results
The Adaptive Hebbian model drastically outperformed the Random Router, confirming that structured learning and routing provides meaningful value over chance selection. However, it generally exhibited higher error rates and slower reaction times than the Monolithic Learner, which maintained greater stability during phase transitions.
Why would a specialized, adaptive system perform worse than a single monolithic model?
The Monolithic Learner (MLP) updates its parameters continuously via stochastic gradient descent: when the environment shifts, it adjusts immediately across all parameters. It does not have the routing overhead of identifying which agent to call; it simply adjusts itself. This makes it fast to react to concept drift, especially early in each new phase.
The Adaptive Hebbian system, by contrast, must first experience failure with its current routing (the wrong agent for the new phase), then reallocate weight away from the previously favored agent, and finally allow a better-suited agent to accumulate weight. This introduces a switching cost: a lag between when the environment changes and when the routing adapts.
This experiment suggests that naive Hebbian routing does not automatically beat monolithic approaches on concept drift. The architecture matters enormously.
5. Experiment 3: Advanced Architectures for Reducing Switching Cost
5.1 Setup
Having identified switching cost as a key limitation, this experiment tests two advanced Hebbian variants designed to reduce it:
Dynamic Penalty model: Rather than a fixed penalty for each failure, penalties ramp up with consecutive failures. If an agent fails repeatedly, the penalty grows non-linearly, forcing a routing change much earlier than the standard model.
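One simple way to realize an escalating penalty is geometric growth in the failure streak. The `base` and `growth` values here are illustrative hyperparameters, not values from the report:

```python
def dynamic_penalty(consecutive_failures, base=0.1, growth=2.0):
    """Escalating penalty: doubles with each consecutive failure
    (base and growth are illustrative assumptions)."""
    return base * growth ** (consecutive_failures - 1)
```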
Context-Aware model: Monitors the trend in recent error rates and adjusts the global decay rate dynamically. When error spikes (signaling a phase transition), the decay rate increases, making the system more plastic precisely when it needs to adapt. When errors stabilize, decay slows, preserving learned routing.
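A minimal sketch of the decay-modulation logic follows. The spike ratio, the "plastic" alpha value, and the two-window comparison are all assumptions about how such a mechanism could be built:

```python
def modulated_decay_rate(recent_errors, older_errors,
                         stable_alpha=0.995, plastic_alpha=0.95,
                         spike_ratio=1.5):
    """Context-aware plasticity: when the recent mean error spikes
    relative to an older window, lower alpha (faster decay, more
    plasticity). All thresholds are illustrative assumptions."""
    recent = sum(recent_errors) / len(recent_errors)
    older = sum(older_errors) / len(older_errors)
    return plastic_alpha if recent > spike_ratio * older else stable_alpha
```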
A Baseline (fixed decay) model is included for comparison. The simulation tracks both Moving Average Error and, for the Context-Aware model, the active decay rate over time.
5.2 Results
Both advanced mechanisms significantly reduced switching cost compared to the Baseline:
Dynamic Penalty: By ramping up penalties for consecutive failures, the system quickly "fired" the failing expert during a phase transition (e.g., Linear → Quadratic), forcing a routing change much earlier than the fixed-penalty baseline.
Context-Aware: By detecting the error spike associated with concept drift and temporarily increasing the decay rate, this model achieved faster exploration of alternative agents. Once the new best agent established itself, the decay rate settled back down concentrating plasticity where and when it was needed.
A key observable was the Active Decay Rate plot for the Context-Aware model, which showed sharp increases precisely at phase boundaries (steps ~334 and ~667). This is emergent behavior: the model was not told when phases changed; it discovered the transitions through error signals.
5.3 Remaining Questions
While both advanced architectures reduced switching cost, the experiments are exploratory and several questions remain open:
- How sensitive are the Dynamic Penalty and Context-Aware models to their hyperparameters (penalty growth rate, error window size)?
- Do these improvements hold across datasets with different drift rates or more gradual transitions?
- Is the Context-Aware model's decay modulation mechanism stable under adversarial inputs that produce spurious error spikes?
6. The Domain-Locked Architecture (Whitebook v3)
6.1 The Core Insight
The experiments above treat all tasks as belonging to a single pool. The central architectural innovation of Whitebook v3 is the recognition that ATP ActionType is a domain boundary, not just metadata.
Each ActionType corresponds to a structurally distinct class of computation:
| ActionType | Domain Function | Character |
|---|---|---|
| Execute | $f(x) = 2x_0 + 3x_1$ | Linear; direct computation |
| Scaffold | $f(x) = -2x_0^2 + x_1$ | Quadratic; structural planning |
| Summarize | $f(x) = 5\sin(x_2) + x_0$ | Sinusoidal; pattern extraction |
| Reflect | $f(x) = x_0^2 + \sin(x_1) + x_2$ | Mixed nonlinear; meta-cognitive |
A summarizer does not research. A planner does not execute. In an unconstrained Hebbian marketplace, agents from one domain can pollute routing in another: a strong Execute agent might "steal" Scaffold tasks it cannot handle well. Domain-locking eliminates this cross-domain interference entirely.
The domain-locked selection rule is:
$$P(\text{select}_i \mid \text{task\_type}_t) = 1 \quad \text{if}\ W_{i,t} = \max(W_{\text{domain}_t})$$
Only agents within the correct domain compete. Among those, the highest-weight agent wins. The ActionType is declared in the ATP payload, not inferred: the parser reads the ActionType field from the structured message and routes directly to the appropriate domain pool. This is O(1) routing: a hash table lookup followed by a max-weight selection within a small pool of typically 3 agents.
Default architecture: 4 domains × 3 agents per domain = 12 total agents.
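The selection rule reduces to very little code. The pool contents below are illustrative (two of the four domains, made-up weights); the hash-lookup-then-argmax structure is the point:

```python
# Illustrative domain pools: ActionType -> {agent_id: Hebbian weight}.
# Agent names and weights are assumptions, not simulation values.
pools = {
    "Execute":  {"executor_01": 1.4, "executor_02": 0.9, "executor_03": 1.1},
    "Scaffold": {"scaffold_01": 1.0, "scaffold_02": 1.2, "scaffold_03": 0.8},
}

def route(action_type):
    """Domain-locked selection: O(1) pool lookup, then max weight wins."""
    pool = pools[action_type]        # hash lookup by declared ActionType
    return max(pool, key=pool.get)   # highest-weight agent in the domain
```

Because the pool is small (typically 3 agents), the argmax is effectively constant time as well.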
6.2 Agent Registry
Every agent in Artemis City is registered with an explicit domain assignment at initialization. A representative agent profile looks like:
```json
{
  "id": "executor_01",
  "domain": "Execute",
  "capabilities": ["linear_computation", "data_processing"],
  "sandbox_level": "strict",
  "trust_threshold": 0.75,
  "hebbian_weight": 1.0
}
```
The registry maintains not just static capability declarations but also dynamic state: whether an agent is idle or busy, resource quotas, and task history. This allows the router to match tasks to agents who can handle them and are available to do so. The design is analogous to an operating system process scheduler layered on top of a service directory.
6.3 Simulation Results
The v4 simulation (1,000 tasks, three concept drift phases, random seed 42, 600-sample domain-specific pre-training per agent) produced the following comparison:
| Condition | Total MAE | vs DL Trained |
|---|---|---|
| DL Trained (3 agents/domain) | 1,938 | baseline |
| DL Cold (untrained weights, domain-lock only) | 1,967 | +1.5% |
| Unconstrained Marketplace | 10,289 | +431% |
| Single MLP (monolithic) | 9,617 | +396% |
| k-NN Optimized | 10,087 | +420% |
Domain-locked routing achieves 81.2% lower MAE than the unconstrained marketplace, 79.8% lower than a single monolithic MLP, and 80.8% lower than optimized k-NN while operating at 180× lower computational cost than k-NN.
6.4 Architecture Is the Primary Driver
A particularly important finding is the decomposition of gains:
- Domain-locking alone (cold start, no pre-training) reduces MAE from 10,289 → 1,967. The structural constraint provides the majority of the improvement.
- Adding domain-specific pre-training (600 samples/agent) reduces MAE further, from 1,967 → 1,938, a meaningful but incremental gain.
This implies that the architecture is the primary driver, not the volume of training data. The domain boundaries prevent cross-domain interference that degrades unconstrained systems.
6.5 Robustness to Mislabeling
A practical concern for any domain-locked system is: what happens when tasks are assigned to the wrong domain? The simulations tested progressively higher mislabel rates:
| Condition | MAE | vs Trained Baseline | Still Beats MLP (9,617)? |
|---|---|---|---|
| DL Trained (base) | 1,938 | baseline | ✓ |
| 20% Mislabel | ~1,980 | +2.2% | ✓ |
| 40% Mislabel | ~2,100 | +8.4% | ✓ |
| 80% Skewed Distribution | ~2,180 | +12.5% | ✓ |
Even with 40% of tasks routed to the wrong domain, the architecture still substantially outperforms a monolithic MLP. The practical implication: the system needs only >60% ActionType classification accuracy to retain its advantage, a threshold achievable by any competent ATP parser.
6.6 Within-Domain Competition
Within each domain, the Hebbian weight mechanism produces a natural competitive dynamic. Simulations tested varying numbers of agents per domain:
| Agents/Domain | MAE | vs 3/Domain |
|---|---|---|
| 1/domain (monopoly) | 1,967 | +1.5% |
| 3/domain (default) | 1,938 | baseline |
| 5/domain (competitive) | 1,906 | −1.7% |
An important emergent property: within each domain, a 100% monopoly eventually forms. One agent captures all routing weight through consistent performance and wins every selection. The 1→3 jump shows that competition provides selection pressure; the 3→5 jump shows diminishing returns. Three agents per domain is the practical sweet spot: enough competition to surface the best performer, without unnecessary overhead.
7. Active Sentinel & Immune System
Whitebook v3 elevates the sentinel from a passive monitoring layer to an active immune system, one that not only detects routing pathologies but intervenes in real time to correct them.
7.1 Oscillation Detection
The sentinel monitors a rolling window of prediction errors within each domain. The key metric is the sign-change rate: how often consecutive errors alternate direction:
$$\text{oscillation\_rate} = \frac{\operatorname{count}\left(\operatorname{sign}(e_t) \neq \operatorname{sign}(e_{t-1})\right)}{\text{window\_size}}$$
Parameters: window size of 30 tasks, oscillation threshold of 0.35 (a 35% sign-change rate triggers intervention). High oscillation indicates that the currently selected agent is producing inconsistent results (sometimes good, sometimes bad), suggesting it may be near the boundary of its competence or receiving adversarial inputs.
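The metric above is straightforward to compute. Treating a zero error as positive is an assumption; the function name is illustrative:

```python
def oscillation_rate(errors, window=30):
    """Fraction of consecutive error pairs in the window whose signs
    differ (zero treated as positive, an assumption)."""
    w = errors[-window:]
    flips = sum((a >= 0) != (b >= 0) for a, b in zip(w, w[1:]))
    return flips / len(w)
```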
7.2 Active Rerouting
When the oscillation threshold is exceeded, the sentinel executes a reroute:
```python
if oscillation_rate > threshold:           # threshold = 0.35
    dominant_agent = argmax(W_domain)      # current max-weight agent in domain
    W[dominant_agent] *= reroute_penalty   # reroute_penalty = 0.5: halve weight
    reroutes += 1
```
This halves the dominant agent's weight, temporarily equalizing the competitive landscape and forcing the router to explore alternatives. The penalty is not permanent: if the dominant agent truly is the best performer, it will re-accumulate weight through subsequent successes.
7.3 Simulation Results (v5 Test 1)
| Metric | Passive (v4) | Active (v5) |
|---|---|---|
| Total MAE | baseline | −17 (a 0.9% improvement) |
| Total reroutes | 0 | 16 |
| Reroute concentration | — | 100% in Scaffold domain |
All 16 reroutes occurred in the Scaffold domain, which uses the quadratic generating function, the most volatile domain. The sentinel correctly identified where intervention was needed and left stable domains untouched. This is emergent behavior: the sentinel discovers which domains are pathological through the oscillation signal rather than through any programmed domain knowledge.
7.4 The Immune System Analogy
The sentinel embodies a feedback loop that mirrors biological immune response:
- Agent fails → error oscillation increases
- Sentinel detects oscillation → reroutes to alternative
- Alternative succeeds → accumulates weight via Hebbian update
- Original agent's weight decays → system learns to avoid it
- If original agent improves → it can earn weight back
Every failure teaches. The rerouting mechanism is not a punishment; it is an invitation to prove capability in a more competitive landscape.
8. System Resilience Properties
Beyond performance accuracy, Whitebook v3 documents several resilience properties of the Hebbian marketplace that have no clear equivalent in monolithic approaches.
8.1 Corpus Corruption Resistance
When a single Scaffold-domain agent's corpus is corrupted with 100 garbage samples at task #300, the Hebbian marketplace absorbs only −1.0% damage (MAE actually improves slightly as the corrupted agent is deselected). The monolithic MLP experiences +0.8% permanent degradation with no recovery mechanism.
The corrupted agent's predictions immediately worsen → the Hebbian anti-update penalizes its weight → its weight falls below competitors' → the agent receives no further assignments. Damage is contained to a single node in the routing graph rather than propagating through the entire system.
8.2 Missing Agent Flow Detection
When a new task type ("Optimize"), one no agent has been trained on, begins appearing at 30% frequency at task #500, the failure rate spikes from 0.049 to 0.353: a 7.2× increase. This is an unmistakable signal that a new, unhandled capability gap has emerged, triggering an expansion workflow to register and train new agent flows.
This is how Artemis City grows organically: not through manual configuration, but through failure-driven expansion.
8.3 Domain Ceiling Detection
A third resilience property is the system's ability to detect when a domain's agents have hit the limit of their capability. In a simulation where Execute-domain tasks progressively increased in complexity (nonlinearity factor growing by 0.003 per task after step #400), performance degraded predictably:
| Quartile | Execute MAE | Complexity Factor |
|---|---|---|
| Q1 (simplest) | 1.086 | 0.000 |
| Q2 | ~3.5 | ~0.3 |
| Q3 | ~6.5 | ~0.9 |
| Q4 (hardest) | 9.624 | ~1.8 |
A ceiling was detected at Execute task #67, the point where error exceeded 3× the baseline average. This triggers an expansion signal: the domain needs more capable agents or a new sub-domain specialization. Domain ceiling triggers expansion, not failure. The architecture grows organically in response to capability gaps rather than degrading silently.
8.4 Learning Velocity
After a failure event, Hebbian agents recover (3 consecutive successes below threshold) in 4.1–4.6 steps. Monolithic MLPs require 17–24 steps for equivalent recovery: 4–5× slower.
The Hebbian system's failure triggers an immediate routing response: the failing agent loses weight, competitors gain opportunity. The MLP must retrain its entire parameter space, a fundamentally slower operation.
9. The Hebbian + k-NN Reconciliation Layer
One of the more practically significant findings of Whitebook v3 is that Hebbian and k-NN need not be treated as competitors. The reconciliation architecture positions Hebbian routing as a cheap elimination layer that filters options before expensive k-NN verification.
```text
Layer 1: Hebbian Domain-Locked Router (O(1))
    → selects best agent in domain by weight
    → produces prediction

Layer 2: k-NN Verification (O(W))
    → k=5 nearest neighbors in a W=200 step window
    → produces independent prediction

Reconciliation:
    if |heb_pred - knn_pred| < threshold (3.0):
        AGREE → use the cheap Hebbian answer
    else:
        DISAGREE → weighted average based on Hebbian confidence
```
The key empirical finding: when Hebbian and k-NN disagree, Hebbian is correct 94% of the time. This is because domain-locked agents accumulate specialized knowledge through their weight history that general-purpose nearest-neighbor lookup cannot replicate.
The reconciled system operates at 71.9% of pure k-NN cost (28.1% savings) while achieving better accuracy than either system alone. Agreement rate is ~85%; only ~15% of decisions invoke the expensive k-NN path.
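The reconciliation decision itself is a one-branch function. The linear confidence-weighted blend shown here is an assumption; the report specifies only "weighted average based on Hebbian confidence":

```python
def reconcile(heb_pred, knn_pred, heb_confidence, threshold=3.0):
    """Within threshold: keep the cheap Hebbian answer. Otherwise blend,
    weighting by Hebbian confidence (exact blend is an assumption)."""
    if abs(heb_pred - knn_pred) < threshold:
        return heb_pred                      # AGREE: skip nothing extra
    return heb_confidence * heb_pred + (1 - heb_confidence) * knn_pred
```

A high Hebbian confidence default is consistent with the 94% disagreement-win rate reported above, though the mapping from weight history to a confidence value is not specified in the source.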
10. Open Questions and Next Steps
This body of work is explicitly exploratory. The following questions are not yet resolved:
On architecture:
- The experiments use synthetic data with known generating functions. How does domain-locked routing perform on real-world task distributions where domain boundaries are fuzzier?
- The 3-agents-per-domain configuration is identified as a practical sweet spot, but this was tested under specific drift conditions. Does optimal pool size vary with drift rate?
On the plasticity-stability trade-off:
- The Context-Aware and Dynamic Penalty models show promise in reducing switching cost, but have been validated on a limited class of concept drift patterns. More adversarial drift profiles (e.g. gradual drift, oscillating drift, multi-domain simultaneous drift) remain untested.
- The Decay Hebbian model's failure on static data raises a question for dynamic data: is there a decay rate that optimally balances plasticity and stability across varied task distributions?
11. Summary
This report documents a progression of simulation experiments exploring Hebbian learning as an adaptive routing mechanism for multi-agent systems. The key findings, stated without overreach, are:
The plasticity-stability trade-off is real and consequential. Aggressive decay (Decay Hebbian) hurt performance on static data. Context-aware and dynamic penalty mechanisms reduce this cost on dynamic data but have not been exhaustively validated.
Naive Hebbian routing does not automatically beat monolithic baselines on concept drift. The switching cost (the lag between environmental change and routing adaptation) is a genuine liability that requires architectural attention.
Domain-locking is the most impactful architectural intervention explored. Constraining agents to ATP ActionType domains eliminates cross-domain interference and produces an 80%+ MAE improvement over unconstrained routing, with O(1) computational cost. The architecture itself, not training data volume, is the primary driver of this improvement.
The Active Sentinel adds a self-correcting immune layer. By monitoring oscillation rates within each domain, the sentinel detects when a routing choice is pathological and intervenes, halving the dominant agent's weight to force exploration. In simulation, all 16 sentinel interventions targeted the most volatile domain (Scaffold) without requiring any manual configuration. The system learns where it is sick.
The Hebbian marketplace has emergent resilience properties (automatic deselection of corrupted agents, failure-rate-based detection of capability gaps, domain ceiling signals that trigger organic expansion, and 4–5× faster recovery velocity than monolithic alternatives) that are not achievable by design in single-model systems.
Reconciliation with k-NN offers a cost-effective path to combining the cheap adaptability of Hebbian routing with the verified accuracy of nearest-neighbor inference, operating at 71.9% of pure k-NN cost. When they disagree, Hebbian is right 94% of the time.
The Memory Bus provides the infrastructure backbone that makes all of the above possible at scale: atomic write-through synchronization, a tiered read hierarchy, and near-real-time weight propagation across the full agent collective.
This document was co-authored with AI assistance. All simulation data drawn from Collab docs available on GitHub (February 2026).