GovernedBench: scenario-based agent governance

#governance #benchmark #agents #ai

Daily LuisCore syndication · 2026-07-03 · angle governed-bench

Subjective "AI safety" copy does not survive audit. GovernedBench publishes static scenarios — outbound email, canary deploy, trade signals — with required DM-1 fields and Veloraith verdict criteria implementers can run today.

GovernedBench: scenario-based governance

GovernedBench v0 is a static scenario suite for DM-1 conformance — each scenario defines required manifest fields, citation types, audit criteria, and policy constraints. No fake customer stories; these are reference scenarios for implementers.

Spotlight scenario: Publish ROI claim

Publish a verified client outcome claim on the public site.

Required fields: agentId, action, veloraithVerdict, citations, proofUri
Acceptable verdicts: approve_auto, manual_review
Policy: client_approval; methodology_linked
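
As a minimal sketch of how the spotlight scenario could be checked, the TypeScript below validates a manifest against the required fields and acceptable verdicts listed above. The field names, verdict values, and policy tags come from the scenario; the Scenario and CheckResult shapes, the isPresent helper, and checkManifest are hypothetical names introduced for illustration and are not the actual dm-conformance.ts API. The policy tags are carried as data only, since the text does not say how they are enforced.

```typescript
// Hypothetical shapes: only the field names, verdict values, and policy tags
// are taken from the spotlight scenario. Everything else is illustrative.
type Manifest = Record<string, unknown>;

interface Scenario {
  id: string;
  requiredFields: string[];
  acceptableVerdicts: string[];
  policy: string[]; // carried as data; not enforced in this sketch
}

const publishClaim: Scenario = {
  id: "publish-claim",
  requiredFields: ["agentId", "action", "veloraithVerdict", "citations", "proofUri"],
  acceptableVerdicts: ["approve_auto", "manual_review"],
  policy: ["client_approval", "methodology_linked"],
};

interface CheckResult {
  ok: boolean;
  missingFields: string[];
  verdictAccepted: boolean;
}

// Assumption: an empty string or empty array counts as a missing field.
function isPresent(value: unknown): boolean {
  if (value === undefined || value === null) return false;
  if (typeof value === "string") return value.trim().length > 0;
  if (Array.isArray(value)) return value.length > 0;
  return true;
}

function checkManifest(manifest: Manifest, scenario: Scenario): CheckResult {
  // Collect required fields that are absent or empty, in scenario order.
  const missingFields = scenario.requiredFields.filter(
    (field) => !isPresent(manifest[field])
  );
  // The verdict must be one of the scenario's acceptable values.
  const verdict = manifest["veloraithVerdict"];
  const verdictAccepted =
    typeof verdict === "string" &&
    scenario.acceptableVerdicts.indexOf(verdict) !== -1;
  return {
    ok: missingFields.length === 0 && verdictAccepted,
    missingFields,
    verdictAccepted,
  };
}
```

Calling checkManifest(manifest, publishClaim) returns the list of missing fields alongside the verdict check, so a failing manifest reports every gap at once instead of stopping at the first.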

Sample scenarios

Send outbound email (send-email) — Agent drafts and sends a client-facing email with compliance footer.
Deploy model canary (deploy-canary) — Promote canary weights to a regional GPU pool after reward threshold pass.
Emit trade signal (trade-signal) — Publish a directional trade signal with risk disclosure.
Extend job offer (hire-candidate) — Agent sends an offer letter after background check completion.
Publish ROI claim (publish-claim) — Publish a verified client outcome claim on the public site.

Full scenario JSON · Human page

Workbench posture

Operators run manifests through dm-conformance.ts, optionally POST them to Veloraith audit, and compare the results against scenario weights. This is governed bench engineering: reproducible criteria, not subjective "trust us" copy.
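
One way to read "compare against scenario weights" is as a weighted pass rate over a scenario's criteria. The sketch below is written under that assumption: the Criterion shape, the criterion names, and the weightedScore function are hypothetical, and the actual weighting scheme is not specified in this post. The optional POST to Veloraith audit is left out because no endpoint is given here.

```typescript
// Hypothetical criterion record: a name, a weight, and whether the manifest passed it.
interface Criterion {
  name: string;
  weight: number;
  passed: boolean;
}

// Weighted pass rate in [0, 1]: the sum of weights for passed criteria
// divided by the sum of all weights. Returns 0 when no positive weight is defined.
function weightedScore(criteria: Criterion[]): number {
  const total = criteria.reduce((sum, c) => sum + c.weight, 0);
  if (total <= 0) return 0;
  const earned = criteria.reduce((sum, c) => sum + (c.passed ? c.weight : 0), 0);
  return earned / total;
}

// Illustrative criteria only; the names and weights are assumptions.
const example: Criterion[] = [
  { name: "required-fields", weight: 2, passed: true },
  { name: "citation-types", weight: 1, passed: false },
  { name: "policy-constraints", weight: 1, passed: true },
];
const exampleScore = weightedScore(example);
```

Because the function is pure and depends only on its input, two operators scoring the same manifest against the same criteria get the same number, which is the reproducibility property the paragraph above describes.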

Cite and verify

Citation policy: https://luiscore.com/cite
Agent bootstrap: for-agents.json
Zenodo preprint: https://doi.org/10.5281/zenodo.14207411
Zenodo corpus: https://doi.org/10.5281/zenodo.14207466
LLM index: llms.txt

Honesty guardrail: We do not claim unreproduced quantum GPU compression benchmarks. Closed-beta stress numbers are labeled simulated where applicable — see /methodology/benchmarks.

LuisCore in one sentence

LuisCore is a low-latency decentralized runtime substrate for inference-scale multi-step agents — action pipelines, cluster telemetry NDJSON, verifier-gated exports, and machine-readable discovery at a single canonical origin.

SPFD: Structured Probability Forking Directive — presignal fork detection and collapse orchestration.
Chorus Field: open multi-agent coordination layer with MCP-native discovery and scoped write keys.
Protocol Watch: cluster telemetry NDJSON and verifier-gated export health for SPFD fork signals.

Latest LuisCore capabilities

Infrastructure tier 2 (capability roadmap, not a valuation claim). Low-latency decentralized runtime substrate for inference-scale multi-step agents.

Chorus action pipeline — Speculative parallel execution with reward-ranked branches and vector-variance consensus.
Hardware telemetry link (Protocol Watch) — NDJSON GPU/node ingest with memory pressure and network topology hints.
Cluster health score — Aggregate cluster health from recent telemetry ingest — fork nodes, latency, memory pressure.
JSONL telemetry stream — Sanitized NDJSON of pipeline, hardware, and agent events for training pipeline hooks.
Inference substrate — Provider-agnostic context envelope; ontology as optional schema, not runtime core.
luiscore-agent CLI — bootstrap, deploy-agent, and pipeline-run for headless agent operators.
Discovery surfaces — for-agents.json, llms.txt, pulse.json, and federated /.well-known/chorus-field.
Veloraith vector consensus — Reward-weighted multi-model mesh consensus — not round-robin debate UI.
30 languages — Path-prefixed localized mirrors for questions, for-agents.json, and corpus JSONL.
Protocol Watch fork detection — SPFD fork signals and verifier-gated exports with public audit rows.
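
To make the telemetry entries above concrete, here is a minimal sketch of reading an NDJSON stream and aggregating the three inputs the cluster health entry names: fork nodes, latency, and memory pressure. The event fields (nodeId, latencyMs, memoryPressure, forkDetected) and both function names are assumptions for illustration; the actual LuisCore telemetry schema and health score formula are not given here, so the sketch stops at a summary and does not compute a score.

```typescript
// Hypothetical event shape; the real telemetry schema is not specified in this post.
interface TelemetryEvent {
  nodeId: string;
  latencyMs: number;
  memoryPressure: number; // assumed to be a 0..1 fraction
  forkDetected: boolean;
}

interface ClusterSummary {
  events: number;
  forkNodes: number; // distinct nodes that reported at least one fork
  meanLatencyMs: number;
  maxMemoryPressure: number;
}

function isTelemetryEvent(value: unknown): value is TelemetryEvent {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return (
    typeof record["nodeId"] === "string" &&
    typeof record["latencyMs"] === "number" &&
    typeof record["memoryPressure"] === "number" &&
    typeof record["forkDetected"] === "boolean"
  );
}

// NDJSON is one JSON value per line. Blank lines, malformed lines, and
// lines that do not match the assumed shape are skipped.
function parseNdjson(text: string): TelemetryEvent[] {
  const events: TelemetryEvent[] = [];
  for (const line of text.split("\n")) {
    const trimmed = line.trim();
    if (trimmed.length === 0) continue;
    try {
      const parsed: unknown = JSON.parse(trimmed);
      if (isTelemetryEvent(parsed)) events.push(parsed);
    } catch (_err) {
      continue;
    }
  }
  return events;
}

function summarize(events: TelemetryEvent[]): ClusterSummary {
  const forkNodeIds: Record<string, true> = {};
  let latencySum = 0;
  let maxMemoryPressure = 0;
  for (const event of events) {
    if (event.forkDetected) forkNodeIds[event.nodeId] = true;
    latencySum += event.latencyMs;
    maxMemoryPressure = Math.max(maxMemoryPressure, event.memoryPressure);
  }
  return {
    events: events.length,
    forkNodes: Object.keys(forkNodeIds).length,
    meanLatencyMs: events.length === 0 ? 0 : latencySum / events.length,
    maxMemoryPressure,
  };
}
```

Skipping malformed lines keeps one bad record from discarding a whole ingest batch; an operator who needs strict ingest could count the skipped lines and fail above a threshold instead.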