<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Saravanan Jaichandaran</title>
    <description>The latest articles on DEV Community by Saravanan Jaichandaran (@saravananj2294).</description>
    <link>https://dev.to/saravananj2294</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3943319%2F74793650-b3ea-4bef-b3bb-96dd9417e39e.jpg</url>
      <title>DEV Community: Saravanan Jaichandaran</title>
      <link>https://dev.to/saravananj2294</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/saravananj2294"/>
    <language>en</language>
    <item>
      <title>Five primitives I exercised end-to-end on world-model-mcp's own repo</title>
      <dc:creator>Saravanan Jaichandaran</dc:creator>
      <pubDate>Thu, 21 May 2026 05:36:21 +0000</pubDate>
      <link>https://dev.to/saravananj2294/five-primitives-i-exercised-end-to-end-on-world-model-mcps-own-repo-moo</link>
      <guid>https://dev.to/saravananj2294/five-primitives-i-exercised-end-to-end-on-world-model-mcps-own-repo-moo</guid>
      <description>&lt;p&gt;I shipped four releases of world-model-mcp in twelve days, from v0.6.1 to v0.7.2. The pitch is "AI coding agents lose context across compaction, repeat the same mistakes, and hallucinate APIs that do not exist." Before writing more about it, I wanted to demonstrate the primitives on a real codebase with real outputs, not screenshots you have to take my word for.&lt;/p&gt;

&lt;p&gt;The codebase is the project's own repo. I ran python -m world_model_server.cli setup (it auto-seeded 598 entities from the source), then ran scripts/demo_seed.py. That script inserts the small set of constraints, facts, and a compaction audit row that real PostToolUse / record_correction hook activity would write organically over one to two weeks of development with Claude Code installed.&lt;/p&gt;

&lt;p&gt;Every output block below is verbatim from the actual SQLite database after running the actual command. You can reproduce every output here by cloning the repo, running python -m world_model_server.cli setup, then python scripts/demo_seed.py. The script is idempotent and supports --dry-run and --reset.&lt;/p&gt;

&lt;p&gt;Install: pip install world-model-mcp. Source: github.com/SaravananJaichandar/world-model-mcp.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. A learned constraint denying an edit at the PreToolUse boundary&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When a developer corrects the agent (for example, rewrites console.log to logger.debug), the PostToolUse hook records the diff and infers a rule. Once that rule's violation count crosses the hard threshold (severity=error, count ≥ 3), the next attempt is denied at PreToolUse before the tool runs.&lt;/p&gt;

&lt;p&gt;The constraint as the graph stores it:&lt;br&gt;
{&lt;br&gt;
  "rule_name": "no-console-log",&lt;br&gt;
  "severity": "error",&lt;br&gt;
  "violation_count": 5,&lt;br&gt;
  "description": "Use logger.debug() not console.log() in TypeScript source. Production logs route through pino; console.log bypasses formatting and breaks downstream parsers.",&lt;br&gt;
  "file_pattern": "*.ts",&lt;br&gt;
  "examples": [&lt;br&gt;
    {"incorrect": "console.log", "correct": "logger.debug"}&lt;br&gt;
  ]&lt;br&gt;
}&lt;br&gt;
The PreToolUse hook's actual JSON response when an edit containing console.log reaches it:&lt;br&gt;
{&lt;br&gt;
  "hookSpecificOutput": {&lt;br&gt;
    "hookEventName": "PreToolUse",&lt;br&gt;
    "permissionDecision": "deny",&lt;br&gt;
    "permissionDecisionReason": "Hard constraint violation: no-console-log (Use logger.debug() not console.log() in TypeScript source. Production logs route through pino; console.log bypasses formatting and breaks downstream parsers.). Violated 5 times previously."&lt;br&gt;
  },&lt;br&gt;
  "violations": [&lt;br&gt;
    {&lt;br&gt;
      "rule": "no-console-log",&lt;br&gt;
      "severity": "error",&lt;br&gt;
      "violation_count": 5,&lt;br&gt;
      "is_hard": true,&lt;br&gt;
      "is_defer": false&lt;br&gt;
    }&lt;br&gt;
  ]&lt;br&gt;
}&lt;/p&gt;

&lt;p&gt;Rules in CLAUDE.md or AGENTS.md are advisory, and the model treats them as suggestions. Rules with a violation count and an enforcement boundary at the edit step are binding. Both come from the same source, a developer correcting the agent, but they have very different effects.&lt;/p&gt;
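
&lt;p&gt;To make the deny path concrete, here is a minimal sketch of how a constraint like the one above could turn into that response. It is not the project's actual code: the function name pre_tool_use_response and the plain substring match are assumptions for illustration, and only the threshold (severity=error, count of 3 or more) and the payload shape come from the text and output above.&lt;/p&gt;

```python
import json

# Threshold taken from the prose above: severity=error and 3 or more violations.
HARD_SEVERITY = "error"
HARD_THRESHOLD = 3


def pre_tool_use_response(constraint, proposed_text):
    """Return a deny payload if the edit hits a hard constraint, else None.

    Hypothetical helper: matching is a plain substring check on the
    "incorrect" examples, which is an assumption, not the real matcher.
    """
    hits = any(ex["incorrect"] in proposed_text for ex in constraint["examples"])
    is_hard = (
        constraint["severity"] == HARD_SEVERITY
        and constraint["violation_count"] >= HARD_THRESHOLD
    )
    if not (hits and is_hard):
        return None
    reason = (
        f"Hard constraint violation: {constraint['rule_name']} "
        f"({constraint['description']}). "
        f"Violated {constraint['violation_count']} times previously."
    )
    return {
        "hookSpecificOutput": {
            "hookEventName": "PreToolUse",
            "permissionDecision": "deny",
            "permissionDecisionReason": reason,
        }
    }


# Constraint fields copied from the stored constraint (description shortened).
constraint = {
    "rule_name": "no-console-log",
    "severity": "error",
    "violation_count": 5,
    "description": "Use logger.debug() not console.log() in TypeScript source.",
    "examples": [{"incorrect": "console.log", "correct": "logger.debug"}],
}

response = pre_tool_use_response(constraint, 'console.log("user loaded")')
print(json.dumps(response, indent=2))
```

&lt;p&gt;An edit that uses logger.debug instead returns None in this sketch, so the tool call would proceed.&lt;/p&gt;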

&lt;p&gt;&lt;strong&gt;2. A regression warning that flags edits to a file with a recorded bug fix&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;get_related_bugs walks decision traces and prior bug-fix facts. When validate_change runs on a file with a recorded fix, the related-bugs query surfaces the prior fix and flags the proposed change.&lt;/p&gt;

&lt;p&gt;The project has a recorded bug fix at world_model_server/knowledge_graph.py:120-135 for content-hash backfill (the migration logic must run on every initialize(), not just when the column is created). I proposed a refactor that removed the backfill loop and ran the related-bugs check:&lt;br&gt;
{&lt;br&gt;
  "risk_score": 0.6,&lt;br&gt;
  "bugs": [&lt;br&gt;
    {&lt;br&gt;
      "bug_id": "12457e2a-5638-46ec-a9df-02fe13b9c104",&lt;br&gt;
      "description": "Bug fix: NULL content_hash backfill must run on every initialize() to cover post-migration inserts. Earlier code only backfilled when the column was created, which left merge_from rows un-hashed and broke dedup.",&lt;br&gt;
      "fixed_at": "2026-05-10T10:17:51.737046",&lt;br&gt;
      "critical_regions": [&lt;br&gt;
        {"file": "world_model_server/knowledge_graph.py", "lines": "120-135"}&lt;br&gt;
      ]&lt;br&gt;
    }&lt;br&gt;
  ],&lt;br&gt;
  "warnings": [&lt;br&gt;
    "Lines 120-135 preserve fix for 12457e2a-5638-46ec-a9df-02fe13b9c104: Bug fix: NULL content_hash backfill must run on every initialize() to cover post-migration inserts. Earlier code only backfilled when the column was created, which left merge_from rows un-hashed and broke dedup."&lt;br&gt;
  ]&lt;br&gt;
}&lt;br&gt;
The risk score is 0.6 because the proposed change touched a critical region without re-implementing the fix. The warning text quotes the original bug description directly so the agent (or the human) can see why the region matters, not just that it does.&lt;/p&gt;
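
&lt;p&gt;The core of a check like this is a line-range overlap test between the proposed edit and each recorded critical region. The sketch below shows only that step, under assumptions: overlapping_regions and parse_range are hypothetical names, the edited range 118-140 is made up, and how the real tool derives the 0.6 risk score is not covered.&lt;/p&gt;

```python
def parse_range(text):
    """Turn a "120-135" style string into a (start, end) tuple of ints."""
    start, end = text.split("-")
    return int(start), int(end)


def overlapping_regions(file_path, edit_start, edit_end, bugs):
    """List (bug_id, lines) for critical regions the edited range touches."""
    hits = []
    for bug in bugs:
        for region in bug["critical_regions"]:
            if region["file"] != file_path:
                continue
            start, end = parse_range(region["lines"])
            # Two inclusive ranges overlap when each one ends at or after
            # the point where the other begins.
            if end >= edit_start and edit_end >= start:
                hits.append((bug["bug_id"], region["lines"]))
    return hits


# Bug record trimmed from the output above.
bugs = [
    {
        "bug_id": "12457e2a-5638-46ec-a9df-02fe13b9c104",
        "critical_regions": [
            {"file": "world_model_server/knowledge_graph.py", "lines": "120-135"}
        ],
    }
]

path = "world_model_server/knowledge_graph.py"
touched = overlapping_regions(path, 118, 140, bugs)  # hypothetical edit range
untouched = overlapping_regions(path, 200, 210, bugs)
print(touched, untouched)
```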

&lt;p&gt;&lt;strong&gt;3. A contradiction resolved by confidence + source-count weighting&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The temporal layer assigns each fact a confidence score, a source_count, and a valid_at timestamp. When two facts about the same entity disagree, find_contradictions surfaces them with both sides' metadata, and resolve_contradiction picks a winner using the strategy you set.&lt;/p&gt;

&lt;p&gt;Two facts both pointing at the same entity (http_transport_port):&lt;br&gt;
{&lt;br&gt;
  "fact_a_id": "e4b2ff84-8c23-4de5-aa9e-8bbb045a4ed5",&lt;br&gt;
  "fact_b_id": "7fe854f9-d64a-4304-b43a-7d1b126c6ebb",&lt;br&gt;
  "fact_a_text": "HTTP transport listen port default is 8080",&lt;br&gt;
  "fact_b_text": "HTTP transport listen port default is 8765",&lt;br&gt;
  "similarity_score": 0.929,&lt;br&gt;
  "both_valid": true,&lt;br&gt;
  "reason": "same entity, similar text",&lt;br&gt;
  "confidence_a": 0.7,&lt;br&gt;
  "confidence_b": 0.95,&lt;br&gt;
  "source_count_a": 1,&lt;br&gt;
  "source_count_b": 3&lt;br&gt;
}&lt;br&gt;
resolve_contradiction(strategy="auto") picks the strategy with the largest signal gap. Here source count differs 3:1, so it picks keep_most_sources:&lt;br&gt;
{&lt;br&gt;
  "strategy": "keep_most_sources",&lt;br&gt;
  "winner_id": "7fe854f9-d64a-4304-b43a-7d1b126c6ebb",&lt;br&gt;
  "loser_id": "e4b2ff84-8c23-4de5-aa9e-8bbb045a4ed5",&lt;br&gt;
  "resolved_at": "2026-05-21T10:24:16.287368"&lt;br&gt;
}&lt;br&gt;
The loser is updated in place:&lt;br&gt;
{&lt;br&gt;
  "id": "e4b2ff84-8c23-4de5-aa9e-8bbb045a4ed5",&lt;br&gt;
  "fact_text": "HTTP transport listen port default is 8080",&lt;br&gt;
  "status": "superseded",&lt;br&gt;
  "invalid_at": "2026-05-21T10:24:16.285184",&lt;br&gt;
  "confidence": 0.7&lt;br&gt;
}&lt;/p&gt;

&lt;p&gt;Queries that ask "what's true now?" silently skip the superseded fact. Queries that ask "what was true on 2026-05-18?" still see it. That's what the temporal layer buys you.&lt;/p&gt;
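
&lt;p&gt;The "now" versus "as of" distinction can be sketched as a filter over valid_at and invalid_at. This is an illustration, not the project's query code: facts_as_of is a hypothetical helper, and the valid_at values are invented because the output above does not show them. Only the invalid_at timestamp is taken from the superseded fact.&lt;/p&gt;

```python
from datetime import datetime


def facts_as_of(facts, as_of):
    """Return the facts that were valid at the ISO timestamp as_of."""
    moment = datetime.fromisoformat(as_of)
    visible = []
    for fact in facts:
        started = moment >= datetime.fromisoformat(fact["valid_at"])
        ended = (
            fact["invalid_at"] is not None
            and moment >= datetime.fromisoformat(fact["invalid_at"])
        )
        if started and not ended:
            visible.append(fact)
    return visible


# valid_at values are made up for illustration; invalid_at is from the output above.
facts = [
    {
        "fact_text": "HTTP transport listen port default is 8080",
        "valid_at": "2026-05-01T00:00:00",
        "invalid_at": "2026-05-21T10:24:16.285184",
    },
    {
        "fact_text": "HTTP transport listen port default is 8765",
        "valid_at": "2026-05-01T00:00:00",
        "invalid_at": None,
    },
]

then = facts_as_of(facts, "2026-05-18T00:00:00")  # before the resolution
later = facts_as_of(facts, "2026-05-22T00:00:00")  # after the resolution
print(len(then), len(later))  # prints: 2 1
```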

&lt;p&gt;&lt;strong&gt;4. The PostCompact injection bundle&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;v0.7.0 added a PostCompact hook that re-injects the top constraints and recent canonical facts after the agent's context is compacted. The bundle is small (configurable, default ~10 constraints + 10 facts) and prioritized.&lt;/p&gt;

&lt;p&gt;The actual bundle returned by get_injection_context(event_type="PostCompact", max_constraints=5, max_facts=5):&lt;/p&gt;

&lt;h2&gt;Active constraints (top by violation count)&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;no-console-log: Use logger.debug() not console.log() in TypeScript source. Production logs route through pino; console.log bypasses formatting and breaks downstream parsers. (violated 5x)&lt;/li&gt;
&lt;li&gt;check-twine-before-tag: Run &lt;code&gt;python3 -m twine check dist/*&lt;/code&gt; before tagging. Catches PyPI metadata errors before the tag is pushed; saves a retraction. (violated 5x)&lt;/li&gt;
&lt;li&gt;tag-before-upload: Always run git tag + git push --tags before twine upload. PyPI is permanent; an untagged upload pins a wheel to no git ref. (violated 2x)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Recent canonical facts&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Never run twine upload before git tag. Always tag, push, then upload to PyPI so the published wheel maps to a real git ref.&lt;/li&gt;
&lt;li&gt;Cursor hooks.json uses object-keyed schema with version: 1 (integer), preToolUse / preCompact / beforeSubmitPrompt event names, failClosed (not fail_open), timeout in seconds.&lt;/li&gt;
&lt;li&gt;PostCompact and UserPromptSubmit hooks emit additionalContext to splice constraints + recent facts back into agent context after compaction.&lt;/li&gt;
&lt;li&gt;HTTP transport defaults to port 8765 in Dockerfile.http; do not change without updating docs/deployment/mcp-tunnel.md and docker-compose.yml together.&lt;/li&gt;
&lt;li&gt;BetaAbstractMemoryTool subclass lives at world_model_server/memory_backend.py; required by the Anthropic SDK Managed Agents memory path.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That bundle is what gets spliced into the agent's working context as additionalContext after a compaction event. The same query also runs on UserPromptSubmit, biased toward whatever the user just asked about.&lt;/p&gt;

&lt;p&gt;The compaction audit log records what happened, queryable via the CLI:&lt;/p&gt;

&lt;p&gt;$ world-model audit-compactions --limit 5&lt;br&gt;
1 compaction audit rows&lt;br&gt;
  2026-05-21T10:38:01.606771  session=demo-session-1  pre=84320&lt;br&gt;
  post=22150  facts_injected=10  constraints_injected=3  event=PostCompact&lt;/p&gt;

&lt;p&gt;With pre=84320 and post=22150, the compaction dropped about 62k tokens of context. The injection put 10 facts and 3 constraints back. The audit row exists so a human can later answer "what did the agent see vs. what did it lose?"&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. A defer decision that pauses a headless agent&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;v0.7.0 added a defer enforcement tier between deny and warn. Warning-severity violations with violation_count ≥ 5 return permissionDecision: "defer" when the client advertises support, so headless agents pause instead of silently passing or hard-blocking. Clients that do not advertise support fall back to ask automatically.&lt;/p&gt;

&lt;p&gt;I have a check-twine-before-tag constraint with violation_count=5, severity=warning. When a Bash tool input matches it, the hook returns:&lt;br&gt;
{&lt;br&gt;
  "hookSpecificOutput": {&lt;br&gt;
    "hookEventName": "PreToolUse",&lt;br&gt;
    "permissionDecision": "defer",&lt;br&gt;
    "permissionDecisionReason": "Recurring warning-level violations (check-twine-before-tag). Headless agents should pause for confirmation."&lt;br&gt;
  }&lt;br&gt;
}&lt;br&gt;
The same payload and constraint, but with supports_defer: false in the request, falls back to ask:&lt;br&gt;
{&lt;br&gt;
  "hookSpecificOutput": {&lt;br&gt;
    "hookEventName": "PreToolUse",&lt;br&gt;
    "permissionDecision": "ask",&lt;br&gt;
    "permissionDecisionReason": "Recurring warning-level violations (check-twine-before-tag). Headless agents should pause for confirmation."&lt;br&gt;
  }&lt;br&gt;
}&lt;/p&gt;

&lt;p&gt;The defer tier exists because the binary deny / warn choice forces you to be either too strict or too permissive. Recurring warnings that don't rise to error level should pause for a human, not block, not pass.&lt;/p&gt;
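
&lt;p&gt;Put together, the tiers described in this post can be sketched as one small decision function. The thresholds (error with 3 or more violations, warning with 5 or more) and the ask fallback come from the text above. The function name permission_decision and the None return for everything below those thresholds are assumptions, since the post does not show what the hook returns in that case.&lt;/p&gt;

```python
def permission_decision(severity, violation_count, supports_defer):
    """Map a matched constraint to a permissionDecision value, or None."""
    # Hard tier: error-severity rules with 3 or more recorded violations.
    if severity == "error" and violation_count >= 3:
        return "deny"
    # Defer tier: recurring warnings pause the agent when the client
    # advertises support, and fall back to "ask" when it does not.
    if severity == "warning" and violation_count >= 5:
        return "defer" if supports_defer else "ask"
    # Below both thresholds: behavior here is an assumption of this sketch.
    return None


print(permission_decision("warning", 5, True))   # prints: defer
print(permission_decision("warning", 5, False))  # prints: ask
print(permission_decision("error", 5, True))     # prints: deny
```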

&lt;p&gt;&lt;strong&gt;What this means if you are building agents&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The reason this works is not that the tool is clever. It is that the substrate, a temporal knowledge graph with facts, constraints, contradictions, and decision traces, captures the right shape of information.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Plain markdown rules in CLAUDE.md cannot answer:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;How many times has this rule been violated?&lt;/li&gt;
&lt;li&gt;Which fact is true now vs three sessions ago?&lt;/li&gt;
&lt;li&gt;Which constraints should I re-inject after compaction?&lt;/li&gt;
&lt;li&gt;Which prior fix does this proposed change risk re-introducing?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A graph can. The cost is one MCP server, ~2,000 lines of Python, and a SQLite database that sits at ~155 KB empty (mine grew to about 2 MB after running this exercise plus the auto-seed). The payoff is a memory layer that survives compaction, enforces at the edit boundary, and tracks evidence chains back to the source.&lt;/p&gt;

&lt;p&gt;If you are building anything with Claude Code, Cursor, or any harness that supports MCP + hooks:&lt;/p&gt;

&lt;p&gt;pip install world-model-mcp&lt;br&gt;
cd /your/project&lt;br&gt;
python -m world_model_server.cli setup&lt;/p&gt;

&lt;p&gt;For Claude Managed Agents with self-hosted sandboxes (where Anthropic's built-in Memory primitive is not yet supported), v0.7.2 added streamable HTTP transport so the same 25 MCP tools also work behind an MCP tunnel.&lt;/p&gt;

&lt;p&gt;Source: github.com/SaravananJaichandar/world-model-mcp.&lt;/p&gt;

&lt;p&gt;If world-model-mcp helped you, star the repo or open an issue with what worked or didn't. I read every one.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>claude</category>
      <category>agents</category>
    </item>
  </channel>
</rss>
