🧠I Built a Support Triage Module to Prove OrKa’s Plugin Agents

A branch-only experiment that stress-tests custom agent registration, trust boundaries, and deterministic traces in a support_triage module that lives outside the core runtime.


OrKa is not production ready. This article is not a launch post. It is a proof.

I wanted one thing: a clean, testable demonstration that OrKa can grow “sideways” via feature modules, without contaminating core runtime code. The most honest way to prove that is to ship a complete module that registers its own agent types, runs end to end, emits traces, and can be toggled on or off. That is what support_triage is.

Assumption: you already know what OrKa is at a high level. YAML-defined cognition graphs, deterministic execution, and traceable runs.
Assumption: you are fine with “branch-only” work that exists to validate architecture, not to promise production outcomes.

The “cool results” are not the point. The redaction and routing are nice. The fork and join look clean. But those are artifacts. The main focus is that the module is fully separated from the core OrKa implementation, yet it can still register custom agent types and run under the same orchestrator.

That separation is not branding. It is a survival strategy.

Why support triage is the right torture test

Support is where real-world failure modes gather in one place.

Customer content is untrusted by default. It can include PII. It can contain prompt injection attempts. It can try to smuggle “actions” into the system. It can push the system into risky territory like refunds, account changes, or policy exceptions.

If an orchestrator cannot impose boundaries here, it will not impose boundaries anywhere. It will become a thin wrapper around model behavior. That is not acceptable if you care about reproducibility, auditability, or basic operational safety.

So I used support triage as an architectural test. Not as a product.

The proof: plugin agent registration, with zero core changes

The first thing I wanted to see was simple and brutal.

Does OrKa boot, load a feature module, and register new agent types into the agent factory, without touching core?

The debug console says yes. In the run logs, the orchestrator loads support_triage, and the module registers seven custom agent types: envelope_validator, redaction, trust_boundary, permission_gate, output_verification, decision_recorder, risk_level_extractor.

That single detail is the headline for me, not “AI support automation”.

The module is the unit of evolution. Core stays boring. Features move fast.

If this pattern holds, it changes how OrKa or any other orchestrator scales over time. You can add whole cognitive subsystems behind a feature flag. You can iterate aggressively without destabilizing the runtime that everyone depends on.
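To make the pattern concrete, here is a minimal sketch of what module-level registration can look like. The registry and decorator names are mine, for illustration; OrKa's actual plugin interface may differ.

```python
from typing import Callable, Dict, Type

# Hypothetical shared factory registry. Core owns the dict; feature
# modules populate it at load time without core importing them.
AGENT_REGISTRY: Dict[str, Type] = {}

def register_agent_type(name: str) -> Callable[[Type], Type]:
    """Decorator that adds an agent class to the factory registry."""
    def wrap(cls: Type) -> Type:
        AGENT_REGISTRY[name] = cls
        return cls
    return wrap

@register_agent_type("redaction")
class RedactionAgent:
    def run(self, payload: dict) -> dict:
        ...  # sanitize untrusted customer text

# Core discovers the module behind a feature flag; the registry then
# carries the new types into the same orchestrator.
```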

The input envelope: schema as a trust boundary, not a suggestion

Support triage starts with an envelope. Not “free text”.

The envelope exists to force structure early, because structure is where you can enforce constraints cheaply. When you validate late, you end up validating generated text. That is the worst point in the pipeline to discover you are off the rails.

One of the simplest proofs that the envelope is doing real work is when it refuses invalid intent at the schema level. In one trace, the input requested actions that are not allowed by the enum. The validator rejects issue_refund and change_account_settings because they are not in the allowed set.
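A minimal sketch of that refusal, assuming a hypothetical allow-list (the module's real enum lives in its schema, which I am not reproducing here):

```python
# Illustrative allow-list; the actual enum is richer.
ALLOWED_ACTIONS = {"categorize", "draft_reply", "escalate"}

def validate_envelope(envelope: dict) -> list[str]:
    """Return every requested action that is outside the fixed allow-list."""
    requested = envelope.get("requested_actions", [])
    return [a for a in requested if a not in ALLOWED_ACTIONS]

rejected = validate_envelope(
    {"requested_actions": ["draft_reply", "issue_refund", "change_account_settings"]}
)
# rejected == ["issue_refund", "change_account_settings"]
```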

This is not “safety by prompt”. This is safety by type system.

A model can still hallucinate, but the workflow can refuse to treat hallucinations as executable intent.

That matters more than any marketing claim.

PII redaction: boring on purpose

PII redaction should be boring. If it is “clever”, it will be inconsistent.

In the trace, the user message includes an email and phone number. The redaction agent replaces them with placeholders and records what was detected. The redacted text contains [EMAIL_REDACTED] and [PHONE_REDACTED], and the agent records total_pii_found: 2.

This is the kind of output I want. It is simple. It is inspectable. It is stable.

It also makes the next step cleaner. Downstream agents can operate on sanitized content by default, instead of “hoping” the model will avoid quoting sensitive data.
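For illustration, a stand-in for this kind of deliberately boring redaction. The patterns and placeholder strings below are mine; the module's real detectors are not shown here.

```python
import re

# Two fixed patterns, fixed placeholders, a count. Boring on purpose.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE_RE = re.compile(r"\+?\d[\d\s()-]{7,}\d")

def redact(text: str) -> dict:
    text, emails = EMAIL_RE.subn("[EMAIL_REDACTED]", text)
    text, phones = PHONE_RE.subn("[PHONE_REDACTED]", text)
    return {"redacted_text": text, "total_pii_found": emails + phones}
```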

Prompt injection: the uncomfortable part

Support triage is where prompt injection shows up in its natural habitat: inside customer text.

One example in the trace includes a classic “SYSTEM: ignore all previous instructions”, plus a fake JSON command to “grant_admin”, plus some destructive commands, plus an XSS snippet. The redaction result captures that content as untrusted customer text.

Now the honest part.

The trace segment shows injection_detected: false and no matched patterns in that example.

That is not a victory. That is a useful failure.

This module is a proof that you can isolate the problem into a dedicated agent, improve it iteratively, and keep the rest of the workflow stable. If injection detection is weak today, the architecture still wins if you can upgrade that one agent without editing core runtime or rewriting the graph.

This is why I keep repeating “module separation” as the focus. If you cannot isolate failure domains, you cannot improve them safely.

Parallel retrieval: fork and join that actually converges

Most orchestration demos stay linear because it is easier to reason about. Real systems do not stay linear for long.

This workflow forks retrieval into two parallel paths, kb_search and account_lookup, then joins them deterministically.

In the debug logs, the join node recovers the fork group from a mapping, waits for the expected agents, confirms both completed, and merges results. It prints the merged keys, including kb_search and account_lookup.

This is the kind of low-level observability that makes fork and join usable in practice. You can see what is pending. You can see what arrived. You can see what merged.

The trace also captures the fork group id for retrieval, fork_retrieval, along with the agents in the group.

This matters because concurrency without deterministic convergence becomes a debugging tax. I want the join to be boring. When it fails, I want it to fail loudly, with evidence.
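Here is “boring but loud” in miniature, with illustrative names rather than OrKa internals:

```python
# The join knows exactly which agents it is waiting for, refuses to
# merge until all of them have reported, and fails with evidence.
EXPECTED = {"kb_search", "account_lookup"}

def join(results: dict[str, dict]) -> dict:
    missing = EXPECTED - results.keys()
    if missing:
        raise RuntimeError(f"join blocked, still pending: {sorted(missing)}")
    merged = {name: results[name] for name in sorted(EXPECTED)}
    print("merged keys:", list(merged))  # what arrived, what merged
    return merged
```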

Local-first and hybrid are not slogans if metrics are in the trace

I do not want “local-first” to be a vibe. I want it to be measurable.

In the trace, the account_lookup agent includes _metrics with token counts, latency, cost, model name, and provider. It shows model: openai/gpt-oss-20b and provider: lm_studio, with latency around 718 ms for that step.

That is the right direction.

If you cannot attribute cost and latency per node, you cannot reason about scaling. You cannot decide where to switch models. You cannot decide what to cache. You cannot choose what to run locally versus remotely.
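Per-node attribution does not need elaborate machinery. A sketch of the idea, assuming a hypothetical wrapper around each agent call:

```python
import time

# Field names mirror the trace; the wrapper itself is my sketch, not
# OrKa internals. Token counts and cost would come from the provider
# response, omitted here.
def timed_call(agent_fn, payload: dict, model: str, provider: str) -> dict:
    start = time.perf_counter()
    result = agent_fn(payload)
    result["_metrics"] = {
        "model": model,          # e.g. "openai/gpt-oss-20b"
        "provider": provider,    # e.g. "lm_studio"
        "latency_ms": round((time.perf_counter() - start) * 1000, 1),
    }
    return result
```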

OrKa’s claim is not “it can call models”. Every framework can. The claim is that execution is traceable enough that tradeoffs become engineering decisions, not folklore.

Decision recording and output verification: traces that are meant to be replayed

A support triage workflow is not complete when it drafts a response. It is complete when it records what it decided and why, in a way that can be replayed.

The trace includes a DecisionRecorderAgent event with memory references that store decision objects containing decision_id and request_id.

It also includes a finalization step that returns a structured result containing workflow_status, request_id, and decision_id.
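The shape of such an artifact can be as plain as a dictionary. The field names below mirror the trace (decision_id, request_id, workflow_status); everything else is illustrative.

```python
import uuid
from datetime import datetime, timezone

def record_decision(request_id: str, outcome: str, reason: str) -> dict:
    """Emit a replayable decision object keyed by stable ids."""
    return {
        "decision_id": str(uuid.uuid4()),
        "request_id": request_id,
        "workflow_status": outcome,   # e.g. "completed"
        "reason": reason,             # why, not just what
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }
```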

Again, the architectural point is not the specific decision. It is that the workflow emits machine-checkable artifacts that can be inspected after the fact.

If you cannot reconstruct the decision lineage, you do not have an audit trail. You have logs.

RedisStack memory and vector search: infrastructure details that matter

Even in a “support triage” module, the runtime still needs memory and retrieval primitives.

The logs show RedisStack vector search enabled with HNSW, and an embedder using sentence-transformers/all-MiniLM-L6-v2 with dimension 384.
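For reference, creating an HNSW vector index with those parameters via redis-py looks roughly like this; the index and key-prefix names are my placeholders, not OrKa's.

```python
import redis
from redis.commands.search.field import TextField, VectorField
from redis.commands.search.indexDefinition import IndexDefinition, IndexType

r = redis.Redis()

# Matches the logged configuration: HNSW over 384-dim embeddings
# (sentence-transformers/all-MiniLM-L6-v2).
r.ft("orka_memory").create_index(
    fields=[
        TextField("content"),
        VectorField(
            "embedding",
            "HNSW",
            {"TYPE": "FLOAT32", "DIM": 384, "DISTANCE_METRIC": "COSINE"},
        ),
    ],
    definition=IndexDefinition(prefix=["mem:"], index_type=IndexType.HASH),
)
```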

There is also explicit memory decay scheduling enabled, with short-term and long-term decay windows and a check interval.
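A hedged sketch of what such a decay configuration might look like; the actual keys and values in OrKa's config are not reproduced here.

```python
# Illustrative shape only.
memory_decay = {
    "enabled": True,
    "short_term_hours": 2,          # working context expires quickly
    "long_term_hours": 168,         # durable memories persist longer
    "check_interval_minutes": 30,   # how often the scheduler sweeps
}
```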

This is not about “AI memory” as a buzzword. This is about being explicit about retention, cost, and data lifecycle. If memory is a dumping ground, it becomes a liability.

What worked, and what is still weak

The strongest part is the plugin boundary. The module loads, registers agent types, and runs without requiring edits to core runtime. That is the actual proof.

The other strong part is that key behaviors show up in traces and logs, not just in model text. Redaction outputs are structured. Fork and join show deterministic convergence. Decisions are recorded as objects with ids.

The weak part is injection detection, at least in the example trace segment. It shows malicious content but reports injection_detected: false. That means the current detection agent is not yet doing the job. The architecture is still useful because the fix is isolated.

Another weak part is structured output validation during risk assessment. The debug log shows a schema validation warning during risk_assess. If a “risk” object fails schema checks, routing and gating can degrade fast. This is the kind of failure that must become deterministic, not best-effort.

Why this lives on a dedicated branch

Because core needs to stay boring.

A new module is where you take risks. You prove the interface. You iterate on agent contracts. You discover what trace fields you forgot. You learn what the join should do under partial failure.

If the module can evolve independently, you can ship experiments without rewriting the engine. That is the goal.

So yes, the feature is “support triage”. But the actual statement is: OrKa can host fully separated cognitive subsystems as plugins, with their own agent types, policies, and invariants, while still emitting deterministic traces under the same runtime.

That is the direction I care about.

What I am building next inside this module

I want injection detection to stop being symbolic. It should produce matched patterns, confidence, and a sanitization plan that downstream agents must respect, even if a model tries to obey the attacker.
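Concretely, the detection result should look something like this sketch, where the patterns and scoring are placeholders:

```python
import re

# A non-symbolic result: matched patterns, a confidence score, and a
# sanitization plan downstream agents must respect.
INJECTION_PATTERNS = [
    r"(?i)ignore\s+all\s+previous\s+instructions",
    r"(?i)\bgrant_admin\b",
    r"(?i)<script\b",
]

def detect_injection(text: str) -> dict:
    matched = [p for p in INJECTION_PATTERNS if re.search(p, text)]
    return {
        "injection_detected": bool(matched),
        "matched_patterns": matched,
        "confidence": min(1.0, 0.4 * len(matched)),  # placeholder scoring
        "sanitization_plan": ["strip_matched_spans", "quote_as_data"] if matched else [],
    }
```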

I want schema validation to be non-negotiable for risk outputs. If a model produces invalid structure, the system should route to a safe path by default, and record the violation as a first-class event.
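A minimal sketch of that default-safe gate, with a deliberately simplified schema check (a real one would use jsonschema or pydantic):

```python
REQUIRED_RISK_FIELDS = {"risk_level", "rationale"}

def gate_risk_output(candidate: dict, trace: list) -> dict:
    """Route invalid risk structures to a safe default and record the violation."""
    missing = REQUIRED_RISK_FIELDS - candidate.keys()
    if missing:
        trace.append({"event": "schema_violation", "missing": sorted(missing)})
        return {"risk_level": "high", "rationale": "schema violation, defaulting safe"}
    return candidate
```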

I want the module to remain isolated. No “just one quick tweak” to core. If the module needs a new capability, it should pressure-test the plugin interface first. Core should change only when the interface is clearly wrong.

That is how you build infrastructure that survives contact with reality.
