Engineers do not need more hype. We need runnable flows, visible state, and contracts that survive contact with production. This guide shows how OrKa lowers the entrance barrier with a concrete example suite, a guided tour in OrKa UI, and full trace replay so you can understand every step. It includes direct links to the live docs and examples so you can go hands on right away.
TL;DR
- Run your first OrKa workflow in minutes using the example suite.
- Start a guided tour inside OrKa UI that explains each panel using a real trace.
- Inspect traces, memory writes, merge strategies, and resolved prompts.
- Everything is backed by contracts and real repositories. See the docs and examples:
- Docs folder: https://github.com/marcosomma/orka-reasoning/tree/master/docs
- Example catalog: https://github.com/marcosomma/orka-reasoning/tree/master/examples
- Socratic example: https://github.com/marcosomma/orka-reasoning/tree/master/examples/orka_soc
Why this matters
Most AI projects start with a single model call and a chat box. It feels great for a demo and falls apart once you need decisions, parallel work, memory, or observability. You end up copy-pasting patterns and stitching together logs that no one can read. That is a tax on every team.
OrKa is built to remove that tax. You define cognition as a graph. Agents do work. Nodes steer control. The orchestrator executes the graph and emits a trace. OrKa UI reads the trace and lets you replay the run, inspect node outputs, and see how memory changes over time. A guided tour now sits on top of the UI so new users can learn by following a short path that uses a real trace, not static screenshots.
If you are a developer, this is the kind of onboarding that respects your time. No mystery, no ceremony. You run a small example, you watch it execute, and you inspect the truth of what happened.
The mental model in one minute
- An orchestrator executes a graph defined in YAML. Strategies include sequential flow, conditional routing, fork group, failover, and loop.
- A node is a control point in the graph. A node may run an agent or route execution without running a model at all.
- An agent is a unit of work. Builders return structured outputs. Classifiers return labels and confidences. RAG agents retrieve context before building an answer.
- Memory is a first-class resource. Entries live in namespaces and decay over time. Importance scores extend TTL within limits.
- A trace is a structured artifact. It records node order, timing, tokens, cost, inputs, outputs, and memory side effects. OrKa UI replays traces and renders panels from them.
You can understand any run by reading its trace or by replaying it in the UI. The guided tour simply points you to the right panels in the right order.
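To make the "read the trace" idea concrete, here is a minimal Python sketch that summarizes a run from its events. The field names mirror the TraceEvent shape described later in this guide; the trace itself is a hand-written stand-in, not real OrKa output.

```python
# Minimal sketch: summarize a run from its trace events.
# The trace below is illustrative, not produced by OrKa.
trace = {
    "trace_id": "demo",
    "events": [
        {"agent_id": "classify", "latency_ms": 120, "tokens": {"in": 50, "out": 10}},
        {"agent_id": "tech_builder", "latency_ms": 842, "tokens": {"in": 300, "out": 120}},
    ],
}

def summarize(trace):
    """Return node order, total latency, and total token count."""
    order = [e["agent_id"] for e in trace["events"]]
    latency_ms = sum(e["latency_ms"] for e in trace["events"])
    tokens = sum(e["tokens"]["in"] + e["tokens"]["out"] for e in trace["events"])
    return order, latency_ms, tokens

print(summarize(trace))  # (['classify', 'tech_builder'], 962, 480)
```

Ten lines like these are often all you need to answer "what actually ran, and what did it cost".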
The example suite you can actually run
The example suite covers common production patterns. Each folder contains an `orka.yaml`, a default input in `inputs/`, and a golden trace in `traces/`. You can run the flow with a single command, then import the trace into OrKa UI and start the guided tour.
00. Hello flow
The smallest flow proves that your environment works. One builder agent returns a greeting and writes a memory entry. You get a clean trace with a single node execution.
```yaml
type: sequential
id: hello_flow
nodes:
  - id: echo_builder
    type: builder
    prompt: |
      You return a JSON object with a single field "message" greeting {{ input.name | default("world") }}.
    output_schema:
      type: object
      properties:
        message: { type: string }
  - id: done
    type: join_node
```
Run it:
```shell
orka run examples/00-hello-flow -i '{"name":"OrKa"}' --export-trace 'examples/00-hello-flow/traces/latest.json'
```
Open OrKa UI and import `latest.json`. This should be painless. If not, fix the basics before moving on.
01. Router cookbook
This example demonstrates conditional routing with a small but realistic rule set. The classification agent emits a label with confidence. The router chooses the next node based on rules you can read and change.
```yaml
type: conditional
id: router_cookbook
nodes:
  - id: classify
    type: classification
    labels: [tech, policy, other]
  - id: router
    type: router
    strategy: confidence_weighted
    routes:
      - when: label == 'tech' and confidence > 0.6
        next: tech_builder
      - when: label == 'policy'
        next: policy_builder
      - when: true
        next: fallback
  - id: tech_builder
    type: builder
    prompt: "Summarize technical aspects in 3 bullets."
  - id: policy_builder
    type: builder
    prompt: "Outline governance and risk implications in 3 bullets."
  - id: fallback
    type: builder
    prompt: "Say you are unsure and ask one clarifying question."
  - id: join
    type: join_node
```
In OrKa UI, the router node shows the evaluated values for `label` and `confidence`, the rule that matched, and the selected path. This removes the guesswork. You can nudge the confidence threshold and watch the path flip.
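The routing logic above can be sketched in a few lines of Python. This is an illustrative first-match evaluator, not OrKa's actual rule engine; the `route` function and rule tuples are assumptions made for the example.

```python
def route(label, confidence, rules):
    """Evaluate routing rules in order; the first matching rule wins.
    Illustrative only -- OrKa's confidence_weighted router may differ."""
    for predicate, next_node in rules:
        if predicate(label, confidence):
            return next_node
    raise ValueError("no rule matched")

# Mirrors the routes in the YAML above.
rules = [
    (lambda l, c: l == "tech" and c > 0.6, "tech_builder"),
    (lambda l, c: l == "policy", "policy_builder"),
    (lambda l, c: True, "fallback"),
]

print(route("tech", 0.82, rules))  # tech_builder
print(route("tech", 0.55, rules))  # fallback: confidence below 0.6
```

Raising the threshold in the first rule flips noisy `tech` classifications onto the fallback path, which is exactly the behavior you can verify in the router panel.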
02. Fork and join
Parallel paths are ideal when you need independent views on the same input. This example runs an extractor and a critic in parallel, then merges results.
```yaml
type: fork_group
id: fork_join
nodes:
  - id: fork
    type: fork_group
    agents: [extract, critique]
  - id: extract
    type: builder
    prompt: "Extract key facts as a JSON array of {fact, evidence}."
  - id: critique
    type: validation_and_structuring
    rules: ["no speculation", "include evidence"]
  - id: join
    type: join_node
    merge: concat
```
The UI renders the fork group with two child nodes. The join panel shows the merge strategy and the raw outputs from the children. Swap `concat` with `first_success` and rerun to see failure handling.
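A sketch of what the two merge strategies do with child outputs, assuming the AgentOutput shape (`result`, `status`) described later in this guide. These are illustrative semantics, not OrKa's exact join implementation.

```python
def merge(child_outputs, strategy="concat"):
    """Merge fork-group child outputs. Illustrative readings of
    `concat` and `first_success`, not OrKa's actual join code."""
    if strategy == "concat":
        # Keep every child's result, in fork order.
        return [o["result"] for o in child_outputs]
    if strategy == "first_success":
        # Return the first child that succeeded; fail loudly otherwise.
        for o in child_outputs:
            if o.get("status") == "ok":
                return o["result"]
        raise RuntimeError("all children failed")
    raise ValueError(f"unknown merge strategy: {strategy}")

outputs = [
    {"result": "facts", "status": "ok"},
    {"result": None, "status": "error"},
]
print(merge(outputs))                            # ['facts', None]
print(merge(outputs, strategy="first_success"))  # facts
```

Note the trade-off: `concat` preserves failures for inspection, while `first_success` hides them behind the surviving result.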
03. RAG with memory decay
State without decay turns into an attic. This example shows retrieval augmented generation backed by a memory store with namespaces, vector search, and decay based on importance.
```yaml
type: sequential
id: rag_memory_decay
resources:
  memory:
    backend: redisstack
    vector_search_enabled: true
    memory_decay_config:
      enabled: true
      short_term_hours: 2
      long_term_hours: 168
nodes:
  - id: rag
    type: rag_node
    k: 5
    namespace: getting_started
  - id: memory_writer
    type: memory_writer
    importance_score: 0.7
    category: stored
  - id: answer
    type: builder
    prompt: |
      Using retrieved docs and short term context, answer succinctly.
```
Watch the memory overlay. You will see entries with `namespace`, `importance`, and the computed expiration. Important entries live longer by design. This is how you prevent stale context from running the show.
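One way to picture "important entries live longer" is a TTL that stretches with importance. The linear rule below is an assumption made for illustration; OrKa's actual scoring may differ, but the shape of the idea is the same: importance moves the expiration from the short-term window toward the long-term cap.

```python
from datetime import datetime, timedelta

def expires_at(created_at, importance, short_term_hours=2, long_term_hours=168):
    """Illustrative decay rule: importance linearly stretches the TTL
    from the short-term window toward the long-term cap.
    Not OrKa's real formula -- a sketch of the concept only."""
    ttl = short_term_hours + importance * (long_term_hours - short_term_hours)
    ttl = min(ttl, long_term_hours)  # never outlive the long-term limit
    return created_at + timedelta(hours=ttl)

t0 = datetime(2025, 8, 5, 18, 22, 12)
print(expires_at(t0, importance=0.0))  # short-term: gone in 2 hours
print(expires_at(t0, importance=1.0))  # long-term: lives the full 168 hours
```

With `importance_score: 0.7`, as in the YAML above, the entry lands well toward the long-term end of that range.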
04. Failover and retry
Agents and networks fail. This example retries a fast local model with a short backoff, then falls back to a more robust model.
```yaml
type: failover
id: failover_retry
nodes:
  - id: primary
    type: local_llm_agent
    model: fast
    retry: { attempts: 2, backoff_ms: 250 }
  - id: secondary
    type: local_llm_agent
    model: robust
  - id: join
    type: join_node
```
In a failing run you will see multiple attempts and their latencies. Cost accounting is per attempt and in total. This makes performance and economics visible.
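The retry-then-failover semantics can be sketched in plain Python. The function below is an assumption-level illustration of the pattern, not the orchestrator's code; `attempts` and `backoff_s` mirror the `retry` settings in the YAML.

```python
import time

def call_with_failover(primary, secondary, attempts=2, backoff_s=0.25):
    """Retry the primary with a short backoff, then fall back to the
    secondary. A sketch of the failover pattern, not OrKa internals."""
    for i in range(attempts):
        try:
            return primary()
        except Exception:
            if i < attempts - 1:
                time.sleep(backoff_s)  # short backoff between attempts
    return secondary()

def flaky_fast_model():
    raise TimeoutError("fast model timed out")

print(call_with_failover(flaky_fast_model, lambda: "robust answer", backoff_s=0.01))
# robust answer
```

In a trace, each failed attempt would appear as its own event with its own latency and cost, which is what makes the economics of retries visible.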
05. Loop with a budget guard
Refinement loops are powerful and risky. You need a ceiling for loops, tokens, and time. This example runs a small debate, scores agreement, and either continues or summarizes.
```yaml
type: loop
id: loop_budget
limits: { max_loops: 3, max_tokens: 3000, max_ms: 15000 }
nodes:
  - id: debate
    type: loop_node
    inner: fork_group
    agents: [progressive, conservative, realist, purist]
  - id: score
    type: builder
    prompt: "Return agreement_score float in [0,1]."
  - id: stop_router
    type: router
    routes:
      - when: previous_outputs.score.agreement_score >= 0.65
        next: summarize
      - when: true
        next: debate
  - id: summarize
    type: builder
    prompt: "Synthesize common ground in one paragraph."
```
In the trace you can watch the loop evolve. This helps you tune thresholds and loop counts before you scale up.
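The guard logic amounts to "stop on agreement or on budget, whichever comes first". The sketch below shows only the loop-count guard; the token and time ceilings from the YAML are omitted for brevity, and the toy debate is a made-up stand-in.

```python
def run_guarded_loop(step, score, max_loops=3, threshold=0.65):
    """Loop until agreement crosses the threshold or the loop budget
    is spent. Sketches the guard only; token/time ceilings omitted."""
    result, loops = None, 0
    for loops in range(1, max_loops + 1):
        result = step(loops)
        if score(result) >= threshold:
            break  # agreement reached: route to summarize
    return result, loops

# Toy debate whose agreement rises 0.2 per round.
result, loops = run_guarded_loop(
    step=lambda i: {"agreement_score": 0.3 + 0.2 * i},
    score=lambda r: r["agreement_score"],
)
print(loops)  # 2 -- stops early, since 0.3 + 0.4 = 0.7 >= 0.65
```

Lowering the threshold to 0.4 would stop the toy loop after one round; raising it past 0.9 would exhaust all three. That is the kind of tuning the trace makes cheap.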
06. UI replay with a guided tour
This example ships a golden trace and a small README that launches the guided tour. One click opens the trace. One more starts the tour. This is perfect for onboarding sessions or a quick internal demo.
The guided tour that teaches with real data
The OrKa UI tour is not a list of tooltips. It is a four step learning path bound to a real trace. Start the tour from an example page and it will load the matching trace and highlight the parts that matter.
- Execution path and timing. The tour focuses the graph and the executed nodes. You see ordering, status, and per node latency. You learn how the orchestrator advanced.
- Memory overlay. The tour opens memory and shows recent writes, TTL, importance, and namespace filters. You learn what the flow persisted and for how long.
- Resolved prompts. The tour opens the template viewer for the current node. You see the exact prompt after the system injected variables and previous outputs. You learn how data flows between agents.
- Cost and tokens. The tour opens counters and explains totals vs node level numbers. You learn the economics of the run and where time was spent.
The point is to map the mental model to the data. Once you run a few tours this way, the UI becomes predictable and you will not need the tour at all.
Contracts that keep names honest
The runtime emits JSON with strict shapes. The UI consumes those shapes. This keeps names honest across layers.
Minimal shapes
TraceEvent
```json
{
  "trace_id": "uuid",
  "engine_version": "0.8.0",
  "schema_version": "1.1",
  "events": [
    {
      "timestamp": "2025-08-05T18:22:11Z",
      "agent_id": "extract",
      "node_type": "builder",
      "input": {"text": "..."},
      "output": {"result": "...", "status": "ok"},
      "tokens": {"in": 123, "out": 456},
      "cost": {"usd": 0.0012},
      "latency_ms": 842,
      "memory_writes": [
        {"namespace": "getting_started", "category": "stored", "ttl_hours": 24, "importance": 0.7}
      ]
    }
  ]
}
```
AgentOutput
```json
{
  "result": "...",
  "status": "ok",
  "error": null,
  "metadata": { "schema": "v1" }
}
```
MemoryEntry
```json
{
  "namespace": "getting_started",
  "category": "stored",
  "importance": 0.7,
  "vector": true,
  "created_at": "2025-08-05T18:22:12Z",
  "ttl_hours": 24,
  "expires_at": "2025-08-06T18:22:12Z"
}
```
ExecutionMetadata
```json
{
  "run_id": "uuid",
  "engine_version": "0.8.0",
  "schema_version": "1.1",
  "memory_backend": "redisstack",
  "total_tokens": 579,
  "total_cost_usd": 0.0039
}
```
These are the bones. The full contracts live in the docs. The key idea is that once a field name appears in a trace, the UI must either render it or clearly ignore it. There is no silent mismatch.
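The "render it or clearly ignore it" rule is cheap to enforce with a contract check. The required and optional field sets below are taken from the TraceEvent shape above; they are a sketch, not the full contract from the docs.

```python
# Field sets derived from the TraceEvent shape shown above (a sketch,
# not the authoritative contract).
REQUIRED = {"timestamp", "agent_id", "node_type", "output", "latency_ms"}
OPTIONAL = {"input", "tokens", "cost", "memory_writes"}

def check_event(event):
    """Return (missing, unknown) field names so contract mismatches
    are loud instead of silent."""
    missing = sorted(REQUIRED - event.keys())
    unknown = sorted(event.keys() - REQUIRED - OPTIONAL)
    return missing, unknown

event = {"timestamp": "2025-08-05T18:22:11Z", "agent_id": "extract",
         "node_type": "builder", "output": {"status": "ok"},
         "latency_ms": 842, "surprise_field": 1}
print(check_event(event))  # ([], ['surprise_field'])
```

Run a check like this in CI over exported traces and a renamed field fails a build instead of quietly blanking a UI panel.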
Rules of thumb for stability
- Add `engine_version` and `schema_version` to every trace.
- Keep breaking changes behind new schema versions and provide a map for older traces.
- Treat resolved prompts as test fixtures for agents that control money or risk.
- Include memory writes in traces even when they are optional at runtime. Observability beats minimalism here.
Step by step onboarding in five minutes
You can adjust the commands to your setup, but the shape of the workflow is consistent.
```shell
# 1. Clone the repository
git clone https://github.com/marcosomma/orka-reasoning

# 2. Pick a small example
cd orka-reasoning/examples/00-hello-flow

# 3. Run with default input and export a trace
orka run . -i '@inputs/default.json' --export-trace './traces/latest.json'

# 4. Open OrKa UI and import the trace
# 5. Start the guided tour from the example page
```
You can repeat the same pattern for the router example, the fork and join example, and the loop with a budget guard. The more you run and replay, the faster the system will click.
Troubleshooting checklist
If something feels off, use this short checklist before you lose time.
- The UI import fails. Confirm that `schema_version` in your trace matches the UI version. If it is ahead, update the UI. If it is behind, use the version map from the docs.
- The router does not take the path you expect. Inspect the router node panel and read the evaluated `label` and `confidence`. Raise the threshold if the model is noisy.
- A join produces a weird shape. Open the join panel and switch the merge strategy from `concat` to `schema_merge` or `first_success`. Rerun and compare.
- Memory feels sticky. Open the overlay and check namespaces and decay settings. Maybe you did not tag entries correctly.
- Costs look high. Open counters and see which node dominates. Inspect resolved prompts for that node. You might be sending more context than needed.
Short feedback loops save days.
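For the cost question in particular, the trace events answer it programmatically. This helper assumes the per-event `cost.usd` field from the trace shape in this guide; the sample events are made up.

```python
def dominant_cost_node(events):
    """Name the node with the highest USD cost in a run.
    Assumes the per-event cost.usd field from the TraceEvent shape."""
    return max(events, key=lambda e: e.get("cost", {}).get("usd", 0.0))["agent_id"]

events = [
    {"agent_id": "classify", "cost": {"usd": 0.0004}},
    {"agent_id": "answer", "cost": {"usd": 0.0035}},
]
print(dominant_cost_node(events))  # answer
```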
Pragmatic adoption inside an existing codebase
You do not need to migrate everything. Start small and win fast.
- Wrap a single task that benefits from visibility. A classifier with a router is perfect.
- Keep the first YAML under 30 lines. There is no prize for density.
- Export traces as CI artifacts. Replaying failed runs will pay for the setup many times over.
- Use namespaces per product area. Treat memory like a database that deserves schema and hygiene.
- Add budget guards early and wire their results into alerts. A loop that silently runs forever is not a feature.
- Freeze prompts that touch money or safety. Keep them as fixtures and add lightweight unit tests that render the resolved prompt and validate structure.
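The frozen-prompt habit can start as a stdlib-only test. The template, field names, and values below are hypothetical, invented for the example; the point is the shape: render the fixture with known variables and assert on the resolved prompt.

```python
# Hypothetical frozen prompt fixture for a money-touching agent.
# Template and field names are invented for this sketch.
FIXTURE = "Refund {amount} {currency} to customer {customer_id}."

def render_prompt(template, variables):
    """Render a prompt fixture with plain str.format (stdlib only)."""
    return template.format(**variables)

prompt = render_prompt(
    FIXTURE, {"amount": "10.00", "currency": "EUR", "customer_id": "c_42"}
)
assert prompt == "Refund 10.00 EUR to customer c_42."
print("fixture ok")
```

If someone edits the fixture or a variable name, the assertion breaks before the change reaches a model that moves money.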
These habits make OrKa feel like a normal part of your stack rather than a special experiment.
Community notes and collaborative docs
Not all of the documentation is written by me. There are community contributed docs that explain how other teams think about modular cognition, the role of decay, and why confidence distributions are better than single thresholds in some cases. This is intentional. OrKa is infrastructure, not a script. Multiple voices keep the philosophy honest and the contracts clear.
The best entry points for context and deeper reading live here:
- Docs folder: https://github.com/marcosomma/orka-reasoning/tree/master/docs
- Example catalog: https://github.com/marcosomma/orka-reasoning/tree/master/examples
- Socratic example: https://github.com/marcosomma/orka-reasoning/tree/master/examples/orka_soc
If something feels confusing or under specified, open an issue with a trace snippet and a short description of the gap. Contracts improve fastest when anchored to real runs.
Where this goes next
Onboarding is a journey, not a single release. The baseline is in place. The next set of improvements is already scoped.
- More example folders that combine RAG, structured outputs, and validation.
- A router that emits a full confidence distribution and supports probabilistic branching when you want exploration.
- Schema version maps checked into the repo with tests.
- Trace diffs in OrKa UI to compare runs and spot regressions.
- Deeper docs on memory, including guidance for setting importance and pruning strategies for long lived systems.
- A simple recipe to swap memory and queue backends without editing YAML.
If you care about any of these, join the discussions in the repo. The best ideas tend to come from people who depend on the system every day.
Closing
Onboarding should not be a maze. It should be a runway. The example suite gives you speed. The guided tour in OrKa UI gives you clarity. The trace gives you truth. Use them together and you will understand OrKa in one afternoon, not one week.
Clone the repo. Run an example. Import the trace. Start the tour. Then decide if you want to route, fork, add memory, or loop with guards. You are in control and you can see what is happening.
If this guide saved you time, copy sections into your internal wiki and adapt the YAML to your use cases. If something broke, send a trace and we will fix the contract or the docs.
Happy shipping.