Engineers do not need more hype. We need runnable flows, visible state, and contracts that survive contact with production. This guide shows how OrKa lowers the entrance barrier with a concrete example suite, a guided tour in OrKa UI, and full trace replay so you can understand every step. It includes direct links to the live docs and examples so you can go hands on right away.
TL;DR
- Run your first OrKa workflow in minutes using the example suite.
- Start a guided tour inside OrKa UI that explains each panel using a real trace.
- Inspect traces, memory writes, merge strategies, and resolved prompts.
- Everything is backed by contracts and real repositories. See the docs and examples:
- Docs folder: https://github.com/marcosomma/orka-reasoning/tree/master/docs
- Example catalog: https://github.com/marcosomma/orka-reasoning/tree/master/examples
- Socratic example: https://github.com/marcosomma/orka-reasoning/tree/master/examples/orka_soc
Why this matters
Most AI projects start with a single model call and a chat box. It feels great for a demo and falls apart once you need decisions, parallel work, memory, or observability. You end up copy-pasting patterns and stitching together logs that no one can read. That is a tax on every team.
OrKa is built to remove that tax. You define cognition as a graph. Agents do work. Nodes steer control. The orchestrator executes the graph and emits a trace. OrKa UI reads the trace and lets you replay the run, inspect node outputs, and see how memory changes over time. A guided tour now sits on top of the UI so new users can learn by following a short path that uses a real trace, not static screenshots.
If you are a developer, this is the kind of onboarding that respects your time. No mystery, no ceremony. You run a small example, you watch it execute, and you inspect the truth of what happened.
The mental model in one minute
- An orchestrator executes a graph defined in YAML. Strategies include sequential flow, conditional routing, fork group, failover, and loop.
- A node is a control point in the graph. A node may run an agent or route execution without running a model at all.
- An agent is a unit of work. Builders return structured outputs. Classifiers return labels and confidences. RAG agents retrieve context before building an answer.
- Memory is a first-class resource. Entries live in namespaces and decay over time. Importance scores extend TTL within limits.
- A trace is a structured artifact. It records node order, timing, tokens, cost, inputs, outputs, and memory side effects. OrKa UI replays traces and renders panels from them.
You can understand any run by reading its trace or by replaying it in the UI. The guided tour simply points you to the right panels in the right order.
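To make the "read the trace" idea concrete, here is a minimal Python sketch that summarizes a run from its events. The field names mirror the TraceEvent shape described later in this guide; the trace itself is a hand-written stand-in, not real OrKa output.

```python
# Minimal sketch: summarize a run from its trace events.
# The trace below is illustrative, not produced by OrKa.
trace = {
    "trace_id": "demo",
    "events": [
        {"agent_id": "classify", "latency_ms": 120, "tokens": {"in": 50, "out": 10}},
        {"agent_id": "tech_builder", "latency_ms": 842, "tokens": {"in": 300, "out": 120}},
    ],
}

def summarize(trace):
    """Return node order, total latency, and total token count."""
    order = [e["agent_id"] for e in trace["events"]]
    latency_ms = sum(e["latency_ms"] for e in trace["events"])
    tokens = sum(e["tokens"]["in"] + e["tokens"]["out"] for e in trace["events"])
    return order, latency_ms, tokens

print(summarize(trace))  # (['classify', 'tech_builder'], 962, 480)
```

Ten lines like these are often all you need to answer "what actually ran, and what did it cost".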
The example suite you can actually run
The example suite covers common production patterns. Each folder contains an `orka.yaml`, a default input in `inputs/`, and a golden trace in `traces/`. You can run the flow with a single command, then import the trace into OrKa UI and start the guided tour.
00. Hello flow
The smallest flow proves that your environment works. One builder agent returns a greeting and writes a memory entry. You get a clean trace with a single node execution.
```yaml
type: sequential
id: hello_flow
nodes:
  - id: echo_builder
    type: builder
    prompt: |
      You return a JSON object with a single field "message" greeting {{ input.name | default("world") }}.
    output_schema:
      type: object
      properties:
        message: { type: string }
  - id: done
    type: join_node
```
Run it:
```shell
orka run examples/00-hello-flow -i '{"name":"OrKa"}' --export-trace 'examples/00-hello-flow/traces/latest.json'
```
Open OrKa UI and import `latest.json`. This should be painless. If not, fix the basics before moving on.
01. Router cookbook
This example demonstrates conditional routing with a small but realistic rule set. The classification agent emits a label with confidence. The router chooses the next node based on rules you can read and change.
```yaml
type: conditional
id: router_cookbook
nodes:
  - id: classify
    type: classification
    labels: [tech, policy, other]
  - id: router
    type: router
    strategy: confidence_weighted
    routes:
      - when: label == 'tech' and confidence > 0.6
        next: tech_builder
      - when: label == 'policy'
        next: policy_builder
      - when: true
        next: fallback
  - id: tech_builder
    type: builder
    prompt: "Summarize technical aspects in 3 bullets."
  - id: policy_builder
    type: builder
    prompt: "Outline governance and risk implications in 3 bullets."
  - id: fallback
    type: builder
    prompt: "Say you are unsure and ask one clarifying question."
  - id: join
    type: join_node
```
In OrKa UI, the router node shows the evaluated values for `label` and `confidence`, the rule that matched, and the selected path. This removes the guesswork. You can nudge the confidence threshold and watch the path flip.
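The routing logic above can be sketched in a few lines of Python. This is an illustrative first-match evaluator, not OrKa's actual rule engine; the `route` function and rule tuples are assumptions made for the example.

```python
def route(label, confidence, rules):
    """Evaluate routing rules in order; the first matching rule wins.
    Illustrative only -- OrKa's confidence_weighted router may differ."""
    for predicate, next_node in rules:
        if predicate(label, confidence):
            return next_node
    raise ValueError("no rule matched")

# Mirrors the routes in the YAML above.
rules = [
    (lambda l, c: l == "tech" and c > 0.6, "tech_builder"),
    (lambda l, c: l == "policy", "policy_builder"),
    (lambda l, c: True, "fallback"),
]

print(route("tech", 0.82, rules))  # tech_builder
print(route("tech", 0.55, rules))  # fallback: confidence below 0.6
```

Raising the threshold in the first rule flips noisy `tech` classifications onto the fallback path, which is exactly the behavior you can verify in the router panel.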
02. Fork and join
Parallel paths are ideal when you need independent views on the same input. This example runs an extractor and a critic in parallel, then merges results.
```yaml
type: fork_group
id: fork_join
nodes:
  - id: fork
    type: fork_group
    agents: [extract, critique]
  - id: extract
    type: builder
    prompt: "Extract key facts as a JSON array of {fact, evidence}."
  - id: critique
    type: validation_and_structuring
    rules: ["no speculation", "include evidence"]
  - id: join
    type: join_node
    merge: concat
```
The UI renders the fork group with two child nodes. The join panel shows the merge strategy and the raw outputs from the children. Swap `concat` with `first_success` and rerun to see failure handling.
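A sketch of what the two merge strategies do with child outputs, assuming the AgentOutput shape (`result`, `status`) described later in this guide. These are illustrative semantics, not OrKa's exact join implementation.

```python
def merge(child_outputs, strategy="concat"):
    """Merge fork-group child outputs. Illustrative readings of
    `concat` and `first_success`, not OrKa's actual join code."""
    if strategy == "concat":
        # Keep every child's result, in fork order.
        return [o["result"] for o in child_outputs]
    if strategy == "first_success":
        # Return the first child that succeeded; fail loudly otherwise.
        for o in child_outputs:
            if o.get("status") == "ok":
                return o["result"]
        raise RuntimeError("all children failed")
    raise ValueError(f"unknown merge strategy: {strategy}")

outputs = [
    {"result": "facts", "status": "ok"},
    {"result": None, "status": "error"},
]
print(merge(outputs))                            # ['facts', None]
print(merge(outputs, strategy="first_success"))  # facts
```

Note the trade-off: `concat` preserves failures for inspection, while `first_success` hides them behind the surviving result.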
03. RAG with memory decay
State without decay turns into an attic. This example shows retrieval augmented generation backed by a memory store with namespaces, vector search, and decay based on importance.
```yaml
type: sequential
id: rag_memory_decay
resources:
  memory:
    backend: redisstack
    vector_search_enabled: true
    memory_decay_config:
      enabled: true
      short_term_hours: 2
      long_term_hours: 168
nodes:
  - id: rag
    type: rag_node
    k: 5
    namespace: getting_started
  - id: memory_writer
    type: memory_writer
    importance_score: 0.7
    category: stored
  - id: answer
    type: builder
    prompt: |
      Using retrieved docs and short term context, answer succinctly.
```
Watch the memory overlay. You will see entries with `namespace`, `importance`, and the computed expiration. Important entries live longer by design. This is how you prevent stale context from running the show.
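One way to picture "important entries live longer" is a TTL that stretches with importance. The linear rule below is an assumption made for illustration; OrKa's actual scoring may differ, but the shape of the idea is the same: importance moves the expiration from the short-term window toward the long-term cap.

```python
from datetime import datetime, timedelta

def expires_at(created_at, importance, short_term_hours=2, long_term_hours=168):
    """Illustrative decay rule: importance linearly stretches the TTL
    from the short-term window toward the long-term cap.
    Not OrKa's real formula -- a sketch of the concept only."""
    ttl = short_term_hours + importance * (long_term_hours - short_term_hours)
    ttl = min(ttl, long_term_hours)  # never outlive the long-term limit
    return created_at + timedelta(hours=ttl)

t0 = datetime(2025, 8, 5, 18, 22, 12)
print(expires_at(t0, importance=0.0))  # short-term: gone in 2 hours
print(expires_at(t0, importance=1.0))  # long-term: lives the full 168 hours
```

With `importance_score: 0.7`, as in the YAML above, the entry lands well toward the long-term end of that range.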
04. Failover and retry
Agents and networks fail. This example retries a fast local model with a short backoff, then falls back to a more robust model.
```yaml
type: failover
id: failover_retry
nodes:
  - id: primary
    type: local_llm_agent
    model: fast
    retry: { attempts: 2, backoff_ms: 250 }
  - id: secondary
    type: local_llm_agent
    model: robust
  - id: join
    type: join_node
```
In a failing run you will see multiple attempts and their latencies. Cost accounting is per attempt and in total. This makes performance and economics visible.
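The retry-then-failover semantics can be sketched in plain Python. The function below is an assumption-level illustration of the pattern, not the orchestrator's code; `attempts` and `backoff_s` mirror the `retry` settings in the YAML.

```python
import time

def call_with_failover(primary, secondary, attempts=2, backoff_s=0.25):
    """Retry the primary with a short backoff, then fall back to the
    secondary. A sketch of the failover pattern, not OrKa internals."""
    for i in range(attempts):
        try:
            return primary()
        except Exception:
            if i < attempts - 1:
                time.sleep(backoff_s)  # short backoff between attempts
    return secondary()

def flaky_fast_model():
    raise TimeoutError("fast model timed out")

print(call_with_failover(flaky_fast_model, lambda: "robust answer", backoff_s=0.01))
# robust answer
```

In a trace, each failed attempt would appear as its own event with its own latency and cost, which is what makes the economics of retries visible.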
05. Loop with a budget guard
Refinement loops are powerful and risky. You need a ceiling for loops, tokens, and time. This example runs a small debate, scores agreement, and either continues or summarizes.
```yaml
type: loop
id: loop_budget
limits: { max_loops: 3, max_tokens: 3000, max_ms: 15000 }
nodes:
  - id: debate
    type: loop_node
    inner: fork_group
    agents: [progressive, conservative, realist, purist]
  - id: score
    type: builder
    prompt: "Return agreement_score float in [0,1]."
  - id: stop_router
    type: router
    routes:
      - when: previous_outputs.score.agreement_score >= 0.65
        next: summarize
      - when: true
        next: debate
  - id: summarize
    type: builder
    prompt: "Synthesize common ground in one paragraph."
```
In the trace you can watch the loop evolve. This helps you tune thresholds and loop counts before you scale up.
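The guard logic amounts to "stop on agreement or on budget, whichever comes first". The sketch below shows only the loop-count guard; the token and time ceilings from the YAML are omitted for brevity, and the toy debate is a made-up stand-in.

```python
def run_guarded_loop(step, score, max_loops=3, threshold=0.65):
    """Loop until agreement crosses the threshold or the loop budget
    is spent. Sketches the guard only; token/time ceilings omitted."""
    result, loops = None, 0
    for loops in range(1, max_loops + 1):
        result = step(loops)
        if score(result) >= threshold:
            break  # agreement reached: route to summarize
    return result, loops

# Toy debate whose agreement rises 0.2 per round.
result, loops = run_guarded_loop(
    step=lambda i: {"agreement_score": 0.3 + 0.2 * i},
    score=lambda r: r["agreement_score"],
)
print(loops)  # 2 -- stops early, since 0.3 + 0.4 = 0.7 >= 0.65
```

Lowering the threshold to 0.4 would stop the toy loop after one round; raising it past 0.9 would exhaust all three. That is the kind of tuning the trace makes cheap.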
06. UI replay with a guided tour
This example ships a golden trace and a small README that launches the guided tour. One click opens the trace. One more starts the tour. This is perfect for onboarding sessions or a quick internal demo.
The guided tour that teaches with real data
The OrKa UI tour is not a list of tooltips. It is a four step learning path bound to a real trace. Start the tour from an example page and it will load the matching trace and highlight the parts that matter.
- Execution path and timing. The tour focuses the graph and the executed nodes. You see ordering, status, and per node latency. You learn how the orchestrator advanced.
- Memory overlay. The tour opens memory and shows recent writes, TTL, importance, and namespace filters. You learn what the flow persisted and for how long.
- Resolved prompts. The tour opens the template viewer for the current node. You see the exact prompt after the system injected variables and previous outputs. You learn how data flows between agents.
- Cost and tokens. The tour opens counters and explains totals vs node level numbers. You learn the economics of the run and where time was spent.
The point is to map the mental model to the data. Once you run a few tours this way, the UI becomes predictable and you will not need the tour at all.
Contracts that keep names honest
The runtime emits JSON with strict shapes. The UI consumes those shapes. This keeps names honest across layers.
Minimal shapes
TraceEvent
```json
{
  "trace_id": "uuid",
  "engine_version": "0.8.0",
  "schema_version": "1.1",
  "events": [
    {
      "timestamp": "2025-08-05T18:22:11Z",
      "agent_id": "extract",
      "node_type": "builder",
      "input": {"text": "..."},
      "output": {"result": "...", "status": "ok"},
      "tokens": {"in": 123, "out": 456},
      "cost": {"usd": 0.0012},
      "latency_ms": 842,
      "memory_writes": [
        {"namespace": "getting_started", "category": "stored", "ttl_hours": 24, "importance": 0.7}
      ]
    }
  ]
}
```
AgentOutput
```json
{
  "result": "...",
  "status": "ok",
  "error": null,
  "metadata": { "schema": "v1" }
}
```
MemoryEntry
```json
{
  "namespace": "getting_started",
  "category": "stored",
  "importance": 0.7,
  "vector": true,
  "created_at": "2025-08-05T18:22:12Z",
  "ttl_hours": 24,
  "expires_at": "2025-08-06T18:22:12Z"
}
```
ExecutionMetadata
```json
{
  "run_id": "uuid",
  "engine_version": "0.8.0",
  "schema_version": "1.1",
  "memory_backend": "redisstack",
  "total_tokens": 579,
  "total_cost_usd": 0.0039
}
```
These are the bones. The full contracts live in the docs. The key idea is that once a field name appears in a trace, the UI must either render it or clearly ignore it. There is no silent mismatch.
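The "render it or clearly ignore it" rule is cheap to enforce with a contract check. The required and optional field sets below are taken from the TraceEvent shape above; they are a sketch, not the full contract from the docs.

```python
# Field sets derived from the TraceEvent shape shown above (a sketch,
# not the authoritative contract).
REQUIRED = {"timestamp", "agent_id", "node_type", "output", "latency_ms"}
OPTIONAL = {"input", "tokens", "cost", "memory_writes"}

def check_event(event):
    """Return (missing, unknown) field names so contract mismatches
    are loud instead of silent."""
    missing = sorted(REQUIRED - event.keys())
    unknown = sorted(event.keys() - REQUIRED - OPTIONAL)
    return missing, unknown

event = {"timestamp": "2025-08-05T18:22:11Z", "agent_id": "extract",
         "node_type": "builder", "output": {"status": "ok"},
         "latency_ms": 842, "surprise_field": 1}
print(check_event(event))  # ([], ['surprise_field'])
```

Run a check like this in CI over exported traces and a renamed field fails a build instead of quietly blanking a UI panel.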
Rules of thumb for stability
- Add `engine_version` and `schema_version` to every trace.
- Keep breaking changes behind new schema versions and provide a map for older traces.
- Treat resolved prompts as test fixtures for agents that control money or risk.
- Include memory writes in traces even when they are optional at runtime. Observability beats minimalism here.
Step by step onboarding in five minutes
You can adjust the commands to your setup, but the shape of the workflow is consistent.
```shell
# 1. Clone the repository
git clone https://github.com/marcosomma/orka-reasoning

# 2. Pick a small example
cd orka-reasoning/examples/00-hello-flow

# 3. Run with default input and export a trace
orka run . -i '@inputs/default.json' --export-trace './traces/latest.json'

# 4. Open OrKa UI and import the trace
# 5. Start the guided tour from the example page
```
You can repeat the same pattern for the router example, the fork and join example, and the loop with a budget guard. The more you run and replay, the faster the system will click.
Troubleshooting checklist
If something feels off, use this short checklist before you lose time.
- The UI import fails. Confirm that `schema_version` in your trace matches the UI version. If it is ahead, update the UI. If it is behind, use the version map from the docs.
- The router does not take the path you expect. Inspect the router node panel and read the evaluated `label` and `confidence`. Raise the threshold if the model is noisy.
- A join produces a weird shape. Open the join panel and switch the merge strategy from `concat` to `schema_merge` or `first_success`. Rerun and compare.
- Memory feels sticky. Open the overlay and check namespaces and decay settings. Maybe you did not tag entries correctly.
- Costs look high. Open counters and see which node dominates. Inspect resolved prompts for that node. You might be sending more context than needed.
Short feedback loops save days.
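For the cost question in particular, the trace events answer it programmatically. This helper assumes the per-event `cost.usd` field from the trace shape in this guide; the sample events are made up.

```python
def dominant_cost_node(events):
    """Name the node with the highest USD cost in a run.
    Assumes the per-event cost.usd field from the TraceEvent shape."""
    return max(events, key=lambda e: e.get("cost", {}).get("usd", 0.0))["agent_id"]

events = [
    {"agent_id": "classify", "cost": {"usd": 0.0004}},
    {"agent_id": "answer", "cost": {"usd": 0.0035}},
]
print(dominant_cost_node(events))  # answer
```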
Pragmatic adoption inside an existing codebase
You do not need to migrate everything. Start small and win fast.
- Wrap a single task that benefits from visibility. A classifier with a router is perfect.
- Keep the first YAML under 30 lines. There is no prize for density.
- Export traces as CI artifacts. Replaying failed runs will pay for the setup many times over.
- Use namespaces per product area. Treat memory like a database that deserves schema and hygiene.
- Add budget guards early and wire their results into alerts. A loop that silently runs forever is not a feature.
- Freeze prompts that touch money or safety. Keep them as fixtures and add lightweight unit tests that render the resolved prompt and validate structure.
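The frozen-prompt habit can start as a stdlib-only test. The template, field names, and values below are hypothetical, invented for the example; the point is the shape: render the fixture with known variables and assert on the resolved prompt.

```python
# Hypothetical frozen prompt fixture for a money-touching agent.
# Template and field names are invented for this sketch.
FIXTURE = "Refund {amount} {currency} to customer {customer_id}."

def render_prompt(template, variables):
    """Render a prompt fixture with plain str.format (stdlib only)."""
    return template.format(**variables)

prompt = render_prompt(
    FIXTURE, {"amount": "10.00", "currency": "EUR", "customer_id": "c_42"}
)
assert prompt == "Refund 10.00 EUR to customer c_42."
print("fixture ok")
```

If someone edits the fixture or a variable name, the assertion breaks before the change reaches a model that moves money.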
These habits make OrKa feel like a normal part of your stack rather than a special experiment.
Community notes and collaborative docs
Not all of the documentation is written by me. There are community contributed docs that explain how other teams think about modular cognition, the role of decay, and why confidence distributions are better than single thresholds in some cases. This is intentional. OrKa is infrastructure, not a script. Multiple voices keep the philosophy honest and the contracts clear.
The best entry points for context and deeper reading live here:
- Docs folder: https://github.com/marcosomma/orka-reasoning/tree/master/docs
- Example catalog: https://github.com/marcosomma/orka-reasoning/tree/master/examples
- Socratic example: https://github.com/marcosomma/orka-reasoning/tree/master/examples/orka_soc
If something feels confusing or under specified, open an issue with a trace snippet and a short description of the gap. Contracts improve fastest when anchored to real runs.
Where this goes next
Onboarding is a journey, not a single release. The baseline is in place. The next set of improvements is already scoped.
- More example folders that combine RAG, structured outputs, and validation.
- A router that emits a full confidence distribution and supports probabilistic branching when you want exploration.
- Schema version maps checked into the repo with tests.
- Trace diffs in OrKa UI to compare runs and spot regressions.
- Deeper docs on memory, including guidance for setting importance and pruning strategies for long lived systems.
- A simple recipe to swap memory and queue backends without editing YAML.
If you care about any of these, join the discussions in the repo. The best ideas tend to come from people who depend on the system every day.
Closing
Onboarding should not be a maze. It should be a runway. The example suite gives you speed. The guided tour in OrKa UI gives you clarity. The trace gives you truth. Use them together and you will understand OrKa in one afternoon, not one week.
Clone the repo. Run an example. Import the trace. Start the tour. Then decide if you want to route, fork, add memory, or loop with guards. You are in control and you can see what is happening.
If this guide saved you time, copy sections into your internal wiki and adapt the YAML to your use cases. If something broke, send a trace and we will fix the contract or the docs.
Happy shipping.