<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: San</title>
    <description>The latest articles on DEV Community by San (@san-tekart).</description>
    <link>https://dev.to/san-tekart</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2600838%2Fdd9574c6-50ec-4901-8bca-d283e4de3bbd.png</url>
      <title>DEV Community: San</title>
      <link>https://dev.to/san-tekart</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/san-tekart"/>
    <language>en</language>
    <item>
      <title>Using subagents as parallel perspectives on design decision</title>
      <dc:creator>San</dc:creator>
      <pubDate>Mon, 13 Apr 2026 13:43:28 +0000</pubDate>
      <link>https://dev.to/san-tekart/using-subagents-as-parallel-perspectives-on-design-decision-50l2</link>
      <guid>https://dev.to/san-tekart/using-subagents-as-parallel-perspectives-on-design-decision-50l2</guid>
      <description>&lt;p&gt;Most of us use Claude Code subagents for parallel &lt;em&gt;work&lt;/em&gt; — search this,&lt;br&gt;
refactor that, concurrent execution. Lately I've been using them for&lt;br&gt;
something different: parallel &lt;em&gt;perspectives&lt;/em&gt; on the same problem.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcytnqmltk5wauxu5q6mx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcytnqmltk5wauxu5q6mx.png" alt=" " width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you've discussed design decisions with the main agent long enough, you've hit this: the agent gets buried in the context, or the conversation grows long and it just starts agreeing with you. Either way, fresh angles stop coming.&lt;/p&gt;

&lt;p&gt;Subagents don't share that context. They come in cold — no history of the conversation, no accumulated framing — and that's exactly why they work here.&lt;/p&gt;
&lt;h2&gt;
  
  
  The pattern
&lt;/h2&gt;

&lt;p&gt;The trigger for me was designing the commands for Scaff, a lightweight AI harness I've been building as a side project.&lt;/p&gt;

&lt;p&gt;I'd frequently hit walls while discussing the design with the main agent. When I asked for its take, it either couldn't break out of the framing it had built up so far, or it just drifted toward accommodating whatever I wanted.&lt;/p&gt;

&lt;p&gt;So I asked the main agent to do this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;"Log the current discussion to &lt;code&gt;docs/discussion/&amp;lt;topic&amp;gt;.md&lt;/code&gt;."&lt;/li&gt;
&lt;li&gt;"Spawn 3 subagents with different perspectives — LLM, software architect, end-user. Write each take to &lt;code&gt;...-round-1-&amp;lt;perspective&amp;gt;.md&lt;/code&gt;."&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This produces:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;   docs/discussion/&amp;lt;topic&amp;gt;-round-1-llm.md
   docs/discussion/&amp;lt;topic&amp;gt;-round-1-architect.md
   docs/discussion/&amp;lt;topic&amp;gt;-round-1-enduser.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="3"&gt;
&lt;li&gt;Resume the discussion with the main agent using the round-1 files as fresh inputs.&lt;/li&gt;
&lt;li&gt;If it converges, ask the main agent to update the original topic doc.&lt;/li&gt;
&lt;li&gt;If open issues remain, repeat the same cycle as &lt;code&gt;...-round-2.md&lt;/code&gt;, &lt;code&gt;...-round-3.md&lt;/code&gt; — until it actually closes.&lt;/li&gt;
&lt;/ol&gt;
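&lt;p&gt;For concreteness, the file-naming convention in those steps can be captured as a tiny helper. This is a hypothetical sketch, not part of Scaff; the function name and base path are illustrative:&lt;/p&gt;

```python
# Hypothetical helper mirroring the round-file convention from the post.
# spawn/orchestration is left out; this only builds the per-round paths.
from pathlib import Path

PERSPECTIVES = ["llm", "architect", "enduser"]

def round_paths(topic, round_no, base="docs/discussion"):
    """Paths each subagent writes its take to in a given round."""
    return [Path(base) / f"{topic}-round-{round_no}-{p}.md" for p in PERSPECTIVES]

paths = [str(p) for p in round_paths("overview-loading", 1)]
# e.g. docs/discussion/overview-loading-round-1-llm.md, ...
```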

&lt;h2&gt;
  
  
  Three perspectives
&lt;/h2&gt;

&lt;p&gt;The key is the three parallel perspectives in step 2.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;LLM perspective: "Is this easy for the model to parse?"&lt;/li&gt;
&lt;li&gt;Architect perspective: "Will this design still hold up in a year?"&lt;/li&gt;
&lt;li&gt;End-user perspective: "Can I understand this in five minutes?"&lt;/li&gt;
&lt;/ul&gt;
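&lt;p&gt;One way to keep the perspectives distinct is to treat them as data: one guiding question per subagent, prepended to a cold-start prompt. A hypothetical sketch (the helper and prompt wording are mine, not Scaff's):&lt;/p&gt;

```python
# One guiding question per perspective, taken from the list above.
PERSPECTIVE_PROMPTS = {
    "llm": "Is this easy for the model to parse?",
    "architect": "Will this design still hold up in a year?",
    "enduser": "Can I understand this in five minutes?",
}

def prompt_for(perspective, topic_doc):
    """Build a cold-start prompt: the topic doc plus one guiding question."""
    question = PERSPECTIVE_PROMPTS[perspective]
    return f"Read {topic_doc} with fresh eyes. Guiding question: {question}"

p = prompt_for("architect", "docs/discussion/overview-loading.md")
```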

&lt;p&gt;The three subagents come at the problem from genuinely different angles. More detail below.&lt;/p&gt;

&lt;p&gt;The side effect is great too.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;...-round-1.md&lt;/code&gt;, &lt;code&gt;...-round-2.md&lt;/code&gt; files persist on disk, so the discussion doesn't evaporate from the conversation — it &lt;em&gt;accumulates as documentation&lt;/em&gt;. Three days later, when you ask "wait, why did I settle on this?", the answer is right there.&lt;/p&gt;

&lt;p&gt;Leaving a trace of decisions — that alone makes the pattern worth using.&lt;/p&gt;

&lt;h2&gt;
  
  
  How it played out
&lt;/h2&gt;

&lt;p&gt;Here's how it played out on one design question. I can't include everything, but I've excerpted the moments that were most interesting.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Topic:&lt;/strong&gt; Should we introduce a separate command for loading &lt;code&gt;OVERVIEW.md&lt;/code&gt;?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;User:&lt;/strong&gt; What if the &lt;code&gt;/scaff:scout&lt;/code&gt; command auto-loaded &lt;code&gt;OVERVIEW.md&lt;/code&gt; — a project-level overview doc — on every session start? What do you think?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Main agent:&lt;/strong&gt; Good direction. Having the overview available from the start would give better architectural grounding for every task.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;User:&lt;/strong&gt; But scout is often the starting point for a work session, and OVERVIEW.md can get long. End users might feel Scaff is heavier than its philosophy suggests. What about a dedicated &lt;code&gt;/scaff:overview load&lt;/code&gt; so it's only pulled in when needed?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Main agent:&lt;/strong&gt; Good point. A &lt;code&gt;/scaff:overview load&lt;/code&gt; subcommand that loads on demand would help save tokens.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;User:&lt;/strong&gt; But for most other commands that work fine without it, that's extra token cost.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Main agent:&lt;/strong&gt; If we just give &lt;code&gt;/scaff:scout&lt;/code&gt; an affordance — let it know &lt;code&gt;docs/OVERVIEW.md&lt;/code&gt; exists and let the agent decide whether to read it — that would solve the unnecessary token cost.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;User:&lt;/strong&gt; But in my experience, the LLM will be biased toward reading &lt;code&gt;OVERVIEW.md&lt;/code&gt; anyway.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;User:&lt;/strong&gt; Let's hear from the subagents. Log the current discussion to &lt;code&gt;docs/discussion/&amp;lt;topic&amp;gt;.md&lt;/code&gt;. Then spawn LLM / software architect / end-user subagents and write each take to &lt;code&gt;...-round-1-&amp;lt;perspective&amp;gt;.md&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  LLM perspective
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;The bias is real, and the user has correctly diagnosed it.&lt;/em&gt; LLMs trained on helpfulness are systematically biased toward "more context = better answer." The model's internal cost function treats a false negative (missed read) as far more painful than a false positive (unnecessary read). Skipping feels like negligence, reading feels like diligence.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;(...it walked through concrete examples and estimated the soft rule would fire on 6–8 out of 10 typical invocations.)&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Gating must key off &lt;em&gt;literal tokens in the user's input&lt;/em&gt;, not the model's interpretation of task semantics. Interpretation always drifts toward "read more."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Architect perspective
&lt;/h3&gt;

&lt;p&gt;The architect didn't touch execution at all. It went after the document structure: CONTEXT.md is "what's being worked on right now," OVERVIEW.md is "big picture that rarely changes" — but read literally, that's exactly what you'd want at session start.&lt;/p&gt;

&lt;p&gt;(...it re-diagnosed this as a definitional collision, not a loading problem.)&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If OVERVIEW.md cannot answer "who reads me, when, and why," the fix is to delete the role, not to invent a loader for it.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Same topic, completely different layers. The LLM pointed at execution bias. The architect pointed at "should this document even exist?"&lt;/p&gt;

&lt;h3&gt;
  
  
  End-user perspective
&lt;/h3&gt;

&lt;p&gt;The end-user perspective came in from yet another angle — user behavior:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;A soft rule like "read OVERVIEW.md when the task touches architecture" sounds disciplined on paper, but in practice the LLM's threshold for "touches architecture" is fuzzy and self-serving — when in doubt, it reads.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;(...after honestly checking how often users actually ask "what's the big picture?" mid-workflow — rarely — it concluded that frequency doesn't justify automation.)&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Reject the soft rule entirely. That phrasing is exactly the fuzzy trigger the user is warning about, and it will collapse into always-load within a week of real use.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What converged
&lt;/h2&gt;

&lt;p&gt;After the three perspectives came back, I resumed with the main agent. The architect's "role collision" diagnosis turned out to be a naming collision — CONTEXT.md's first heading was &lt;code&gt;# Project Overview&lt;/code&gt;, which made the two docs look like they overlapped. Renaming it to &lt;code&gt;# Working Context&lt;/code&gt; fixed it. Subagents can misdiagnose too, but even that surfaced the real issue.&lt;/p&gt;

&lt;p&gt;The soft rule was scrapped. All three perspectives rejected it, and I agreed. OVERVIEW.md loading switched to reactive triggers — it only suggests loading when specific events fire, not on every session. The user decides.&lt;/p&gt;

&lt;p&gt;Then a new problem: where does this reactive-trigger principle live? The main agent suggested &lt;code&gt;scaff-subagent&lt;/code&gt;, but that skill is specifically for subagent delegation. Deciding when to read &lt;code&gt;OVERVIEW.md&lt;/code&gt; is main-agent workflow, not subagent work. So we created the &lt;code&gt;scaff-flow&lt;/code&gt; skill.&lt;/p&gt;

&lt;p&gt;Once &lt;code&gt;scaff-flow&lt;/code&gt; existed, we noticed that the document sync guidelines scattered across individual command files — when to suggest &lt;code&gt;/scaff:design sync&lt;/code&gt;, when to suggest &lt;code&gt;/scaff:context sync&lt;/code&gt; — had the same character. Those went into &lt;code&gt;scaff-flow&lt;/code&gt; too. In the end, &lt;code&gt;scaff-flow&lt;/code&gt; became a collection of principles for the main agent to autonomously drive a Scaff project — a skill that should work better as AI improves.&lt;/p&gt;

&lt;p&gt;The original discussion was "when should OVERVIEW.md be read?" It ended with a new skill that collects main-agent workflow principles — and should age well as models get better at autonomous decisions.&lt;/p&gt;

&lt;p&gt;Scaff is &lt;a href="https://github.com/opellen/scaff" rel="noopener noreferrer"&gt;on GitHub&lt;/a&gt; if you want to look around.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>claude</category>
      <category>programming</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Building Real-time Web Applications with PynneX and FastAPI</title>
      <dc:creator>San</dc:creator>
      <pubDate>Sun, 02 Feb 2025 07:35:23 +0000</pubDate>
      <link>https://dev.to/san-tekart/fastapisocketio-stock-monitor-with-python-workers-3pfc</link>
      <guid>https://dev.to/san-tekart/fastapisocketio-stock-monitor-with-python-workers-3pfc</guid>
      <description>&lt;h1&gt;
  
  
  Building Real-time Web Applications with PynneX and FastAPI
&lt;/h1&gt;

&lt;p&gt;I've created three examples demonstrating how to build real-time web applications using Python worker threads and event-driven patterns. Rather than fully replacing established solutions like Celery or Redis, this approach aims to offer a lighter alternative for scenarios where distributed task queues may be overkill. No locks, no manual concurrency headaches — just emitters in the worker and listeners on the main thread or other workers.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why PynneX?
&lt;/h2&gt;

&lt;p&gt;While there are several solutions for handling concurrent tasks in Python, each comes with its own trade-offs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Celery&lt;/strong&gt;: Powerful for distributed tasks but might be overkill for simpler scenarios&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Redis&lt;/strong&gt;: Great as an in-memory data store, though it adds external dependencies&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RxPY&lt;/strong&gt;: Comprehensive reactive programming but has a steeper learning curve&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;asyncio.Queue&lt;/strong&gt;: Basic but needs manual implementation of high-level patterns&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Qt's Signals &amp;amp; Slots&lt;/strong&gt;: Excellent pattern but tied to GUI frameworks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;PynneX takes the proven emitter-listener pattern and makes it seamlessly work with asyncio for general Python applications:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Lightweight&lt;/strong&gt;: No external dependencies beyond Python stdlib&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Focused&lt;/strong&gt;: Designed specifically for thread-safe communication between threads&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Simple&lt;/strong&gt;: Clean and intuitive through declarative event handling&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Flexible&lt;/strong&gt;: Not tied to any UI framework or architecture&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For simpler scenarios where you just need clean thread communication without distributed task queues, PynneX provides a lightweight alternative.&lt;/p&gt;
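&lt;p&gt;To make the pattern concrete without depending on PynneX itself, here is a stdlib-only sketch of the emitter-listener idea: emit from any thread, dispatch on the consuming thread. The class and method names are illustrative, &lt;em&gt;not&lt;/em&gt; PynneX's actual API:&lt;/p&gt;

```python
# Minimal emitter-listener sketch using only the stdlib (queue + threading).
# Illustrates the pattern PynneX provides; names here are made up for the sketch.
import queue
import threading

class Emitter:
    """Delivers events from a worker thread to listeners via a thread-safe queue."""
    def __init__(self):
        self._queue = queue.Queue()
        self._listeners = []

    def connect(self, listener):
        self._listeners.append(listener)

    def emit(self, value):
        self._queue.put(value)          # safe to call from any thread

    def dispatch(self):
        """Drain pending events on the consuming thread."""
        while True:
            try:
                value = self._queue.get_nowait()
            except queue.Empty:
                break
            for listener in self._listeners:
                listener(value)

results = []
updated = Emitter()
updated.connect(results.append)

def worker():
    for i in range(3):
        updated.emit(i * 10)            # emitted from the worker thread

t = threading.Thread(target=worker)
t.start()
t.join()
updated.dispatch()                      # consumed on the main thread
```

&lt;p&gt;The queue is the only shared state, so neither side takes a lock — which is the "no manual concurrency headaches" property the article describes.&lt;/p&gt;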

&lt;h2&gt;
  
  
  🍓 1. Berry Checker (Basic)
&lt;/h2&gt;

&lt;p&gt;A minimal example showing the core concepts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Worker thread for background processing&lt;/li&gt;
&lt;li&gt;WebSocket real-time updates&lt;/li&gt;
&lt;li&gt;Event-driven task handling&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://youtube.com/shorts/dOYNtHUgAvA" rel="noopener noreferrer"&gt;Demo&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://youtube.com/shorts/dOYNtHUgAvA" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9jfk5dr7brm51z7k77aq.jpg" alt="Demo" width="320" height="180"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/nexconnectio/pynnex/blob/main/examples/fastapi_socketio_simple.py" rel="noopener noreferrer"&gt;View Code&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  📱 2. QR Code Generator (Intermediate)
&lt;/h2&gt;

&lt;p&gt;Building on the basic concepts and adding:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Real-time image generation&lt;/li&gt;
&lt;li&gt;Base64 image encoding/decoding&lt;/li&gt;
&lt;li&gt;Clean Controller-Worker pattern&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Thread safety comes for free: the worker generates QR codes and emits them, the main thread listens and updates the UI. No manual synchronization needed.&lt;/p&gt;
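&lt;p&gt;A stdlib-only sketch of that Controller-Worker handoff (the QR payload is simulated with a string; this shows the flow of data between threads, not PynneX's API or any real QR rendering):&lt;/p&gt;

```python
# Worker generates a payload and emits it; the main thread consumes it.
# A plain queue.Queue is the thread-safe channel in both directions.
import queue
import threading

tasks = queue.Queue()      # main thread -> worker
emitted = queue.Queue()    # worker -> main thread

def worker():
    while True:
        text = tasks.get()
        if text is None:   # sentinel: stop the worker
            break
        # "generate" the image; a real worker would render a QR code here
        emitted.put(f"qr-png-for:{text}")

t = threading.Thread(target=worker, daemon=True)
t.start()

tasks.put("https://example.com")
tasks.put(None)
t.join()

# main thread "updates the UI" by consuming the emitted result
result = emitted.get_nowait()
```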

&lt;p&gt;&lt;a href="https://youtube.com/shorts/Yr5t6143sbQ" rel="noopener noreferrer"&gt;Demo&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://youtube.com/shorts/Yr5t6143sbQ" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh8z3pynvjwc5un8jb4rg.jpg" alt="Demo" width="320" height="180"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/nexconnectio/pynnex/blob/main/examples/fastapi_socketio_stock_monitor.py" rel="noopener noreferrer"&gt;View Code&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  📈 3. Stock Monitor (Advanced)
&lt;/h2&gt;

&lt;p&gt;A full-featured example showcasing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Multiple worker threads&lt;/li&gt;
&lt;li&gt;Interactive data grid (ag-Grid)&lt;/li&gt;
&lt;li&gt;Real-time charts (eCharts)&lt;/li&gt;
&lt;li&gt;Price alert system&lt;/li&gt;
&lt;li&gt;Clean architecture&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://youtube.com/shorts/SHthW2KCtmM" rel="noopener noreferrer"&gt;Demo&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://youtube.com/shorts/SHthW2KCtmM" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnmq9baz3lhqntsnce4lc.jpg" alt="Demo" width="320" height="180"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/nexconnectio/pynnex/blob/main/examples/fastapi_socketio_stock_monitor.py" rel="noopener noreferrer"&gt;View Code&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Quick Start
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Clone repository
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/nexconnectio/pynnex.git
&lt;span class="nb"&gt;cd &lt;/span&gt;pynnex
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Install dependencies
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;fastapi python-socketio uvicorn
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Run any example
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python examples/fastapi_socketio_simple.py
python examples/fastapi_socketio_qr.py
python examples/fastapi_socketio_stock_monitor.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then open &lt;a href="http://localhost:8000" rel="noopener noreferrer"&gt;http://localhost:8000&lt;/a&gt; in your browser.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Features
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Python worker threads for background processing&lt;/li&gt;
&lt;li&gt;WebSocket for real-time updates&lt;/li&gt;
&lt;li&gt;Event-driven architecture with emitter-listener pattern&lt;/li&gt;
&lt;li&gt;Clean separation of concerns&lt;/li&gt;
&lt;li&gt;No complex dependencies&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Technical Details
&lt;/h2&gt;

&lt;p&gt;PynneX provides a lightweight layer for:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Emitter-listener pattern for event handling&lt;/li&gt;
&lt;li&gt;Worker thread management&lt;/li&gt;
&lt;li&gt;Thread-safe task queuing&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Built with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;FastAPI for the web framework&lt;/li&gt;
&lt;li&gt;SocketIO for WebSocket communication&lt;/li&gt;
&lt;li&gt;Python's built-in threading and asyncio&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Learn More
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/nexconnectio/pynnex/tree/main/docs" rel="noopener noreferrer"&gt;PynneX Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/nexconnectio/pynnex" rel="noopener noreferrer"&gt;GitHub Repository&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://pypi.org/project/pynnex" rel="noopener noreferrer"&gt;PyPI Package&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The examples above demonstrate how to build real-time web applications with clean thread communication patterns, without the complexity of traditional task queue systems.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>asyncio</category>
      <category>fastapi</category>
      <category>socketio</category>
    </item>
  </channel>
</rss>
