Alexander Ivanov

Posted on May 29

I Built a Python Prompt Orchestrator for Structured LLM Pipelines

#ai #llm #rag #promptengineering

Most LLM applications eventually hit the same problem:

prompts become unmanageable.

At first, everything fits into a single string.

Then you add:

summaries
RAG
memory
safety checks
token budgets
conversation compaction
provider switching

And suddenly your prompt pipeline becomes harder to maintain than the rest of the application.

So I built prompt_orchestrator.

What is it?

prompt_orchestrator is a Python module for structured prompt orchestration with:

static/semi-stable/dynamic prompt layout
configurable summarization providers
optional RAG integration
safety heuristics
token budgeting
centralized configuration
prompt efficiency analysis

The goal was simple:

Make prompt pipelines deterministic, modular, and production-friendly.

Structured prompt sections

The orchestrator separates prompts into:

static parts
semi-stable parts
dynamic conversation context

Separating content by how often it changes improves:

cacheability
token efficiency
prompt readability
debugging
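To make the layout idea concrete, here is a minimal sketch of a three-tier prompt container. The class name `PromptLayout`, its fields, and the examples of what goes in each tier are my own assumptions for illustration, not the actual prompt_orchestrator API. The point it shows: when the least-changing text is rendered first, the start of the prompt stays identical between calls, which is what prefix-based caches rely on.

```python
from dataclasses import dataclass, field


@dataclass
class PromptLayout:
    """Hypothetical container, not the prompt_orchestrator API."""
    static: list[str] = field(default_factory=list)       # e.g. system rules
    semi_stable: list[str] = field(default_factory=list)  # e.g. summaries, retrieved docs
    dynamic: list[str] = field(default_factory=list)      # e.g. latest conversation turns

    @staticmethod
    def _join(parts: list[str]) -> str:
        # Drop empty sections and separate the rest with a blank line.
        return "\n\n".join(p.strip() for p in parts if p.strip())

    def stable_prefix(self) -> str:
        # The part that should not change between calls.
        return self._join(self.static)

    def render(self) -> str:
        # Most stable content first, most volatile content last.
        return self._join([*self.static, *self.semi_stable, *self.dynamic])


layout = PromptLayout(
    static=["You are a support assistant."],
    semi_stable=["Summary: the user asked about billing."],
    dynamic=["User: can I get a refund?"],
)
prompt = layout.render()
```

Changing only `layout.dynamic` leaves `stable_prefix()` untouched, which also makes diffs between two rendered prompts easier to read when debugging.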

Works with or without RAG

The module supports optional RAG providers.

It integrates directly with rag_orchestrator and compatible retrieval systems.

One particularly useful detail:

Both projects share a compatible DocChunk structure.

Retrieved chunks can therefore be handed to the orchestrator without a conversion layer in between.
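A rough sketch of what a shared chunk type buys you. Only the name `DocChunk` comes from the projects; the fields (`text`, `source`, `score`, `metadata`), the `RagProvider` protocol, and `format_context` are assumptions made up for this example and may not match the real schema.

```python
from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class DocChunk:
    # Field names are assumptions for illustration, not the real schema.
    text: str
    source: str = ""
    score: float = 0.0
    metadata: dict = field(default_factory=dict)


class RagProvider(Protocol):
    """Hypothetical interface: anything that returns DocChunk objects fits."""

    def retrieve(self, query: str, top_k: int) -> list[DocChunk]: ...


def format_context(chunks: list[DocChunk], max_chunks: int = 3) -> str:
    """Render the best-scoring chunks as a numbered context section."""
    best = sorted(chunks, key=lambda c: c.score, reverse=True)[:max_chunks]
    return "\n".join(f"[{i}] ({c.source}) {c.text}" for i, c in enumerate(best, 1))
```

Because the retriever and the prompt builder agree on one type, the output of `retrieve` can be passed to `format_context` as-is.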

Safety checks included

The project includes lightweight safety heuristics for:

injection detection
contradiction checks

without requiring a separate moderation service.
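As an illustration of what a lightweight injection heuristic can look like, here is a small pattern-based check. The patterns and the function name `injection_hits` are my own examples; the project's actual rules are not shown in this post, and a regex list like this will miss plenty of real attacks.

```python
import re

# Illustrative patterns only, not the rules shipped with the project.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.I),
    re.compile(r"disregard (the |your )?(system|previous) prompt", re.I),
    re.compile(r"reveal (the |your )?(system prompt|hidden instructions)", re.I),
]


def injection_hits(text: str) -> list[str]:
    """Return the matched phrases so the caller can log, flag, or block."""
    hits = []
    for pattern in INJECTION_PATTERNS:
        match = pattern.search(text)
        if match:
            hits.append(match.group(0))
    return hits
```

Returning the matched phrases rather than a bare boolean keeps the decision (warn, strip, refuse) with the caller.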

Summary providers

Supported summary backends:

OpenAI
Ollama
deterministic local fallback
custom providers

So the orchestration layer is not tied to a single vendor.
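One way to keep the summary step vendor-neutral is a small interface plus a local fallback. This is a sketch under my own naming (`SummaryProvider`, `TruncatingSummary`, `summarize_with_fallback`), not the module's real classes, and the fallback here simply truncates, which may differ from what the project's deterministic fallback does.

```python
from typing import Protocol


class SummaryProvider(Protocol):
    """Hypothetical interface a hosted or local backend could implement."""

    def summarize(self, text: str, max_chars: int) -> str: ...


class TruncatingSummary:
    """Deterministic local fallback: no network, same output for same input."""

    def summarize(self, text: str, max_chars: int) -> str:
        collapsed = " ".join(text.split())  # normalize whitespace
        if len(collapsed) <= max_chars:
            return collapsed
        return collapsed[: max(max_chars - 3, 0)].rstrip() + "..."


def summarize_with_fallback(text: str, primary: SummaryProvider,
                            fallback: SummaryProvider, max_chars: int = 200) -> str:
    # If the primary backend fails for any reason, use the local one instead.
    try:
        return primary.summarize(text, max_chars)
    except Exception:
        return fallback.summarize(text, max_chars)
```

A provider that wraps a hosted model only has to expose `summarize`, so swapping vendors does not touch the pipeline code.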

Token-aware orchestration

The orchestrator includes:

token counting via tiktoken
automatic trimming
prompt fitting
configurable token budgets

which becomes critical for long-running conversations, where the history eventually outgrows the model's context window.
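A minimal sketch of budget-based trimming, assuming the simplest policy: pay for the fixed sections first, then keep as many of the newest turns as still fit. The function names are hypothetical, and the default counter is a crude word count so the example runs without extra dependencies. With tiktoken installed you could pass something like `lambda t: len(tiktoken.get_encoding("cl100k_base").encode(t))` as `count` instead.

```python
from typing import Callable


def approx_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: whitespace-separated words.
    return len(text.split())


def fit_to_budget(fixed: str, turns: list[str], budget: int,
                  count: Callable[[str], int] = approx_tokens) -> list[str]:
    """Keep the newest turns that fit after the fixed sections are paid for."""
    remaining = budget - count(fixed)
    kept: list[str] = []
    for turn in reversed(turns):  # walk from newest to oldest
        cost = count(turn)
        if cost > remaining:
            break  # stop at the first turn that does not fit
        kept.append(turn)
        remaining -= cost
    kept.reverse()  # restore chronological order
    return kept


history = ["one two three", "four five", "six"]
trimmed = fit_to_budget("system rules here", history, budget=6)
```

This ignores separator tokens and per-message overhead, so a real budget should leave some headroom.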

Designed for integration

The module is designed to slot into existing systems.

It does not force:

a framework
an agent runtime
a specific LLM provider
a database stack

Tests and simulations

The repository already includes:

interactive simulations
safety simulations
conversation replay tests
console pipelines

which makes experimentation easy.

Installation

From a local clone of the repository:

pip install -e .

Final thoughts

A lot of current LLM tooling focuses on:

agents
autonomous loops
framework ecosystems

But prompt orchestration itself is still a largely unsolved infrastructure problem.

This project focuses specifically on making that layer cleaner and easier to reason about.

Top comments (1)

Harjot Singh • May 31

A prompt orchestrator for structured pipelines is exactly the right layer to build, because the moment you go past a single prompt you need the things ad-hoc chains lack: typed/validated outputs between steps, retries on schema-mismatch, branching, and a way to inspect what each stage actually produced. The "structured" part is the key word - if step 2 consumes step 1's output, that handoff has to be validated, not hoped-for, or one malformed JSON quietly poisons the rest of the pipeline. Orchestration is where the reliability actually lives; the individual prompts are the easy part.

This is squarely the space I build in, so I'm genuinely curious about your design choices. Moonshift, the thing I work on, is a multi-agent pipeline that takes a prompt to a deployed SaaS, and the lessons that kept biting me are: validate the structured output at each boundary (don't trust the model's JSON), make steps idempotent/retryable, and cap cost/steps so a pipeline can't run away. Multi-model routing keeps a full build ~$3 flat, first run free no card. Nice build. Two questions: do you validate/enforce the schema between stages (Pydantic-style), or trust the model to return well-formed structure? And do you route different steps to different models, or one model for the whole pipeline? The per-step model routing is a big cost/quality lever once the pipeline's real.