Most LLM applications eventually hit the same problem:
prompts become unmanageable.
At first, everything fits into a single string.
Then you add:
- summaries
- RAG
- memory
- safety checks
- token budgets
- conversation compaction
- provider switching
And suddenly your prompt pipeline becomes harder to maintain than the model itself.
So I built prompt_orchestrator.
What is it?
prompt_orchestrator is a Python module for structured prompt orchestration with:
static/semi-stable/dynamic prompt layout
configurable summarization providers
optional RAG integration
safety heuristics
token budgeting
centralized configuration
prompt efficiency analysis
The goal was simple:
Make prompt pipelines deterministic, modular, and production-friendly.
Structured prompt sections
The orchestrator separates prompts into:
- static parts
- semi-stable parts
- dynamic conversation context
This improves:
- cacheability
- token efficiency
- prompt readability
- debugging
Works with or without RAG
The module supports optional RAG providers.
It integrates directly with rag_orchestrator and compatible retrieval systems.
One particularly useful detail:
Both projects share a compatible DocChunk structure.
This makes integration extremely simple.
Safety checks included
The project includes lightweight safety heuristics for:
- injection detection
- contradiction checks
without requiring a separate moderation service.
Summary providers
Supported summary backends:
- OpenAI
- Ollama
- deterministic local fallback
- custom providers
So the orchestration layer is not tied to a single vendor.
Token-aware orchestration
The orchestrator includes:
- token counting via tiktoken
- automatic trimming
- prompt fitting
- configurable token budgets
which becomes critical for long-running conversations.
Designed for integration
The module was intentionally designed to integrate into existing systems.
It does not force:
- a framework
- an agent runtime
- a specific LLM provider
- a database stack
Tests and simulations
The repository already includes:
- interactive simulations
- safety simulations
- conversation replay tests
- console pipelines
which makes experimentation easy.
Installation
pip install -e .
Final thoughts
A lot of current LLM tooling focuses on:
- agents
- autonomous loops
- framework ecosystems
But prompt orchestration itself is still an unsolved infrastructure problem.
This project focuses specifically on making that layer cleaner and easier to reason about.
Top comments (1)
A prompt orchestrator for structured pipelines is exactly the right layer to build, because the moment you go past a single prompt you need the things ad-hoc chains lack: typed/validated outputs between steps, retries on schema-mismatch, branching, and a way to inspect what each stage actually produced. The "structured" part is the key word - if step 2 consumes step 1's output, that handoff has to be validated, not hoped-for, or one malformed JSON quietly poisons the rest of the pipeline. Orchestration is where the reliability actually lives; the individual prompts are the easy part.
This is squarely the space I build in, so I'm genuinely curious about your design choices. Moonshift, the thing I work on, is a multi-agent pipeline that takes a prompt to a deployed SaaS, and the lessons that kept biting me are: validate the structured output at each boundary (don't trust the model's JSON), make steps idempotent/retryable, and cap cost/steps so a pipeline can't run away. Multi-model routing keeps a full build ~$3 flat, first run free no card. Nice build. Two questions: do you validate/enforce the schema between stages (Pydantic-style), or trust the model to return well-formed structure? And do you route different steps to different models, or one model for the whole pipeline? The per-step model routing is a big cost/quality lever once the pipeline's real.