Kiran Reddy Duvvuru

I Reduced My System Prompt Tokens by 70% Using a Custom Prompt DSL

What if we've been writing prompts the wrong way?

For the last two years, the AI community has focused on writing better prompts.

We've all written prompts that look something like this:

You are an expert product strategist, UX designer, SEO consultant, copywriter, researcher, workflow architect...

Then we add:

  • Rules
  • Constraints
  • Output formats
  • Safety instructions
  • Examples
  • Exceptions
  • More rules

Soon a single system prompt grows to anywhere from 500 to 5,000 tokens.

While building MiniMind AI, I started asking a different question:

Why are we writing long English essays to LLMs?

Why not create a structured instruction language specifically designed for AI systems?

A language that:

  • Is compact
  • Is machine-friendly
  • Can be compressed
  • Is easy to generate automatically
  • Saves tokens

That question led to an experiment that eventually became:

  • Prompt Builder
  • Prompt Optimizer
  • Prompt Architect Suite Workflow
  • CSP v1 (Compressed System Prompt)

Existing Prompt Compression Research

Prompt compression is already an active research field.

Microsoft's LLMLingua project demonstrated that prompts can often be compressed significantly while preserving performance.

The LLMLingua family introduced:

  • LLMLingua
  • LongLLMLingua
  • LLMLingua-2

along with a growing ecosystem of prompt optimization techniques.

The goal of most prompt compression research is:

Remove unnecessary tokens while preserving meaning.

My experiment explored something slightly different:

Replace verbose English instructions with a compact AI-readable DSL.


The Experiment

I created three versions of the same system prompt.

The task:

Generate a complete SaaS landing page structure for MiniMind AI.


Version A — Traditional English Prompt

You are an expert AI product designer, SaaS landing page strategist,
UX copywriter, SEO advisor, and conversion optimization specialist.

You should generate a complete SaaS landing page structure.

You must include:
- Hero
- Problem
- Solution
- Features
- Benefits
- Use Cases
- FAQ
- SEO Metadata

You should maintain a practical tone.
You should avoid hype.
You should mention human-in-the-loop workflows.
You should mention exports to PDF, JSON, and Markdown.

...

Estimated Size:

~683 tokens


Version B — Structured Prompt

Instead of large paragraphs:

ROLE:
  primary: SaaS_Landing_Page_Strategist

CONTEXT:
  brand: MiniMind AI

MANDATORY_PAGE_SECTIONS:
  - hero
  - problem
  - solution
  - features
  - benefits
  - faq

STYLE:
  tone: practical
  format: markdown

CONSTRAINTS:
  - no_fake_claims
  - seo_friendly

Estimated Size:

~494 tokens

Reduction:

~28%


Version C — CSP v1 (Compressed System Prompt)

This was the interesting one.

CSP:v1

ROLE=SaaSLandingStrategist|UXCopy|SEO

GOAL=Generate SaaS landing page

CTX=[
 MiniMindAI,
 human_in_loop,
 exports
]

RULES=[
 practical_tone,
 seo_friendly,
 markdown_output,
 no_fake_claims
]

OUT={
 sections:[
  hero,
  problem,
  solution,
  features,
  benefits,
  faq,
  seo
 ]
}

Estimated Size:

~215 tokens

Reduction:

~69%
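
The reduction figures for Versions B and C follow directly from the estimated sizes quoted for each version. A minimal Python check of that arithmetic, using only those three estimates (the function and constant names are mine, chosen for illustration):

```python
def reduction_percent(original_tokens: int, optimized_tokens: int) -> float:
    """Return the percentage of tokens removed relative to the original."""
    if original_tokens <= 0:
        raise ValueError("original_tokens must be positive")
    return (original_tokens - optimized_tokens) / original_tokens * 100


# Estimated sizes quoted in this post.
VERSION_A = 683  # traditional English prompt
VERSION_B = 494  # structured prompt
VERSION_C = 215  # CSP v1

reduction_b = reduction_percent(VERSION_A, VERSION_B)
reduction_c = reduction_percent(VERSION_A, VERSION_C)

print(f"Version B: ~{reduction_b:.0f}% smaller")  # ~28%
print(f"Version C: ~{reduction_c:.0f}% smaller")  # ~69%
```

Both percentages are measured against Version A, so the step from Version B to Version C is a further cut on top of the first one.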


What I Expected

I expected Version C to fail.

Honestly.

I assumed the model would struggle with compressed syntax.

I expected:

  • Missing sections
  • Reduced quality
  • Poor instruction following

I was wrong.


The Result

Gemini successfully generated:

  • Hero section
  • Problem section
  • Solution section
  • Features
  • Benefits
  • Use cases
  • FAQ
  • SEO metadata

It correctly understood:

  • Human-in-the-loop workflows
  • Web research
  • Structured outputs
  • PDF exports
  • JSON exports
  • Markdown exports

All from a prompt that was roughly 70% smaller.

The output wasn't identical.

Some nuance was lost.

But the quality remained surprisingly high.


Why Did This Work?

Modern LLMs are trained on enormous quantities of:

  • JSON
  • YAML
  • XML
  • Source code
  • API specifications
  • Configuration files
  • GitHub repositories

A structure like:

RULES=[
 no_fake_claims,
 seo_friendly,
 markdown_output
]

may actually be easier for a model to interpret than several paragraphs of prose.

My working hypothesis is that the model isn't simply reading language.

It's building an internal representation of constraints, and a structured format states those constraints directly.


The Real Insight

I no longer think the interesting idea is:

Prompt Compression

I think the more interesting idea is:

Prompt DSL

A domain-specific language for AI instructions.

Instead of:

You are an expert competitor research assistant.
Use web search.
Always cite sources.
Do not hallucinate.
Generate markdown output.
Require approval.

We might write:

ROLE=CompetitorResearch

TOOLS=[
 web_search
]

RULES=[
 cite_sources,
 markdown_output
]

FORBID=[
 hallucinations
]

APPROVAL=true
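
One reason a DSL like this is easy to generate automatically is that it maps cleanly onto ordinary data structures. The sketch below is not the MiniMind AI implementation; it is a hypothetical `render_csp` helper that assumes scalars render as `KEY=value`, lists render as `KEY=[...]`, and every prompt starts with a `CSP:v1` header line.

```python
from typing import Mapping, Sequence, Union

Value = Union[str, bool, Sequence[str]]


def render_csp(fields: Mapping[str, Value], version: str = "v1") -> str:
    """Render a mapping as CSP-style text (hypothetical helper, not an official API)."""
    blocks = [f"CSP:{version}"]
    for key, value in fields.items():
        if isinstance(value, bool):
            # Booleans become lowercase literals, e.g. APPROVAL=true.
            blocks.append(f"{key}={'true' if value else 'false'}")
        elif isinstance(value, str):
            blocks.append(f"{key}={value}")
        else:
            # Any other sequence becomes a multi-line bracketed list.
            items = ",\n".join(f" {item}" for item in value)
            blocks.append(f"{key}=[\n{items}\n]")
    return "\n\n".join(blocks)


prompt = render_csp(
    {
        "ROLE": "CompetitorResearch",
        "TOOLS": ["web_search"],
        "RULES": ["cite_sources", "markdown_output"],
        "FORBID": ["hallucinations"],
        "APPROVAL": True,
    }
)
print(prompt)
```

Because the input is a plain dictionary, the same structure could come from a form, a visual editor, or another program.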

Prompt DSL vs Traditional English

| Metric | English | DSL |
| --- | --- | --- |
| Human Friendly | Excellent | Good |
| Token Efficiency | Poor | Excellent |
| Machine Readability | Good | Excellent |
| Version Control | Moderate | Excellent |
| Auto Generation | Difficult | Easy |
| Visual Editor Friendly | Difficult | Excellent |
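
To illustrate the version-control point: when each rule sits on its own line, adding a rule shows up as a one-line change in a diff. A small sketch using Python's standard `difflib` module; the two prompt versions and the `.csp` file names are made up for the example.

```python
import difflib

old_prompt = """\
ROLE=CompetitorResearch

RULES=[
 cite_sources,
 markdown_output
]
"""

new_prompt = """\
ROLE=CompetitorResearch

RULES=[
 cite_sources,
 separate_facts_assumptions,
 markdown_output
]
"""

# Compare the two versions line by line, as a version-control tool would.
diff = list(
    difflib.unified_diff(
        old_prompt.splitlines(),
        new_prompt.splitlines(),
        fromfile="prompt_v1.csp",  # hypothetical file names
        tofile="prompt_v2.csp",
        lineterm="",
    )
)
print("\n".join(diff))

# Collect only the added lines, skipping the "+++" file header.
added = [line[1:] for line in diff if line.startswith("+") and not line.startswith("+++")]
removed = [line[1:] for line in diff if line.startswith("-") and not line.startswith("---")]
```

The same edit inside a paragraph of English would usually touch a whole sentence or more, which makes reviews noisier.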

Building It Into MiniMind AI

This experiment eventually became three products inside MiniMind AI.

1. Prompt Builder

The Prompt Builder converts raw requirements into professional prompts.

Example:

Input:

Build a competitor research agent
with web search and source citations.

Output:

Professional System Prompt
Professional Agent Prompt
Professional Workflow Prompt

Instead of manually writing structure, guardrails, output contracts, formatting requirements, and instructions, the tool generates them automatically.

Try it:

MiniMind AI Prompt Builder


2. Prompt Optimizer

The Prompt Optimizer takes an existing prompt and compresses it into structured formats such as:

  • CSP v1
  • JSON
  • YAML
  • XML
  • Markdown
  • Optimized English

The goal isn't simply token reduction.

The goal is reducing tokens while preserving behavior.

For example, one of my tests produced:

Original Estimated Tokens: 567
Optimized Estimated Tokens: 323

Estimated Tokens Saved: 244
Estimated Reduction: 43%

Compression Level: Balanced
Reliability Risk: Low

The optimizer also produces:

Removed or Merged Items

  • Duplicate instructions
  • Repeated formatting requirements
  • Conversational filler
  • Redundant role descriptions

Preserved Critical Rules

  • Guardrails
  • Output contracts
  • Tool permissions
  • Security constraints
  • Negative rules
  • Citation requirements

This turned out to be just as valuable as the compression itself because it provides visibility into what changed.
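
The simplest case of "removed or merged" is an instruction that appears twice. The sketch below is not how the Prompt Optimizer works internally; it is a hypothetical `merge_duplicate_rules` function that only catches repeats that match after lowercasing and collapsing whitespace, and it reports what was dropped so the change stays visible.

```python
def merge_duplicate_rules(rules: list[str]) -> tuple[list[str], list[str]]:
    """Split rules into (kept, removed), dropping repeats while preserving order.

    Two rules count as duplicates when they match after lowercasing and
    collapsing whitespace. Paraphrased duplicates are NOT detected.
    """
    seen: set[str] = set()
    kept: list[str] = []
    removed: list[str] = []
    for rule in rules:
        key = " ".join(rule.lower().split())
        if key in seen:
            removed.append(rule)
        else:
            seen.add(key)
            kept.append(rule)
    return kept, removed


kept, removed = merge_duplicate_rules(
    [
        "Always cite sources.",
        "Generate markdown output.",
        "always cite  sources.",
        "Do not hallucinate.",
    ]
)
print("Preserved:", kept)
print("Removed or merged:", removed)
```

Returning the removed items alongside the kept ones is what makes a report like the one above possible.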

Try it:

MiniMind AI Prompt Optimizer


3. Prompt Architect Suite Workflow

Once Prompt Builder and Prompt Optimizer existed, the next logical step was combining them into a complete workflow.

Instead of generating prompts manually, the workflow guides the user through the entire prompt-engineering process.

Raw Requirement
        ↓
Prompt Builder
        ↓
Prompt Analysis
        ↓
Optimization Settings
        ↓
Prompt Optimizer
        ↓
Compression Report
        ↓
Final Prompt Package
        ↓
PDF Export

Try it:

[MiniMind AI Prompt Architect Suite](https://www.minimindai.com/workflows/prompt-architect-suite)

What the Workflow Produces

The workflow doesn't just generate a prompt.

It generates a complete handoff package.

Executive Handoff Summary

A concise overview explaining:

  • What was generated
  • What was optimized
  • What was preserved
  • What changed

Original Prompt

The professional prompt generated from the user's raw requirement.


Optimized Prompt

The compressed CSP v1 version.

Example:

CSP:v1

ROLE=CompetitorResearch

GOAL=Analyze competitors

RULES=[
 cite_sources,
 separate_facts_assumptions
]

OUT=markdown

Compression Report

Example:

Original Tokens: 567
Optimized Tokens: 323

Reduction: 43%

Reliability Risk: Low

Preservation Analysis

The workflow automatically identifies:

Preserved

  • Security requirements
  • Guardrails
  • Output formats
  • Approval requirements
  • Citations
  • Constraints

Removed or Merged

  • Duplicate wording
  • Repeated instructions
  • Redundant examples
  • Conversational filler

Exportable Deliverables

The final package can be exported as:

  • PDF
  • JSON
  • Markdown

allowing prompts to be:

  • Versioned
  • Reviewed
  • Shared
  • Reused
  • Audited

across teams and projects.


A Note About Quality Evaluation

One important note.

The quality comparisons in this experiment were informal.

I compared outputs manually and evaluated whether optimized prompts appeared to preserve the original behavior, structure, and constraints.

I did not perform rigorous benchmark-based evaluation across large datasets or standardized task suites.

What surprised me wasn't that compression worked.

What surprised me was how much compression was possible before noticeable quality degradation appeared.

In multiple tests, prompts reduced by 40–70% still produced outputs that were remarkably similar to their original versions.

That's not proof that a prompt DSL is universally better.

But it was enough to convince me that structured instruction languages such as CSP v1 are worth exploring further.

Why This Matters for Agents

A chatbot may use a single system prompt.

An agent system may use:

  • Planner Prompt
  • Research Prompt
  • Reviewer Prompt
  • Writer Prompt
  • Evaluator Prompt

If each prompt contains 1000 tokens:

5 agents × 1000 tokens
=
5000 prompt tokens

Reduce those prompts by 60% and the same system sends 2,000 prompt tokens instead of 5,000.

That cost is paid on every run, so the savings add up quickly, especially for large-scale agent systems and workflow platforms.
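
A quick sketch of that arithmetic in Python. It assumes each of the five prompts is sent exactly once per workflow run, which is a simplification; the constant names and the `tokens_saved` helper are illustrative only.

```python
AGENT_COUNT = 5            # planner, research, reviewer, writer, evaluator
TOKENS_PER_PROMPT = 1000   # illustrative prompt size from the example
REDUCTION_PERCENT = 60     # the reduction considered here

baseline = AGENT_COUNT * TOKENS_PER_PROMPT
# Integer arithmetic avoids floating-point rounding surprises.
optimized = baseline * (100 - REDUCTION_PERCENT) // 100
saved = baseline - optimized

print(f"Baseline:  {baseline} prompt tokens per run")
print(f"Optimized: {optimized} prompt tokens per run")
print(f"Saved:     {saved} prompt tokens per run")


def tokens_saved(runs: int) -> int:
    """Total prompt tokens saved over a number of workflow runs."""
    return saved * runs


print(f"Over 10,000 runs: {tokens_saved(10_000)} prompt tokens saved")
```

Agents that call the model several times per run would multiply the saving further, since the system prompt is resent on each call.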


CSP v1 (Compressed System Prompt)

My current favorite format is:

CSP v1

Example:

CSP:v1

ROLE=CompetitorResearch

GOAL=Analyze competitors

RULES=[
 cite_sources,
 separate_facts_assumptions
]

OUT=markdown

Simple.

Readable.

Compressible.

Machine-friendly.
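
To make "machine-friendly" concrete, here is a small parser sketch. CSP v1 has no published grammar in this post, so the rules below are my assumptions based on the example: a `CSP:<version>` header, `KEY=value` scalars, and multi-line `KEY=[ ... ]` lists. Nested blocks such as `OUT={...}` are deliberately not handled, and `parse_csp` is a hypothetical name.

```python
def parse_csp(text: str) -> dict:
    """Parse a flat CSP-style prompt into a dict (assumed grammar, see lead-in)."""
    result: dict = {}
    list_key = None  # name of the list currently being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if list_key is not None:
            if line == "]":
                list_key = None  # end of the current list
            else:
                result[list_key].append(line.rstrip(","))
            continue
        if line.startswith("CSP:"):
            result["version"] = line[len("CSP:"):]
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"Unrecognized line: {line!r}")
        if value == "[":
            result[key] = []
            list_key = key
        else:
            result[key] = value
    if list_key is not None:
        raise ValueError(f"Unclosed list for {list_key}")
    return result


example = """\
CSP:v1

ROLE=CompetitorResearch

GOAL=Analyze competitors

RULES=[
 cite_sources,
 separate_facts_assumptions
]

OUT=markdown
"""

parsed = parse_csp(example)
print(parsed)
```

Once the prompt is a dictionary, exporting it to JSON or validating required keys is a few more lines of ordinary code.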


The Bigger Question

We moved from:

Machine Code
        ↓
Programming Languages

because humans shouldn't have to write machine instructions directly.

Maybe prompting follows a similar path.

Maybe future developers won't write giant English system prompts.

Maybe they will write:

ROLE=
GOAL=
RULES=
GUARDRAILS=
OUT=

and let compilers and optimization tools handle the rest.

I'm not claiming CSP v1 is the future.

But after building Prompt Builder, Prompt Optimizer, and Prompt Architect Suite, one thing became clear:

Modern LLMs understand structured instruction languages far better than I expected.

And that opens some very interesting possibilities.


What Do You Think?

Would you rather maintain:

  • A 3000-token English system prompt

or

  • A structured DSL that can be versioned, compressed, generated, and optimized automatically?

I'm curious whether others have experimented with prompt DSLs or prompt compilers.
