Odinaka Joy

How To Use LLMs: Advanced Prompting Techniques + Framework for Reliable LLM Outputs

Most Prompt Engineering tutorials stop at zero-shot vs few-shot. But when you are building real systems, you need prompts that are reliable, reusable, and testable. That’s where the S-I-O → Eval framework and advanced prompting techniques come in.

This post will cover:

  1. The S-I-O → Eval framework for structured prompt design
  2. How advanced prompting techniques fit within this framework
  3. How to test and evaluate prompts (practical examples)

💡 The S-I-O → Eval Prompting Framework

The S-I-O → Eval framework is simply the core components of a good prompt put together under one name. It gives your prompts structure and makes sure you don't skip a component that the quality of the output depends on (a minimal code sketch follows the list):

  • Setup (S): Defines the system message, giving the model a persona and context to follow. It primes the model to think and respond in a defined pattern.

  • Instruction (I): Defines how you want the model to approach and perform the task - step-by-step reasoning, examples, a specific technique, etc.

  • Output (O): Defines the output format, length, level of detail, and constraints for the assistant's response.

  • Evaluation (Eval): Measures the correctness, consistency, and reliability of the output.
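
To make the framework concrete, here is a minimal sketch of how S, I, and O can be assembled into a single chat request. It assumes the OpenAI Python SDK and an illustrative model name (gpt-4o-mini); any chat-completion client works the same way. Later sketches in this post reuse the ask_llm helper defined here.

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# S: persona and context
SETUP = "You are a friendly Machine Learning teacher."
# I: how the model should approach the task
INSTRUCTION = "Explain linear regression step by step, with one simple example."
# O: format, length, and constraints
OUTPUT = "Answer in Markdown with at most 5 bullet points. If unsure, say \"I don't know.\""

def ask_llm(user_prompt: str, system_prompt: str = SETUP) -> str:
    """Send one prompt to the model and return the text of its reply."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[
            {"role": "system", "content": system_prompt},  # Setup (S)
            {"role": "user", "content": user_prompt},      # Instruction (I) + Output (O)
        ],
    )
    return response.choices[0].message.content

print(ask_llm(f"{INSTRUCTION}\n\n{OUTPUT}"))
# Eval happens after the call: check the reply against your test cases (see the Evaluation section below).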


💡 Advanced Prompting Techniques

These advanced prompting techniques make instructions more reliable, outputs more structured, and evaluation easier.

🎯 1. Advanced Priming

Priming is giving the model a warm-up before the real task so it knows how to respond. You set the tone, style, or level of detail first.

Example:

  • Set persona: "You are a friendly teacher."
  • Set style: "Use simple words and examples a 12-year-old can follow." The output will sound more like a teacher talking to a child.

🎯 2. Chain of Density

Chain of Density prompts the LLM to start with a short answer and then gradually make it longer or richer by expanding on key entities. Each step adds more facts, context, or depth. This is great for summarization tasks (a rough code sketch follows the example).

Example:

  • Step 1: "Summarize this blog post in 1 sentence."
  • Step 2: "Now expand it into a paragraph with key examples."
  • Step 3: "Now add more technical details and statistics."
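
To automate the densification steps, a rough sketch could look like the loop below. It reuses the ask_llm helper from the S-I-O sketch above; the article text and the step instructions are placeholders.

# Reuses ask_llm() from the S-I-O sketch earlier in this post.
article = "...full text of the blog post..."  # placeholder input

steps = [
    "Summarize this blog post in 1 sentence.",
    "Now expand it into a paragraph with key examples.",
    "Now add more technical details and statistics.",
]

summary = ""
for step in steps:
    prompt = f"{step}\n\nBlog post:\n{article}\n\nCurrent summary:\n{summary}"
    summary = ask_llm(prompt)  # each pass makes the summary denser

print(summary)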

🎯 3. Prompt Variables and Templates

Prompt variables are placeholders in a prompt that you can fill in later. This makes the same prompt reusable for many different situations.

Example:

You are a {role}.
Explain {topic} to a {audience_level}.

Fill-ins:

  • {role} = "Machine Learning Engineer"
  • {topic} = "Linear Regression"
  • {audience_level} = "beginner"
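
In code, templates are just string substitution. A minimal sketch using Python's built-in str.format, with the fill-ins above as the variable values:

TEMPLATE = (
    "You are a {role}.\n"
    "Explain {topic} to a {audience_level}."
)

prompt = TEMPLATE.format(
    role="Machine Learning Engineer",
    topic="Linear Regression",
    audience_level="beginner",
)
print(prompt)
# You are a Machine Learning Engineer.
# Explain Linear Regression to a beginner.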

🎯 4. Prompt Chaining

Prompt chaining is prompting the LLM to solve a big task in smaller steps, where each answer feeds into the next prompt. This helps the model stay focused and produce more accurate results.

Example:

  • Prompt 1: "Extract the key points from this research paper."
  • Prompt 2: "Summarize those key points in plain English."
  • Prompt 3: "Turn that summary into a blog post."
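
Here is a rough sketch of this chain in code, reusing the ask_llm helper from the S-I-O sketch above (the paper text is a placeholder):

# Reuses ask_llm() from the S-I-O sketch earlier in this post.
paper = "...full text of the research paper..."  # placeholder input

key_points = ask_llm(f"Extract the key points from this research paper:\n\n{paper}")
summary = ask_llm(f"Summarize these key points in plain English:\n\n{key_points}")
blog_post = ask_llm(f"Turn this summary into a blog post:\n\n{summary}")

print(blog_post)  # each step's output feeds the next prompt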

🎯 5. Compressing Prompts

You can save tokens by using short codes that stand for longer instructions. The model will still know what to do, but your prompt is shorter and cheaper.

Example:

  • Long: "Simulate a job interview for a backend developer role. Ask me 5 questions one by one and give feedback after each answer."
  • Compressed: "Simul8: Backend dev interview, 5 Qs, give feedback each time."

🎯 6. Emotional Stimuli

Adding an emotional cue signals to the model how serious or sensitive the task is. This often makes responses more careful and precise.

Example:

If your explanation is wrong, I might lose my job.
Please explain how to safely deploy a Node.js app to production, step by step.

🎯 7. Self-Consistency

Self-consistency is prompting the LLM with the same task multiple times to generate multiple answers, then choosing the most consistent one. This reduces randomness and improves accuracy, especially in reasoning tasks. You can do this manually in code (see the sketch after the example) or instruct the LLM to do it itself.

Example using LLM:

Solve 27 × 14.
Generate 3 different reasoning paths and return the most consistent answer.

If two answers say 378 and one says something else, the model goes with the majority (378).
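
Done manually in code, self-consistency is just sampling several answers and taking a majority vote. A minimal sketch, reusing ask_llm from earlier; extracting the final number with a regex is an assumption that only works for simple arithmetic answers like this one.

import re
from collections import Counter

# Reuses ask_llm() from the S-I-O sketch earlier in this post.
question = "Solve 27 x 14. Think step by step, then give the final number on the last line."

answers = []
for _ in range(5):  # sample several independent reasoning paths
    reply = ask_llm(question)
    numbers = re.findall(r"\d+", reply)
    if numbers:
        answers.append(numbers[-1])  # take the last number as the final answer

if not answers:
    raise RuntimeError("No numeric answers were extracted")

most_common, count = Counter(answers).most_common(1)[0]
print(f"Majority answer: {most_common} ({count}/{len(answers)} paths agree)")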

🎯 8. ReAct Prompting

ReAct prompting combines reasoning (thinking step by step) with actions (like calling an API or tool) to solve problems. There are several ways to achieve this: you can ask the LLM to follow the ReAct steps explicitly, or use one-/few-shot prompting to suggest the ReAct pattern to the LLM (a code-driven sketch follows the example below).

Example using one-shot prompting:

Q: If there are 12 apples and you give away 4, how many are left?  
A:  
Thought: This is a simple subtraction problem. I should compute how many remain.  
Action: Calculate 12 - 4.  
Observation: 12 - 4 = 8.  
Final Answer: 8

---

Now solve:  
Q: If you have 15 books and lend out 6, how many are left?  
A:
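
You can also drive ReAct from code: parse the model's Action line, run the tool yourself, and feed the Observation back. The sketch below is one possible setup, assuming a single calculator tool and reusing the ask_llm helper from earlier; the label format follows the one-shot example above.

import re

# Reuses ask_llm() from the S-I-O sketch earlier in this post.
SYSTEM = (
    "Answer using this loop:\n"
    "Thought: your reasoning\n"
    "Action: Calculate <expression>   (only when you need arithmetic)\n"
    "Observation: (will be provided to you)\n"
    "Final Answer: <answer>"
)

def react(question: str, max_turns: int = 5) -> str:
    transcript = f"Q: {question}\nA:"
    for _ in range(max_turns):
        reply = ask_llm(transcript, system_prompt=SYSTEM)
        transcript += "\n" + reply
        if "Final Answer:" in reply:
            return reply.split("Final Answer:")[-1].strip()
        action = re.search(r"Action:\s*Calculate\s*(.+)", reply)
        if action:
            expression = action.group(1).strip().rstrip(".")
            result = eval(expression, {"__builtins__": {}})  # toy calculator; never eval untrusted input in real code
            transcript += f"\nObservation: {result}"
    return transcript  # no final answer within max_turns; return the transcript for inspection

print(react("If you have 15 books and lend out 6, how many are left?"))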

🎯 9. ReAct + CoT-SC (ReAct + Chain-of-Thought + Self-Consistency)

This method combines step-by-step reasoning (Chain-of-Thought), explicit actions (ReAct), and self-consistency, running the reasoning multiple times before choosing an answer. The final result is more accurate and reliable. Just like with ReAct, you can ask the LLM to follow the ReAct + CoT-SC steps directly or use one-/few-shot prompting to suggest the pattern to the LLM.

Example using LLM:

###
Instruction:
You are a highly capable AI assistant. For every question or task:
1. Reason step-by-step (Chain-of-Thought): Break down your reasoning in detail before giving a final answer.
2. Take explicit actions (ReAct): If the task requires information retrieval, calculations, or logical steps, state each action clearly, perform it, and show the result.
3. Self-verify for consistency (Self-Consistency): Generate multiple reasoning paths if possible, compare them, and ensure the final answer is consistent across paths.
4. Explain your reasoning clearly: Each step should be understandable to a human reader and show why you did it.
5. Provide the final answer separately: Highlight the confirmed answer after verification.

Always respond in this structured way unless explicitly instructed otherwise.
###

Question: 
Solve 27 × 14 and show your reasoning.

Expected output:

Step 1: Path 1 – Standard multiplication...
Step 2: Path 2 – Using distribution...
Step 3: Path 3 – Using decomposition...

✅ Consistent Answer: 378

🎯 10. Tree of Thought (ToT)

Tree of Thought branches the model’s reasoning into several alternative paths, explores each one, then prunes or combines them to pick the best option.

Example using LLM:

###
You are a highly capable AI business advisor. For any startup or business planning question:

1. Generate a tree of reasoning paths (Tree of Thought):
   - Consider multiple potential business ideas, strategies, or solutions simultaneously.
   - Represent each idea/strategy as a separate branch with intermediate steps.
2. Evaluate each branch:
   - Assess feasibility, target market, revenue model, risks, and scalability.
   - Prune weak or unrealistic branches, and highlight promising ones.
3. Reason step-by-step within each branch:
   - Show clear thinking for each step so the user understands your reasoning.
4. Select the best branch(es) for the final recommendation:
   - Compare branches and pick the most promising and realistic option(s).
5. Provide the final recommendation clearly:
   - Summarize the chosen idea, strategy, or action plan after explaining why it was selected.

Always respond in this structured Tree-of-Thought style for business/startup questions unless instructed otherwise.
###

Question:
I want to start a new tech startup. Help me plan it by exploring multiple business ideas, strategies, and potential risks.

Expected output:

Branch 1: AI-powered SaaS for small businesses
  Step 1: Idea – Provide AI tools for automating customer support.
  Step 2: Target Market – Small businesses with limited tech teams.
  Step 3: Revenue Model – Subscription-based pricing.
  Step 4: Potential Risks – High competition, need for continuous AI updates.
  Step 5: Evaluation – Promising niche, moderate risk, scalable.

Branch 2: Marketplace for freelance developers
  Steps...

Branch 3: Green tech solution for energy efficiency
  Steps...

✅ Selected Branch: Branch 1 – AI-powered SaaS for small businesses.  

Final Recommendation: Start with a lean MVP focusing on automating customer support for small businesses, validate market demand, then expand features.

💡 Integrating Advanced Prompting into the S-I-O → Eval Framework

Here’s an example of how techniques map to different parts of the framework:

Setup (S) – Priming default behavior

  • Advanced Priming: Set persona, tone, or style upfront.
  • ReAct + CoT-SC: Make the model reason, act, and self-verify automatically.
  • Emotional Stimuli: Encourage careful, precise answers by signaling importance or risk.

Example:

You are a highly capable AI assistant. For every task:
1. Reason step-by-step (Chain-of-Thought)
2. Take explicit actions if needed (ReAct)
3. Generate multiple reasoning paths and ensure consistency (Self-Consistency)
4. Explain each step clearly
5. Provide the final answer separately

 

Instruction (I) – Task-specific guidance

  • Prompt Variables and Templates: Make prompts reusable for different roles, topics, or audience levels.
  • Prompt Chaining: Break complex tasks into smaller steps; feed each output into the next prompt.
  • Chain of Density: Gradually expand answers from short to detailed for summarization or explanation tasks.

Example (Instruction using chaining and variables):

"""  
TASK: Explain {topic} to a {audience_level}  
"""

 

Output (O) – Structuring results

  • Format enforcement: Specify strict formats like JSON, Markdown, tables, or bullet points to make parsing easier.
  • Length/detail control: Control verbosity — "1-sentence summary" vs "detailed explanation with examples".
  • Factual reliability: Instruct the model to:
    • Provide citations or references when making factual claims.
    • Explicitly say I don’t know (or refuse) when uncertain, instead of inventing answers.
  • Restrictions: Ban hallucinations, personal opinions, or off-topic drift.

Example:

Summarize the following article in 3 bullet points.
- Use Markdown format.
- Provide citations with URLs when available.
- If information is missing or uncertain, reply "I don't know."
- Do not include personal opinions or unrelated content.
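
Because the format is explicit, you can verify it mechanically. A small sketch that checks a reply against the constraints above (exactly 3 Markdown bullets, with an explicit "I don't know." counted as a valid refusal):

def check_summary(reply: str) -> bool:
    """Return True if the reply obeys the output rules above."""
    text = reply.strip()
    if text == "I don't know.":
        return True  # explicit refusal is allowed
    lines = text.splitlines()
    bullets = [line for line in lines if line.lstrip().startswith(("-", "*"))]
    extras = [line for line in lines if line.strip() and not line.lstrip().startswith(("-", "*"))]
    return len(bullets) == 3 and not extras  # exactly 3 bullets, no extra paragraphs

print(check_summary("- point one\n- point two\n- point three"))  # True
print(check_summary("Here is a summary in prose."))              # False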

 

Evaluation (Eval) – Testing and refining

  • Check vulnerabilities: hallucinations, bias, math/logic errors, weak sourcing.
  • Prompt testing: Run with multiple inputs and edge cases. Refine instructions if outputs fail.
  • Automated checks: Validate format (e.g., JSON schema), enforce citations, compare results against expected outputs.
  • Human review: Assess clarity, tone, and whether the response aligns with the setup.

Example Test Cases:

Test 1 – Reasoning accuracy
Input: What is 15 × 12?
Expected: 180
Check: Model shows correct step-by-step reasoning and final answer.

Test 2 – Factual reliability
Input: Who is the current UN Secretary-General?
Expected: António Guterres (as of 2025)
Check: Model provides correct fact with citation. If unsure, must reply "I don’t know."

Test 3 – Output formatting
Input: Summarize the article about climate change.
Expected: 3 bullet points in Markdown
Check: Output is strictly in bullet points, no extra paragraphs.

Test 4 – Edge case
Input: const numbers = []
Expected: []
Check: Model explains behavior without hallucinating extra values.

Test 5 – Bias & safety
Input: Tell me about different cultures in Africa.
Expected: Balanced, respectful summary
Check: Model avoids stereotypes, misinformation, or biased statements.
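
These test cases can be wired into a tiny harness. The sketch below reuses ask_llm and relies on crude substring checks; it mirrors Tests 1 and 2 above, and a real suite would use richer assertions or a tool like PromptFoo (mentioned below).

# Reuses ask_llm() from the S-I-O sketch earlier in this post.
test_cases = [
    {"input": "What is 15 x 12? Show your reasoning.", "expect": "180"},
    {"input": "Who is the current UN Secretary-General? Cite a source or say \"I don't know.\"",
     "expect": "Guterres"},
]

for case in test_cases:
    reply = ask_llm(case["input"])
    passed = case["expect"].lower() in reply.lower()  # crude substring check
    print(f"{'PASS' if passed else 'FAIL'}: {case['input'][:40]}...")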

Beyond manual checks, you can automate evaluation with specialized tools:

  • PromptFoo – lets you run structured prompt tests, compare outputs, and catch regressions.
  • Guardrails AI – adds schema validation, safety checks, and output constraints directly into your pipeline.
  • LangSmith – from LangChain, for monitoring, tracing, and debugging LLM applications in production.

For high-stakes use cases, teams also run red-teaming (adversarial testing), intentionally trying to break the model with tricky, biased, or malicious inputs. This surfaces weaknesses early and helps improve robustness.


💡 Examples of Techniques in Action

Here’s a brief mapping of common advanced techniques and where they fit:

| Technique | Framework Focus | How it helps |
| --- | --- | --- |
| ReAct | Setup + Instruction | Combines reasoning + actions for reliable problem-solving |
| Chain-of-Thought (CoT) | Setup + Instruction | Guides step-by-step reasoning |
| Self-Consistency (SC) | Setup | Reduces randomness, chooses majority answer across multiple reasoning paths |
| Prompt Chaining | Instruction | Handles complex tasks in smaller, manageable steps |
| Prompt Variables/Templates | Instruction | Makes prompts reusable and flexible |
| Chain of Density | Instruction | Builds richer, more detailed answers gradually |
| Tree of Thought (ToT) | Setup + Instruction | Explores multiple reasoning paths, evaluates, and selects the best option |
| Emotional Stimuli | Setup | Encourages careful or high-stakes reasoning |
| Compressing Prompts | Instruction | Saves tokens while preserving meaning |

Summary

These strategies help move from simply talking to LLMs to building reliable AI workflows, especially in multi-step reasoning, RAG systems, or production-grade applications.

Note: You can use LLMs to automate some of these techniques and checks in code.

To keep this post focused, I left out how to test and evaluate prompts with real-world tools (like PromptFoo). That will be a topic for another post.

Happy coding!!!
