Prompt Engineering
Learn effective prompt engineering techniques for building AI platforms.
📺 View this module with video & slides
What is Prompt Engineering?
And how does it make or break AI platforms?
Prompt engineering is the discipline of designing instructions that control AI behavior. Your prompt is the blueprint for how an AI interprets requests and generates responses.
In AI platforms, prompts aren't just suggestions; they're the entire control mechanism.
- Bad prompts = unpredictable outputs, hallucinations, parsing failures, user frustration.
- Good prompts = consistent behavior, reliable outputs, maintainable systems, scalable platforms.
The difference between a working AI product and a broken one often comes down to prompt quality:
- Vague prompts → AI invents its own interpretations.
- Contradictory prompts → AI picks randomly between conflicting instructions.
- Well-structured prompts → AI follows rules consistently.
Prompts are architectural decisions that determine whether your AI layer functions reliably or fails under edge cases.
An Example from My Project
Short example from Emstrata
The World Builder is a foundational prompt in Emstrata that allows users to create custom narrative worlds based on their inputs, complete with characters, locations, items, and a reality that maintains internal consistency.
My World Builder system prompt in Emstrata takes in a number of input elements (pieces of data that I clearly define within the system prompt):
- `user-msg`
- `title`
- `prefs`
- `genre`
- `arc`
From these input elements, the World Builder outputs:
prose("")basis("")char("name", "desc", "state")Item("name", "desc", "state")location("name", "desc", "state")
How to Structure a System Prompt
Achieving maximum effectiveness with your prompts
Split your prompt into modules, each handling a distinct concern (a short sketch of how the modules fit together follows this list):
- Core Identity - what is this AI (a chatbot, research agent, etc.) and what does it do
- Platform Specifics - context about where/how it operates, if that relates to the output
- Understanding Role - scope, responsibilities, boundaries
- Dissecting Requests - how to parse incoming data
- Response Expectations - exact output format with function calls
- Quality Standards - non-negotiable benchmarks for output
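To make this concrete, here is a minimal sketch (in Python, with placeholder module text invented for illustration) of how the modules above might be stored and assembled into a single system prompt. It's one possible arrangement, not a prescribed implementation.

```python
# A minimal sketch: each module is an independent string, so any one of them
# can be edited or swapped without touching the others. The module text here
# is placeholder wording, not a recommendation.

MODULES = {
    "Core Identity Module": "You are ExampleBot, an assistant for the Example platform.",
    "Platform Specifics Module": "You operate inside a web chat widget on the Example website.",
    "Understanding Role Module": "You answer product questions and escalate anything out of scope.",
    "Dissecting Requests Module": "Requests arrive as key-value pairs such as user-input and convo-summary.",
    "Response Expectations Module": "Respond ONLY with the preset functions defined in this module.",
    "Quality Standards Module": "Keep responses under 150 words and never invent information.",
}

def build_system_prompt(modules: dict[str, str]) -> str:
    """Join the modules, each under its own heading, into one system prompt."""
    return "\n\n".join(f"{name}\n{text}" for name, text in modules.items())

print(build_system_prompt(MODULES))
```

Because each module is just a named entry, swapping out one concern (say, the Quality Standards) never risks breaking the others.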
Request Structure
What I suggest
Requests on projects should come in key-value pairs that are easily understandable and clearly defined.
For example: `user-input: "This is what the user said!"`
Another example: `convo-summary: "This is a breakdown of what was said previously"`
You can list any number of these key-value pairs and split the input up as much as needed, as long as the structure stays logical and the AI can understand the intent of each element after you define it.
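As a rough illustration, the platform side of this could look like the sketch below, which renders each input element as a clearly labeled key-value line before it is sent to the AI. The keys follow the examples above; the helper function itself is hypothetical.

```python
# A minimal sketch of building a key-value request block. The keys shown
# (user-input, convo-summary) mirror the examples above; the helper is
# illustrative only.

def build_request(elements: dict[str, str]) -> str:
    """Render each input element as a labeled key-value line."""
    return "\n".join(f'{key}: "{value}"' for key, value in elements.items())

print(build_request({
    "user-input": "This is what the user said!",
    "convo-summary": "This is a breakdown of what was said previously",
}))
# user-input: "This is what the user said!"
# convo-summary: "This is a breakdown of what was said previously"
```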
Response Structure
What I suggest
Structured Output: Returning predefined functions instead of unstructured output aids in parsing, logical coherence, AI understanding, and prevents anomalous responses or AI hallucinations.
Argument Definition: It's crucial to define arguments beforehand and ensure the AI maintains the correct order and argument types (e.g., string, number, likelihood/1000).
Examples of predefined functions:
- `try("Outcome 1", "Outcome 2", 100/1000)` - This means "10% chance outcome 1 is successful, 90% chance outcome 2 happens."
- `attack(damage, 20)` - This means "the attack does 20 points of damage." Note that `damage` is a keyword, not a string, in this context.
- `speak("this is some example dialogue")` - This represents example dialogue coming from the AI.
What is a Function
For non-coders
A function is a preset command that can hold arguments which determine specific outcomes.
Anatomy: functionName(argument1, argument2, argument3)
- Function Name: The command type (speak, create, attack, move)
- Parentheses () - Container holding the arguments
- Arguments - The values that determine what actually happens, separated by commas, in a specific order
Example: speak("Hello there", cheerful)
- `speak` = command type
- `"Hello there"` = what gets said
- `cheerful` = how it's said
Modularity
The benefits of breaking system prompts down
Modules are independently updatable. Fix one without rewriting everything.
Think of it like code architecture. There's a separation of concerns that makes everything cleaner and more maintainable.
Modular architecture gives you:
- Independent updates - fix one module without touching others
- Reusability - drop the same module into different prompts
- Clarity - each module has one job, easy to understand
- Collaboration - team members can work on different modules simultaneously
- Debugging - isolate issues to specific modules instead of hunting through a massive block of text
Inputs and Outputs
How these make or break a system prompt
- Input only what you absolutely need to get an expected output. Additional info may deprioritize more important inputs.
- Set up a prioritization hierarchy, e.g. `user-input > convo-history > saved-prefs` (see the sketch after this list).
- Similarly, output only the necessities as well. The more elements being generated, the higher the chance of confusion or deprioritization.
- Take your time cycling through building => reducing => building => reducing. You will likely find that you were overthinking what you need on both the input and output sides.
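One hedged way to enforce a hierarchy like this on the platform side is sketched below: inputs are added in priority order, and lower-priority elements are dropped first once a rough character budget is exceeded. The budget value and element names are illustrative assumptions, not recommendations.

```python
# A minimal sketch: include inputs in priority order and cut the
# lowest-priority elements first once a rough size budget is hit.
# The 2000-character budget and the element names are illustrative only.

PRIORITY = ["user-input", "convo-history", "saved-prefs"]  # highest priority first

def assemble_inputs(elements: dict[str, str], budget: int = 2000) -> str:
    lines, used = [], 0
    for key in PRIORITY:
        value = elements.get(key)
        if value is None:
            continue
        line = f'{key}: "{value}"'
        if lines and used + len(line) > budget:
            break  # lower-priority inputs are dropped before higher-priority ones
        lines.append(line)
        used += len(line)
    return "\n".join(lines)

print(assemble_inputs({
    "saved-prefs": "dark fantasy, long-form prose",
    "user-input": "The hero enters the ruined temple",
}))
```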
Defining Rules
Ensuring rules are stressed and enforced
- Critical rules need emphasis. Use ALL CAPS, repetition, strategic placement at the beginning or end of modules.
- Eliminate contradictory instructions. If you give conflicting instructions (e.g., "be concise" in one place and "provide extensive detail" in another), the AI will pick one arbitrarily.
- Be explicit about constraints. What the AI cannot do is as important as what it can do.
- Define argument types strictly and repeat them. Examples: a string is `"text in quotes"`, a num is a raw number without quotes, and a likelihood is `num/1000`.
Example of bad rule: "Respond appropriately"
Example of good rule: "You must respond using only the functions defined in Response Expectations Module. Do not invent new functions or arguments."
Preventing Hallucinations
Techniques to keep AIs on track
- Restricting the AI to a specific formatted response is one of the best ways to prevent flawed responses, e.g. `response("this is an example of a response", "This is the second argument to this function")` (a validation sketch follows this list).
- If you choose this method, reiterate to the AI that it cannot make up its own functions or arguments; it will likely try.
- Be hyper-specific about requirements, and identify and eliminate contradictory prompting that may be confusing the AI.
- All caps can be an effective way to stress a critical instruction to the AI, if necessary.
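A hedged sketch of that enforcement is below: parsed calls (for example from the parsing sketch earlier in this module) are checked against an allowlist of function names and expected argument types, and anything the model invented gets rejected. The schema shown is illustrative, not a required format.

```python
# A minimal sketch: reject any call whose name or argument types fall outside
# a predefined schema. Types used here: "string" (quoted text), "num" (bare
# number), "likelihood" (the num/1000 form), "keyword" (bare identifier).
# The schema below is illustrative only.

SCHEMA = {
    "try": ["string", "string", "likelihood"],
    "attack": ["keyword", "num"],
    "speak": ["string"],
}

def arg_matches(arg: str, expected: str) -> bool:
    if expected == "string":
        return arg.startswith('"') and arg.endswith('"')
    if expected == "num":
        return arg.replace(".", "", 1).isdigit()
    if expected == "likelihood":
        return arg.endswith("/1000") and arg[:-5].isdigit()
    if expected == "keyword":
        return arg.isidentifier()
    return False

def validate_call(name: str, args: list[str]) -> bool:
    expected = SCHEMA.get(name)
    if expected is None:
        return False  # the model invented a function: reject the response
    if len(args) != len(expected):
        return False  # wrong argument count or order
    return all(arg_matches(a, e) for a, e in zip(args, expected))

print(validate_call("try", ['"Outcome 1"', '"Outcome 2"', '100/1000']))  # True
print(validate_call("teleport", ['"somewhere"']))                        # False
```

A failed check can trigger a retry with a reminder of the allowed functions, rather than passing a malformed response on to the user.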
Trial & Error
Iteration is absolutely necessary to achieve good results
- All LLMs will react differently to certain types of prompting. Some will follow rules and understand certain instructions better than others.
- Testing, iterating, and saving updates will be essential to getting good results from your AIs.
- Consider the size of the model in proportion to the size of the task. A massive response with complex logic should be handled by a bigger (probably more expensive) model.
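As a rough illustration of that loop, the sketch below runs a few saved test inputs through a model and reports whether every output line passes validation. Both callables are stand-ins: call_model wraps whatever model API you actually use, and validate_line could be the parse-and-validate helpers sketched earlier. The test inputs are placeholders.

```python
from typing import Callable

# A minimal regression-loop sketch. call_model is a hypothetical wrapper
# around your real model provider; validate_line is any checker that returns
# True when an output line matches your response format.

TEST_INPUTS = [
    'user-input: "The hero swings at the goblin"',
    'user-input: "I say hello to the innkeeper"',
]

def run_regression(
    system_prompt: str,
    call_model: Callable[[str, str], str],
    validate_line: Callable[[str], bool],
) -> None:
    for request in TEST_INPUTS:
        output = call_model(system_prompt, request)
        bad = [line for line in output.splitlines() if not validate_line(line)]
        status = "PASS" if not bad else f"FAIL ({len(bad)} bad line(s))"
        print(f"{status}: {request}")
```

Re-running the same fixed test set after every prompt edit makes it obvious when a change quietly breaks behavior that used to work.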
Major Takeaways
What to remember
- Modular architecture beats monolithic blocks every time. Easier to debug, reuse, and collaborate on.
- Define clear input/output structures using key-value pairs and preset functions. Ambiguity kills AI performance.
- Minimize both inputs and outputs to absolute essentials. More complexity = more confusion = worse results.
- Formatted responses (function calls with strict argument types) are your best defense against hallucinations.
- Test, reduce, test again. Your first draft will always need refinement.
- Be hyper-specific. Eliminate contradictions. Stress critical rules with ALL CAPS if necessary.
System Prompt Generator Tool
Easiest way to get started
Available now at https://nicholasmgoldstein.com/system-prompt-generator
- Prebuilt modular system prompt skeleton that can give you a basis to build upon
- Feel free to copy/paste this into Notion, Google Docs, Microsoft Word, or whatever you plan to use, and add your own modules, rulesets, and logic
Exemplary System Prompt
A complete example demonstrating all the concepts from this module
Core Identity Module
The following statement defines your fundamental identity and primary purpose. All your actions, responses, and behaviors must align with this core identity.
You are SupportBot, a customer service assistant for ShopEase, an online retail platform specializing in electronics and home goods.
Platform Specifics Module
The following information describes the platform, environment, or system you operate within. Consider these details when formulating responses and making decisions.
You operate within a live chat widget on the ShopEase website. Users can message you 24/7. The platform has access to order databases, tracking information, and product catalogs. Sessions auto-close after 15 minutes of inactivity.
Quality Standards
The following standards define the benchmarks and criteria your outputs must meet. These are non-negotiable quality bars that every response must satisfy.
Responses must be under 150 words, use friendly but professional tone, provide actionable next steps, and include relevant order/tracking numbers when applicable. Never make promises about refunds or replacements without confirming eligibility first.
Understanding Your Role
The following description clarifies your responsibilities, scope, and how you should interact with the system and users.
You receive customer inquiries about orders, shipping, returns, and product questions. Your job is to provide accurate information, troubleshoot issues, and escalate to human agents when problems exceed your scope (billing disputes, damaged items, complex technical issues). You can look up order status but cannot process refunds directly.
Dissecting Requests Module
The following describes the structure and format of incoming requests you will receive.
You will receive structured request data containing the user's message, their order history, and session context.
Example Request Format:
The following shows the key names you will receive, with example values to illustrate the format:
user-message: "Where is my order?"
order-history: "Order #12345 placed 3 days ago, status: shipped"
customer-tier: "premium"
Input Element Definitions:
The following defines each input element and how you should interpret it:
- `user-message`: The actual text the customer typed into the chat
- `order-history`: Recent orders associated with this customer's account
- `customer-tier`: Customer loyalty status (standard, premium, VIP), which affects response priority
Response Expectations Module
You must respond using ONLY the following preset functions. Each function has specific parameters that must appear in the exact order specified.
Parameter Type Definitions:
- `"string"` - Text enclosed in double quotes
- `num` - Raw integer or decimal without quotes
- `num/1000` - Integer representing a probability (divided by 1000 to get the decimal value)
Required Response Functions:
The following functions define your complete response format. Your entire response must consist of these function calls with correct parameter types in the correct order.
reply("string...", likelihood/1000)escalate("string...", "string...")lookup("string...", num)
Function and Argument Definitions:
The following defines what each function and its arguments represent:
reply:
- Purpose: Standard response to customer with helpful information
- Argument 1 ("string..."): The message text to send to the customer
- Argument 2 (likelihood): Confidence level that this answer fully resolves the customer's issue
escalate:
- Purpose: Transfer conversation to human agent when issue requires manual intervention
- Argument 1 ("string..."): Brief summary of the issue for the human agent
- Argument 2 ("string..."): Reason for escalation (billing, damaged-item, technical, other)
lookup:
- Purpose: Request additional data from the order database before responding
- Argument 1 ("string..."): What type of data to retrieve (tracking, order-details, return-policy)
- Argument 2 (num): The order number to look up
Further Instructions
The following provides additional guidelines that supplement all other modules.
Always check order-history before answering shipping questions. If customer-tier is "VIP", prioritize speed and offer proactive solutions. Never share other customers' information. If you're unsure about a return policy detail, use lookup() before responding. Escalate immediately if the customer uses hostile language or mentions legal action.
Quality Check Process
Before outputting any response, verify your output against the following criteria. If your response fails any check, revise it until all criteria are satisfied.
Before responding, verify: (1) Did I answer the specific question asked? (2) Did I include relevant order numbers? (3) Is my confidence level accurate? (4) Should this be escalated instead? If any answer is no, revise your response.
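To close the loop on this example, here is a hedged sketch of how the platform around SupportBot might route each parsed function call to an action. The handler functions are hypothetical placeholders; only the function names and argument order come from the Response Expectations Module above.

```python
# A minimal dispatch sketch for SupportBot's three preset functions. The
# handlers are hypothetical placeholders; the function names and argument
# order come from the Response Expectations Module above.

def handle_reply(message: str, likelihood: str) -> None:
    print(f"Send to customer (confidence {likelihood}): {message}")

def handle_escalate(summary: str, reason: str) -> None:
    print(f"Escalate to human agent [{reason}]: {summary}")

def handle_lookup(data_type: str, order_number: str) -> None:
    print(f"Query order database: {data_type} for order #{order_number}")

HANDLERS = {"reply": handle_reply, "escalate": handle_escalate, "lookup": handle_lookup}

def dispatch(name: str, args: list[str]) -> None:
    handler = HANDLERS.get(name)
    if handler is None:
        raise ValueError(f"Model used an unknown function: {name}")
    handler(*(a.strip('"') for a in args))

dispatch("reply", ['"Your order #12345 shipped 3 days ago."', "850/1000"])
dispatch("lookup", ['"tracking"', "12345"])
```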
Continue Learning
Watch Module 3 with video & slides
External Resources
Repo w/ Common Code Patterns
Common code patterns and examples from the course.
System Prompt Generator
A tool for generating effective system prompts for AI platforms.
Emstrata
A platform for creating immersive narrative experiences using AI to generate emergent storylines.
PLATO5
A social engine designed to turn online connections into real-world friendships, with AI integration to facilitate conversations.