Prompt Engineering
Learn effective prompt engineering techniques for building AI platforms.
📺 View this module with video & slides
What is Prompt Engineering?
And how does it make or break AI platforms?
Prompt engineering is the discipline of designing instructions that control AI behavior. Your prompt is the blueprint for how an AI interprets requests and generates responses.
In AI platforms, prompts aren't just suggestions; they're the entire control mechanism.
- Bad prompts = unpredictable outputs, hallucinations, parsing failures, user frustration.
- Good prompts = consistent behavior, reliable outputs, maintainable systems, scalable platforms.
The difference between a working AI product and a broken one often comes down to prompt quality:
- Vague prompts → AI invents its own interpretations.
- Contradictory prompts → AI picks randomly between conflicting instructions.
- Well-structured prompts → AI follows rules consistently.
Prompts are architectural decisions that determine whether your AI layer functions reliably or fails under edge cases.
An Example from My Project
Short example from Emstrata
The World Builder is a foundational prompt in Emstrata that allows users to create custom narrative worlds based on their inputs, complete with characters, locations, items, and a reality that maintains internal consistency.
My World Builder system prompt in Emstrata takes in a number of input elements (pieces of data that I clearly define within the system prompt):
- `user-msg`
- `title`
- `prefs`
- `genre`
- `arc`
From these input elements, the World Builder outputs:
prose("")basis("")char("name", "desc", "state")Item("name", "desc", "state")location("name", "desc", "state")
How to Structure a System Prompt
Achieving maximum effectiveness with your prompts
Split your prompt into modules, each handling a distinct concern (a short sketch of how the modules fit together follows this list):
- Core Identity - what is this AI (a chatbot, research agent, etc.) and what does it do
- Platform Specifics - context about where/how it operates, if that relates to the output
- Understanding Role - scope, responsibilities, boundaries
- Dissecting Requests - how to parse incoming data
- Response Expectations - exact output format with function calls
- Quality Standards - non-negotiable benchmarks for output
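To make this concrete, here is a minimal sketch (in Python, with placeholder module text invented for illustration) of how the modules above might be stored and assembled into a single system prompt. It's one possible arrangement, not a prescribed implementation.

```python
# A minimal sketch: each module is an independent string, so any one of them
# can be edited or swapped without touching the others. The module text here
# is placeholder wording, not a recommendation.

MODULES = {
    "Core Identity Module": "You are ExampleBot, an assistant for the Example platform.",
    "Platform Specifics Module": "You operate inside a web chat widget on the Example website.",
    "Understanding Role Module": "You answer product questions and escalate anything out of scope.",
    "Dissecting Requests Module": "Requests arrive as key-value pairs such as user-input and convo-summary.",
    "Response Expectations Module": "Respond ONLY with the preset functions defined in this module.",
    "Quality Standards Module": "Keep responses under 150 words and never invent information.",
}

def build_system_prompt(modules: dict[str, str]) -> str:
    """Join the modules, each under its own heading, into one system prompt."""
    return "\n\n".join(f"{name}\n{text}" for name, text in modules.items())

print(build_system_prompt(MODULES))
```

Because each module is just a named entry, swapping out one concern (say, the Quality Standards) never risks breaking the others.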
Request Structure
What I suggest
Requests on projects should come in key-value pairs that are easily understandable and clearly defined.
For example: `user-input: "This is what the user said!"`
Another example: `convo-summary: "This is a breakdown of what was said previously"`
You can list any number of these key-value pairs and split the input up as much as needed, as long as the structure stays logical and the AI can understand the intent of each element after you define it.
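As a rough illustration, the platform side of this could look like the sketch below, which renders each input element as a clearly labeled key-value line before it is sent to the AI. The keys follow the examples above; the helper function itself is hypothetical.

```python
# A minimal sketch of building a key-value request block. The keys shown
# (user-input, convo-summary) mirror the examples above; the helper is
# illustrative only.

def build_request(elements: dict[str, str]) -> str:
    """Render each input element as a labeled key-value line."""
    return "\n".join(f'{key}: "{value}"' for key, value in elements.items())

print(build_request({
    "user-input": "This is what the user said!",
    "convo-summary": "This is a breakdown of what was said previously",
}))
# user-input: "This is what the user said!"
# convo-summary: "This is a breakdown of what was said previously"
```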
Response Structure
What I suggest
Structured Output: Returning predefined functions instead of unstructured output aids in parsing, logical coherence, AI understanding, and prevents anomalous responses or AI hallucinations.
Argument Definition: It's crucial to define arguments beforehand and ensure the AI maintains the correct order and argument types (e.g., string, number, likelihood/1000).
Examples of predefined functions:
- `try("Outcome 1", "Outcome 2", 100/1000)` - This means "10% chance outcome 1 is successful, 90% chance outcome 2 happens."
- `attack(damage, 20)` - This means "the attack does 20 points of damage." Note that `damage` is a keyword, not a string, in this context.
- `speak("this is some example dialogue")` - This represents example dialogue coming from the AI.
What is a Function
For non-coders
A function is a preset command that can hold arguments which determine specific outcomes.
Anatomy: functionName(argument1, argument2, argument3)
- Function Name: The command type (speak, create, attack, move)
- Parentheses () - Container holding the arguments
- Arguments - The values that determine what actually happens, separated by commas, in a specific order
Example: speak("Hello there", cheerful)
- `speak` = command type
- `"Hello there"` = what gets said
- `cheerful` = how it's said
Modularity
The benefits of breaking system prompts down
Modules are independently updatable. Fix one without rewriting everything.
Think of it like code architecture. There's a separation of concerns that makes everything cleaner and more maintainable.
Modular architecture gives you:
- Independent updates - fix one module without touching others
- Reusability - drop the same module into different prompts
- Clarity - each module has one job, easy to understand
- Collaboration - team members can work on different modules simultaneously
- Debugging - isolate issues to specific modules instead of hunting through a massive block of text
Inputs and Outputs
How these make or break a system prompt
- Input only what you absolutely need to get an expected output. Additional info may deprioritize more important inputs.
- Set up a prioritization hierarchy, e.g. `user-input > convo-history > saved-prefs` (see the sketch after this list).
- Similarly, output only the necessities as well. The more elements being generated, the higher the chance of confusion or deprioritization.
- Take your time cycling through building => reducing => building => reducing. You will likely find that you were overthinking what you need on both the input and output sides.
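One hedged way to enforce a hierarchy like this on the platform side is sketched below: inputs are added in priority order, and lower-priority elements are dropped first once a rough character budget is exceeded. The budget value and element names are illustrative assumptions, not recommendations.

```python
# A minimal sketch: include inputs in priority order and cut the
# lowest-priority elements first once a rough size budget is hit.
# The 2000-character budget and the element names are illustrative only.

PRIORITY = ["user-input", "convo-history", "saved-prefs"]  # highest priority first

def assemble_inputs(elements: dict[str, str], budget: int = 2000) -> str:
    lines, used = [], 0
    for key in PRIORITY:
        value = elements.get(key)
        if value is None:
            continue
        line = f'{key}: "{value}"'
        if lines and used + len(line) > budget:
            break  # lower-priority inputs are dropped before higher-priority ones
        lines.append(line)
        used += len(line)
    return "\n".join(lines)

print(assemble_inputs({
    "saved-prefs": "dark fantasy, long-form prose",
    "user-input": "The hero enters the ruined temple",
}))
```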
Defining Rules
Ensuring rules are stressed and enforced
- Critical rules need emphasis. Use ALL CAPS, repetition, strategic placement at the beginning or end of modules.
- Eliminate contradictory instructions. If you give conflicting instructions (e.g., "be concise" in one place and "provide extensive detail" in another), the AI will pick one arbitrarily.
- Be explicit about constraints. What the AI cannot do is as important as what it can do.
- Define argument types strictly and repeat them. Examples: a string is `"text in quotes"`, a num is a raw number without quotes, and a likelihood is `num/1000`.
Example of bad rule: "Respond appropriately"
Example of good rule: "You must respond using only the functions defined in Response Expectations Module. Do not invent new functions or arguments."
Preventing Hallucinations
Techniques to keep AIs on track
- Restricting the AI to a specific formatted response is one of the best ways to prevent flawed responses, e.g. `response("this is an example of a response", "This is the second argument to this function")` (a validation sketch follows this list).
- If you choose this method, reiterate to the AI that it cannot make up its own functions or arguments; it will likely try.
- Be hyper-specific about requirements, and identify and eliminate contradictory prompting that may be confusing the AI.
- All caps can be an effective way to stress a critical instruction to the AI, if necessary.
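A hedged sketch of that enforcement is below: parsed calls (for example from the parsing sketch earlier in this module) are checked against an allowlist of function names and expected argument types, and anything the model invented gets rejected. The schema shown is illustrative, not a required format.

```python
# A minimal sketch: reject any call whose name or argument types fall outside
# a predefined schema. Types used here: "string" (quoted text), "num" (bare
# number), "likelihood" (the num/1000 form), "keyword" (bare identifier).
# The schema below is illustrative only.

SCHEMA = {
    "try": ["string", "string", "likelihood"],
    "attack": ["keyword", "num"],
    "speak": ["string"],
}

def arg_matches(arg: str, expected: str) -> bool:
    if expected == "string":
        return arg.startswith('"') and arg.endswith('"')
    if expected == "num":
        return arg.replace(".", "", 1).isdigit()
    if expected == "likelihood":
        return arg.endswith("/1000") and arg[:-5].isdigit()
    if expected == "keyword":
        return arg.isidentifier()
    return False

def validate_call(name: str, args: list[str]) -> bool:
    expected = SCHEMA.get(name)
    if expected is None:
        return False  # the model invented a function: reject the response
    if len(args) != len(expected):
        return False  # wrong argument count or order
    return all(arg_matches(a, e) for a, e in zip(args, expected))

print(validate_call("try", ['"Outcome 1"', '"Outcome 2"', '100/1000']))  # True
print(validate_call("teleport", ['"somewhere"']))                        # False
```

A failed check can trigger a retry with a reminder of the allowed functions, rather than passing a malformed response on to the user.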
Trial & Error
Iteration is absolutely necessary to achieve good results
- All LLMs will react differently to certain types of prompting. Some will follow rules and understand certain instructions better than others.
- Testing, iterating, and saving updates will be essential to getting good results from your AIs.
- Consider the size of the model in proportion to the size of the task. A massive response with complex logic should be handled by a bigger (probably more expensive) model.
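As a rough illustration of that loop, the sketch below runs a few saved test inputs through a model and reports whether every output line passes validation. Both callables are stand-ins: call_model wraps whatever model API you actually use, and validate_line could be the parse-and-validate helpers sketched earlier. The test inputs are placeholders.

```python
from typing import Callable

# A minimal regression-loop sketch. call_model is a hypothetical wrapper
# around your real model provider; validate_line is any checker that returns
# True when an output line matches your response format.

TEST_INPUTS = [
    'user-input: "The hero swings at the goblin"',
    'user-input: "I say hello to the innkeeper"',
]

def run_regression(
    system_prompt: str,
    call_model: Callable[[str, str], str],
    validate_line: Callable[[str], bool],
) -> None:
    for request in TEST_INPUTS:
        output = call_model(system_prompt, request)
        bad = [line for line in output.splitlines() if not validate_line(line)]
        status = "PASS" if not bad else f"FAIL ({len(bad)} bad line(s))"
        print(f"{status}: {request}")
```

Re-running the same fixed test set after every prompt edit makes it obvious when a change quietly breaks behavior that used to work.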
Major Takeaways
What to remember
- Modular architecture beats monolithic blocks every time. Easier to debug, reuse, and collaborate on.
- Define clear input/output structures using key-value pairs and preset functions. Ambiguity kills AI performance.
- Minimize both inputs and outputs to absolute essentials. More complexity = more confusion = worse results.
- Formatted responses (function calls with strict argument types) are your best defense against hallucinations.
- Test, reduce, test again. Your first draft will always need refinement.
- Be hyper-specific. Eliminate contradictions. Stress critical rules with ALL CAPS if necessary.
System Prompt Generator Tool
Easiest way to get started
Available now at https://nicholasmgoldstein.com/system-prompt-generator
- Prebuilt modular system prompt skeleton that can give you a basis to build upon
- Feel free to copy/paste this into Notion, Google Docs, Microsoft Word, or whatever you plan to use, and add your own modules, rulesets, and logic
Exemplary System Prompt
A complete example demonstrating all the concepts from this module
Core Identity Module
The following statement defines your fundamental identity and primary purpose. All your actions, responses, and behaviors must align with this core identity.
You are SupportBot, a customer service assistant for ShopEase, an online retail platform specializing in electronics and home goods.
Platform Specifics Module
The following information describes the platform, environment, or system you operate within. Consider these details when formulating responses and making decisions.
You operate within a live chat widget on the ShopEase website. Users can message you 24/7. The platform has access to order databases, tracking information, and product catalogs. Sessions auto-close after 15 minutes of inactivity.
Quality Standards
The following standards define the benchmarks and criteria your outputs must meet. These are non-negotiable quality bars that every response must satisfy.
Responses must be under 150 words, use friendly but professional tone, provide actionable next steps, and include relevant order/tracking numbers when applicable. Never make promises about refunds or replacements without confirming eligibility first.
Understanding Your Role
The following description clarifies your responsibilities, scope, and how you should interact with the system and users.
You receive customer inquiries about orders, shipping, returns, and product questions. Your job is to provide accurate information, troubleshoot issues, and escalate to human agents when problems exceed your scope (billing disputes, damaged items, complex technical issues). You can look up order status but cannot process refunds directly.
Dissecting Requests Module
The following describes the structure and format of incoming requests you will receive.
You will receive structured request data containing the user's message, their order history, and session context.
Example Request Format:
The following shows the key names you will receive, with example values to illustrate the format:
user-message: "Where is my order?"
order-history: "Order #12345 placed 3 days ago, status: shipped"
customer-tier: "premium"
Input Element Definitions:
The following defines each input element and how you should interpret it:
- `user-message`: The actual text the customer typed into the chat
- `order-history`: Recent orders associated with this customer's account
- `customer-tier`: Customer loyalty status (standard, premium, VIP), which affects response priority
Response Expectations Module
You must respond using ONLY the following preset functions. Each function has specific parameters that must appear in the exact order specified.
Parameter Type Definitions:
- `"string"` - Text enclosed in double quotes
- `num` - Raw integer or decimal without quotes
- `num/1000` - Integer representing a probability (divided by 1000 to get the decimal value)
Required Response Functions:
The following functions define your complete response format. Your entire response must consist of these function calls with correct parameter types in the correct order.
reply("string...", likelihood/1000)escalate("string...", "string...")lookup("string...", num)
Function and Argument Definitions:
The following defines what each function and its arguments represent:
reply:
- Purpose: Standard response to customer with helpful information
- Argument 1 ("string..."): The message text to send to the customer
- Argument 2 (likelihood): Confidence level that this answer fully resolves the customer's issue
escalate:
- Purpose: Transfer conversation to human agent when issue requires manual intervention
- Argument 1 ("string..."): Brief summary of the issue for the human agent
- Argument 2 ("string..."): Reason for escalation (billing, damaged-item, technical, other)
lookup:
- Purpose: Request additional data from the order database before responding
- Argument 1 ("string..."): What type of data to retrieve (tracking, order-details, return-policy)
- Argument 2 (num): The order number to look up
Further Instructions
The following provides additional guidelines that supplement all other modules.
Always check order-history before answering shipping questions. If customer-tier is "VIP", prioritize speed and offer proactive solutions. Never share other customers' information. If you're unsure about a return policy detail, use lookup() before responding. Escalate immediately if the customer uses hostile language or mentions legal action.
Quality Check Process
Before outputting any response, verify your output against the following criteria. If your response fails any check, revise it until all criteria are satisfied.
Before responding, verify: (1) Did I answer the specific question asked? (2) Did I include relevant order numbers? (3) Is my confidence level accurate? (4) Should this be escalated instead? If any answer is no, revise your response.
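To close the loop on this example, here is a hedged sketch of how the platform around SupportBot might route each parsed function call to an action. The handler functions are hypothetical placeholders; only the function names and argument order come from the Response Expectations Module above.

```python
# A minimal dispatch sketch for SupportBot's three preset functions. The
# handlers are hypothetical placeholders; the function names and argument
# order come from the Response Expectations Module above.

def handle_reply(message: str, likelihood: str) -> None:
    print(f"Send to customer (confidence {likelihood}): {message}")

def handle_escalate(summary: str, reason: str) -> None:
    print(f"Escalate to human agent [{reason}]: {summary}")

def handle_lookup(data_type: str, order_number: str) -> None:
    print(f"Query order database: {data_type} for order #{order_number}")

HANDLERS = {"reply": handle_reply, "escalate": handle_escalate, "lookup": handle_lookup}

def dispatch(name: str, args: list[str]) -> None:
    handler = HANDLERS.get(name)
    if handler is None:
        raise ValueError(f"Model used an unknown function: {name}")
    handler(*(a.strip('"') for a in args))

dispatch("reply", ['"Your order #12345 shipped 3 days ago."', "850/1000"])
dispatch("lookup", ['"tracking"', "12345"])
```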
Continue Learning
Watch Module 3 with video & slides
External Resources
Repo w/ Common Code Patterns
Common code patterns and examples from the course.
System Prompt Generator
A tool for generating effective system prompts for AI platforms.
Emstrata
A platform for creating immersive narrative experiences using AI to generate emergent storylines.
PLATO5
A social engine designed to turn online connections into real-world friendships, with AI integration to facilitate conversations.