Tinyfishie

Posted on May 19 • Originally published at tinyfish.ai

How to Write Goals That Get Consistent Results from AI Agents


When you write a prompt for a chatbot, the worst outcome is a bad answer. You try again.

When you write a goal for a web agent, the worst outcome is wrong data that passes downstream validation, an extraction that returns half the results on every run, or — in a transactional workflow — an action you didn't intend. A bad goal doesn't fail loudly. It fails quietly, returning a result that looks plausible until someone checks it against the source.

Goal writing for web agents has its own patterns. This guide covers the four elements every reliable goal needs and provides a reference table of goal patterns for common use cases.

Why Goal Writing Is Different from Prompt Engineering

Prompt engineering is about getting a model to respond well. Goal writing is about instructing an agent to act reliably.

The distinction matters in practice:

A prompt gets evaluated once. The model reads it, generates a response, done.
A goal gets interpreted at each decision point as the agent navigates. Every time the agent decides whether to click something, where to look next, or how to structure its output, it refers back to the goal.

This means a vague goal doesn't produce one bad response — it produces inconsistent decisions across every step of every run. Two agents running the same vague goal against the same page can take different paths and return different results.

Prompt engineering advice — be specific, give examples, define the format — applies here too. But goal writing adds two things that chatbot prompting doesn't need: scope (where on the page to operate) and fallback behavior (what to return when expected elements aren't there).

The Four Elements of a Reliable Goal

Every goal that produces consistent results has these four properties. Leaving out any one of them is a common source of extraction failures.

Element	What it defines	What happens without it
Scope	Which section/page to focus on	Agent wanders, extracts from wrong sections
Task	What specific action or extraction to perform	Agent guesses what counts as completion
Output format	The structure of the returned data	Each run returns data shaped differently
Fallback behavior	What to do when expected elements are missing	Silent failures, partial results, unexplained null fields

Weak goal:

Get the product information.

Strong goal:

In the main product listing grid (not featured or sponsored products),
extract the name, current price, and availability status for each product.
Return as JSON array: [{"name": string, "price": string, "availability": "in_stock" | "out_of_stock" | "unknown"}].
If a field is not visible for a product, use null — do not navigate to individual product pages.
Limit to 20 products.

The weak goal has none of the four elements. The strong goal has all of them.
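
One way to keep all four elements present is to assemble the goal from named parts and refuse to build it when one is empty. The sketch below is a minimal illustration of that idea; `GoalSpec` and `render` are hypothetical names introduced here, not part of the TinyFish SDK.

```python
from dataclasses import dataclass


@dataclass
class GoalSpec:
    """Hypothetical container for the four elements of a goal."""
    scope: str
    task: str
    output_format: str
    fallback: str

    def render(self) -> str:
        parts = {
            "scope": self.scope,
            "task": self.task,
            "output_format": self.output_format,
            "fallback": self.fallback,
        }
        # Refuse to build a goal that silently omits an element.
        missing = [name for name, text in parts.items() if not text.strip()]
        if missing:
            raise ValueError(f"goal is missing: {', '.join(missing)}")
        # Scope and task come first, format and fallback last.
        return " ".join(text.strip() for text in parts.values())


spec = GoalSpec(
    scope="In the main product listing grid (not featured or sponsored products),",
    task="extract the name, current price, and availability status for each product.",
    output_format=(
        'Return as JSON array: [{"name": string, "price": string, '
        '"availability": "in_stock" | "out_of_stock" | "unknown"}].'
    ),
    fallback=(
        "If a field is not visible for a product, use null. "
        "Do not navigate to individual product pages. Limit to 20 products."
    ),
)
goal = spec.render()
```

The rendered `goal` is a plain string, so it can be reviewed or logged before it is sent to an agent.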

Specifying Output Format

Unspecified output format is the single most common cause of inconsistent agent results. When you don't define the structure, the agent makes its own decisions — and those decisions can vary between runs.

From free-form to schema

from tinyfish import TinyFish

client = TinyFish()

# ❌ No output format — agent decides the structure
response = client.agent.run(
    url="https://marketplace.example.com/listings",
    goal="Extract the product listings."
)
# Result varies: sometimes a list, sometimes a dict, sometimes nested

# ✅ Explicit schema — consistent every time
response = client.agent.run(
    url="https://marketplace.example.com/listings",
    goal=(
        "Extract each product listing on this page. "
        'Return as JSON array: [{"id": string, "title": string, '
        '"price": number, "currency": "USD", "seller": string}]. '
        "If a field is not present, use null."
    )
)
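
An explicit schema in the goal is easier to trust when the result is also checked on your side. The sketch below validates a JSON string against the listing schema used above. It assumes you already have the agent's result as JSON text; how that text is read from `response` is not shown in this article, so the function takes a plain string. `LISTING_SCHEMA` and `schema_problems` are hypothetical names.

```python
import json

# Field name -> accepted Python types, mirroring the schema in the goal text.
LISTING_SCHEMA = {
    "id": (str,),
    "title": (str,),
    "price": (int, float),
    "currency": (str,),
    "seller": (str,),
}


def schema_problems(raw_json: str) -> list[str]:
    """Return human-readable problems; an empty list means the shape matches."""
    try:
        data = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, list):
        return [f"expected a JSON array, got {type(data).__name__}"]
    problems = []
    for i, item in enumerate(data):
        if not isinstance(item, dict):
            problems.append(f"item {i}: expected an object")
            continue
        for field, types in LISTING_SCHEMA.items():
            if field not in item:
                problems.append(f"item {i}: missing field '{field}'")
            elif item[field] is not None and (
                # bool is a subclass of int in Python, so reject it explicitly.
                isinstance(item[field], bool) or not isinstance(item[field], types)
            ):
                problems.append(f"item {i}: '{field}' has the wrong type")
    return problems
```

A null value passes the check because the goal allows it; a missing key does not, because the goal asked for every field to be present.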

Choose the right schema style for your use case

For structured extraction (most common): Specify field names, types, and what null means.

Return as JSON: {"title": string, "price": number, "in_stock": boolean}

For list extraction with many items: Define the array shape and element schema.

Return as JSON array, one object per result:
[{"rank": number, "url": string, "title": string, "snippet": string}]

For multi-value fields: Be explicit about how to handle them.

If multiple prices are shown (sale price + original price), return both:
{"sale_price": number, "original_price": number, "currency": string}
If only one price is shown, use {"price": number, "currency": string}

For boolean fields: Define what each state means.

"availability": true if "In Stock" or "Available" is shown, false if "Out of Stock"
or "Unavailable", null if no availability information is visible
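
The multi-value price rule above deliberately allows two record shapes, so downstream code needs to accept both. A minimal sketch of collapsing them into one shape follows; `Price` and `normalize_price` are hypothetical names, and the target field names (`current`, `original`) are an assumption chosen for illustration.

```python
from typing import Optional, TypedDict


class Price(TypedDict):
    current: Optional[float]
    original: Optional[float]
    currency: Optional[str]


def normalize_price(record: dict) -> Price:
    """Collapse the sale-price and single-price shapes into one shape."""
    if "sale_price" in record:
        # Two prices were shown: the sale price is what the buyer pays now.
        return {
            "current": record.get("sale_price"),
            "original": record.get("original_price"),
            "currency": record.get("currency"),
        }
    # Only one price was shown, or no price at all (everything stays None).
    return {
        "current": record.get("price"),
        "original": None,
        "currency": record.get("currency"),
    }
```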

Scoping Goals to Specific Page Sections

Broad goals produce broad results. A page that has a main listing grid, a featured products sidebar, a recently viewed section, and a sponsored row will produce overlapping, duplicate data if your goal doesn't specify which section to use.

Add location context

# ❌ Broad — agent extracts from all sections
"Extract all product prices on this page."

# ✅ Scoped — agent focuses on the right section
"In the main product listing grid (not the featured sidebar, sponsored section,
or recently viewed row), extract the price and name for each product."

Scope patterns by page type

E-commerce listing page:

In the main search results grid, extract... (not sponsored listings at the top)

Pricing page with comparison table:

In the pricing comparison table, extract each plan's name, monthly price,
and the features listed under 'Included'. Do not extract from the FAQ section.

Search results:

Extract the organic search results (not ads, not 'People also ask', not featured snippets).
For each result: title, URL, and snippet text.

Portal with sidebar navigation:

In the main content area (not the sidebar navigation), extract...

When in doubt, describe what NOT to include. Exclusions are often easier to specify than inclusions.
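
If you write many scoped goals, the "section plus exclusions" phrasing can be generated instead of retyped. The helper below is a small sketch of that; `scope_clause` is a hypothetical name, and the wording it produces simply mirrors the examples above.

```python
def scope_clause(section: str, exclude: list[str]) -> str:
    """Build 'In the <section> (not the a, b, or c),' from plain phrases."""
    if not exclude:
        return f"In the {section},"
    if len(exclude) == 1:
        listed = exclude[0]
    elif len(exclude) == 2:
        listed = f"{exclude[0]} or {exclude[1]}"
    else:
        listed = ", ".join(exclude[:-1]) + f", or {exclude[-1]}"
    return f"In the {section} (not the {listed}),"


scope = scope_clause(
    "main product listing grid",
    ["featured sidebar", "sponsored section", "recently viewed row"],
)
```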

Writing Fallback Instructions

Silent failures happen when the agent completes a run and returns a result even though the expected element wasn't present, and the goal gave no instruction for that case. The agent fills the gap with its best guess.

Explicit fallback instructions tell the agent exactly what to return when something is missing:

Pattern: missing field

Extract the product's SKU from the product details section.
If no SKU is visible on this page, return null for that field.
Do not navigate to other pages to find it.

Pattern: multiple values found

Extract the current price. If both a sale price and a regular price are shown,
return both: {"sale_price": number, "original_price": number}.
If only one price is shown, return {"price": number}.

Pattern: conditional content

Extract the inventory count if displayed. If inventory count is not shown
but the item is listed as available, return {"in_stock": true, "quantity": null}.
If the item shows "Out of Stock", return {"in_stock": false, "quantity": 0}.

Pattern: pagination

Extract listings from this page only. If the page shows fewer than 10 listings,
return what's available — do not navigate to the next page.

The rule: every state the page could be in should have an explicit instruction. If your goal covers the happy path but not the edge cases, you'll get inconsistent results whenever a page varies from the expected state.
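
To see whether fallback instructions are actually being followed, it helps to separate "the field came back null" from "the field is not in the record at all". The sketch below counts both per field across a batch of parsed records; `fallback_report` is a hypothetical helper, and the sample records are made up for illustration.

```python
from collections import Counter


def fallback_report(records: list[dict], fields: list[str]) -> dict[str, Counter]:
    """Count, per field, how often it was filled, explicitly null, or absent."""
    report = {field: Counter() for field in fields}
    for record in records:
        for field in fields:
            if field not in record:
                report[field]["absent"] += 1   # the requested schema was not followed
            elif record[field] is None:
                report[field]["null"] += 1     # the fallback instruction was applied
            else:
                report[field]["filled"] += 1
    return report


records = [
    {"name": "A", "sku": "X1"},
    {"name": "B", "sku": None},
    {"name": "C"},
]
report = fallback_report(records, ["name", "sku"])
```

A rising `null` count usually means the page changed or the scope is off; any `absent` count means the goal's output format or fallback wording needs tightening.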

Safety Instructions for Transactional Workflows

For goals that involve forms, account actions, or any workflow with write access, explicitly define what the agent should and should not do. Safety instructions are less about restricting the agent than about scoping the task to exactly the intended action.

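# Reuses `client` from the earlier example.
# How credentials are supplied to the run is not shown here.
response = client.agent.run(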
    url="https://portal.example.com/orders",
    goal=(
        "Log in using the provided credentials. "
        "Navigate to the Orders section and extract the 10 most recent orders. "
        "For each order: order ID, date, status, and total amount. "
        "Return as JSON array. "
        "IMPORTANT: Do not click any buttons that modify order status. "
        "Do not submit any forms. Do not proceed to any checkout or payment screens. "
        "If you reach a page that requires payment information, stop and return "
        '{"error": "reached_payment_page"}.'
    )
)

The safety instruction block at the end of a transactional goal is the standard pattern:

IMPORTANT: Do not [specific prohibited action]. Do not [second prohibited action].
If you encounter [edge case], stop and return {"error": "[description]"}.

Authenticated workflows operate on accounts you're authorized to access. Safety instructions define what the agent does within that access, which gives you precise control over each run.
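
The `{"error": ...}` sentinel is only useful if the calling code checks for it before treating the result as data. A minimal sketch of that check follows, assuming the result is available as JSON text; `AgentStopped` and `parse_orders` are hypothetical names, not SDK features.

```python
import json


class AgentStopped(Exception):
    """Raised when the agent returned the goal's error sentinel instead of data."""


def parse_orders(raw_json: str) -> list[dict]:
    data = json.loads(raw_json)
    # The goal tells the agent to return {"error": "..."} on a guarded page.
    if isinstance(data, dict) and "error" in data:
        raise AgentStopped(data["error"])
    if not isinstance(data, list):
        raise ValueError(f"expected a JSON array of orders, got {type(data).__name__}")
    return data
```

Raising instead of returning an empty list keeps a stopped run from looking like an account with no orders.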

Goal Patterns by Use Case

Copy-pasteable patterns for common scenarios. Replace the bracketed placeholders with your specifics.

Use case	Goal pattern	Key elements to customize
Product price extraction	`"In the [section] on this page, extract the [name/price/availability] for each product. Return as JSON array: [{\"name\": string, \"price\": number, \"currency\": string}]. If price is not shown, use null."`	Section name, field list, currency
Search result extraction	`"Extract the organic search results (not ads). For each: title, URL, snippet. Return as JSON array, max [N] results."`	N, any additional fields
Multi-step form submission	`"Fill the [form name] with: [field]: [value], [field]: [value]. Click [button]. IMPORTANT: Do not proceed past the confirmation screen. Return the confirmation message or order ID shown."`	Form fields, stop condition
Authenticated portal data	`"Navigate to [section] using the provided credentials. Extract [data]. Return as JSON. Do not submit any forms or modify any data."`	Section, data fields
Price comparison across pages	`"Extract the current [plan name] price from the pricing page. Return: {\"plan\": string, \"monthly_price\": number, \"annual_price\": number, \"currency\": string}. Use null for any price not shown."`	Plan name, price fields
Inventory monitoring	`"Check the availability status of [product identifier] on this page. Return: {\"available\": boolean, \"quantity\": number or null, \"status_text\": string}. Capture the exact text shown for availability."`	Product ID, status fields
Document/content extraction	`"Extract the [content type] from the [section] of this page. Do not extract navigation, footers, or sidebar content. Return as plain text preserving paragraph breaks."`	Content type, exclusions

Test these patterns on your actual target pages.

TinyFish gives you 500 free credits to run goal-based agents against any URL — no credit card required. Start with one of the patterns above and see what consistent, structured output looks like.

Frequently Asked Questions

How long should an agent goal be?

As long as it needs to be to cover all four elements: scope, task, output format, and fallback behavior. A one-line goal that doesn't specify output format will tend to produce inconsistent results. A five-sentence goal that covers all four elements is far more likely to produce consistent ones. Length isn't a signal of quality; completeness is. In practice, most reliable goals run 50–150 words.

Should I put the output format at the beginning or end of the goal?

End. Start with scope and task so the agent understands what it's doing before it encounters the format instruction. Ending with the format means the agent is already oriented when it reads the schema — it's not trying to simultaneously understand what it's extracting and how to structure it.

Can I use the same goal for multiple URLs?

Yes — as long as the page structure is consistent across those URLs. If the target field (price, availability, etc.) appears in a different location or under a different element type on some pages, your results will vary. For large URL lists with structural variance, write a classification step first and route to different goals by page type.
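
The classify-then-route step can be as simple as a lookup keyed by page type. The sketch below classifies by URL path, which is an assumption made for illustration (real sites may need a content-based check), and the goals, path patterns, and function names are all hypothetical.

```python
from typing import Optional
from urllib.parse import urlparse

# One goal per page type, each written for that page's structure.
GOALS = {
    "product": (
        "In the product details section, extract the name and current price. "
        'Return as JSON: {"name": string, "price": number}. '
        "If a field is not visible, use null."
    ),
    "listing": (
        "In the main search results grid (not sponsored listings), extract the "
        "name and price for each product. "
        'Return as JSON array: [{"name": string, "price": number}]. '
        "If a field is not visible, use null."
    ),
}


def classify(url: str) -> str:
    """Guess the page type from the URL path alone."""
    path = urlparse(url).path.lower()
    if "/product/" in path:
        return "product"
    if path.startswith("/search") or "/category/" in path:
        return "listing"
    return "unknown"


def goal_for(url: str) -> Optional[str]:
    """Return the goal for this URL's page type, or None if it is unrecognized."""
    return GOALS.get(classify(url))
```

URLs that come back as `None` are worth setting aside for review rather than running with a generic goal.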

What's the difference between a goal and a script (Playwright, Selenium)?

A script defines every step explicitly: click this, wait for that, read this selector. A goal defines the desired outcome and lets the agent figure out the steps. Scripts are brittle when page structure changes; goals adapt. Goals require more careful output specification because there's no explicit selector telling the agent exactly which element to read — the agent infers from the goal text.

How do I debug a goal that's producing inconsistent results?

Use streaming to watch the agent's steps during a run — this shows you where it diverges from the expected path. Then check: (1) Is the scope ambiguous? The agent might be looking in the wrong section. (2) Is the output format specified? Unspecified format produces structural variation. (3) Is there a fallback for the failing case? A missing fallback produces silent failures. See the full debugging guide for the complete diagnostic sequence.
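
Structural variation is easier to spot when each run's output is reduced to its shape. The sketch below does that for a list of JSON results collected from repeated runs of the same goal; `shape` and `shape_counts` are hypothetical helpers, and the sample runs are made up.

```python
import json
from collections import Counter


def shape(value) -> str:
    """Describe the structure of a parsed JSON value, ignoring the actual data."""
    if isinstance(value, dict):
        inner = ", ".join(f"{k}: {shape(v)}" for k, v in sorted(value.items()))
        return "{" + inner + "}"
    if isinstance(value, list):
        # Summarize a list by the distinct shapes of its elements.
        return "[" + " | ".join(sorted({shape(v) for v in value})) + "]"
    if value is None:
        return "null"
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    return "string"


def shape_counts(raw_runs: list[str]) -> Counter:
    """Count how many runs produced each distinct output shape."""
    return Counter(shape(json.loads(raw)) for raw in raw_runs)


runs = [
    '[{"title": "A", "price": 1}]',
    '[{"title": "B", "price": 2.5}]',
    '{"items": [{"title": "C"}]}',
]
counts = shape_counts(runs)
```

More than one key in `counts` means the runs disagree on structure, which points at the output format part of the goal before anything else.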
