<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Karinabluu</title>
    <description>The latest articles on DEV Community by Karinabluu (@kar_e5d92534fb3622eb).</description>
    <link>https://dev.to/kar_e5d92534fb3622eb</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3958787%2Fc93c6414-0468-427b-896d-c4fb0c46cbb8.jpg</url>
      <title>DEV Community: Karinabluu</title>
      <link>https://dev.to/kar_e5d92534fb3622eb</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/kar_e5d92534fb3622eb"/>
    <language>en</language>
    <item>
      <title>My Agent Never Said "I Don't Know"</title>
      <dc:creator>Karinabluu</dc:creator>
      <pubDate>Fri, 29 May 2026 16:10:32 +0000</pubDate>
      <link>https://dev.to/kar_e5d92534fb3622eb/my-agent-never-said-i-dont-know-1lk5</link>
      <guid>https://dev.to/kar_e5d92534fb3622eb/my-agent-never-said-i-dont-know-1lk5</guid>
      <description>&lt;p&gt;I'm a product manager. I write specs, run reviews, align stakeholders.&lt;/p&gt;

&lt;p&gt;Last year I got tired of handing things off and waiting. I picked up vibe coding, designed the knowledge base, wrote the Harness, and shipped a working game AI Agent — with help from the engineers I work with. My front-end and back-end colleagues gave me ideas, pointed out problems, and occasionally pulled me out of rabbit holes I'd dug myself into. But the product decisions, the Harness design, the debugging loop — those were mine to own.&lt;/p&gt;

&lt;p&gt;That process changed how I think about the PM role. In an AI-native stack, the gap between "I have a product idea" and "it's running in production" has collapsed. A PM who can design model behavior, engineer context, and debug Harness failures doesn't need to wait for engineering bandwidth — they just ship. And when they do need engineers, they can have a real conversation instead of throwing a spec over the wall.&lt;/p&gt;

&lt;p&gt;That's what I'm building toward. Some call it FDE — Field Development Engineer, or Founding-level engineer who sits at the product-engineering boundary. Someone who owns the full loop: requirements, model behavior, Harness design, deployment. Not a PM who codes a little. A builder who understands the product deeply enough to make the right engineering calls.&lt;/p&gt;

&lt;p&gt;This post is about four bugs I hit while building that Agent. Not theory. These actually happened.&lt;/p&gt;




&lt;h2&gt;
  
  
  Contents
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;#&lt;/th&gt;
&lt;th&gt;Bug&lt;/th&gt;
&lt;th&gt;Hallucination Type&lt;/th&gt;
&lt;th&gt;Fix&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Recommended a gun that doesn't exist&lt;/td&gt;
&lt;td&gt;Knowledge gap → generic fill&lt;/td&gt;
&lt;td&gt;Explicitly mark absent entities in KB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Fabricated a player ID that looked real&lt;/td&gt;
&lt;td&gt;Tool use parameter fabrication&lt;/td&gt;
&lt;td&gt;Lock parameter source to context injection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Got its own ID wrong&lt;/td&gt;
&lt;td&gt;Identity inference&lt;/td&gt;
&lt;td&gt;Hard-anchor self_identity in System Prompt&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Gave a jump code for an activity that didn't exist&lt;/td&gt;
&lt;td&gt;Executable parameter fill&lt;/td&gt;
&lt;td&gt;Whitelist-only for action parameters&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  1. Recommended a gun that doesn't exist
&lt;/h2&gt;

&lt;p&gt;During testing, the Agent recommended a weapon unprompted. Detailed stats — damage, fire rate, the works.&lt;/p&gt;

&lt;p&gt;I play this game. Something felt off immediately. Searched the knowledge base. Nothing.&lt;/p&gt;

&lt;p&gt;The KB had no entry for this gun. But the Agent didn't return "not found." It pulled from its general FPS knowledge and filled in a weapon that sounded plausible. If someone who didn't play the game had built this, the bug might never have been caught.&lt;/p&gt;

&lt;p&gt;Fixed two things: added an explicit "absent entities" section to the KB listing weapons common in similar games but not in this one; added a Prompt rule — if retrieval returns empty, say so, don't infer from general knowledge.&lt;/p&gt;

&lt;p&gt;Both fixes are required. Prompt-only: the model doesn't know where the boundary is. KB-only: the model sees "doesn't exist" but doesn't know what to do with that.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Fabricated a player ID that looked real
&lt;/h2&gt;

&lt;p&gt;The Agent sometimes returned a player's unique identifier during data queries. Correct format. Correct length. Correct character set.&lt;/p&gt;

&lt;p&gt;Queried the system. Empty. Queried again. Still empty. Assumed it was an API issue. Spent a while checking the interface before I realized: that ID didn't exist. The Agent made it up.&lt;/p&gt;

&lt;p&gt;This is worse than gibberish hallucinations. Gibberish is obvious. A structurally valid fake ID makes you suspect the system, the API, the network — your diagnostic direction is entirely wrong from the start.&lt;/p&gt;

&lt;p&gt;Root cause: the identifier wasn't being reliably injected into context. When the Agent needed it, it knew what the format should look like — so it generated one.&lt;/p&gt;

&lt;p&gt;Fixed two things: added a Prompt rule that this parameter can only come from context injection, never self-generated; restructured memory so the identifier appears near the top of context every time.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. Got its own ID wrong
&lt;/h2&gt;

&lt;p&gt;During testing I asked: "What's your ID?"&lt;/p&gt;

&lt;p&gt;The Agent gave one. Checked the backend. Didn't exist. Checked again. Not its ID.&lt;/p&gt;

&lt;p&gt;Realized: I had never explicitly told it "your ID is X." With no anchor, it inferred what an AI assistant's ID probably looks like — and generated something plausible.&lt;/p&gt;

&lt;p&gt;Hallucination doesn't only happen when querying external data. Ask "who are you" without an anchor and it will fabricate that too.&lt;/p&gt;

&lt;p&gt;Fix: added a dedicated identity block in the System Prompt with all identity fields hard-coded. That block is not overridable by other instructions.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. Gave a jump code for an activity that didn't exist
&lt;/h2&gt;

&lt;p&gt;A player asked about an activity. The activity didn't exist.&lt;/p&gt;

&lt;p&gt;The Agent didn't say "no such activity." It gave a jump code and told the player to go check it out. I caught it during testing.&lt;/p&gt;

&lt;p&gt;The first three bugs produced wrong text. This one produced a wrong parameter that triggers an action. If it had reached real players, they'd have hit an error page.&lt;/p&gt;

&lt;p&gt;Same root cause: the Agent knew activity navigation requires a code. When retrieval came up empty, it filled in a number that sounded right. The KB never told it "no result = activity doesn't exist = do not navigate."&lt;/p&gt;

&lt;p&gt;Fixed two things: built an activity code whitelist — only codes on the list can appear in responses; added a Prompt rule that if retrieval returns empty, say so and output no code.&lt;/p&gt;




&lt;h2&gt;
  
  
  Four bugs. One pattern.
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;No gun in KB → filled one in from general knowledge&lt;/li&gt;
&lt;li&gt;No player ID in context → generated one with the right format&lt;/li&gt;
&lt;li&gt;No self-ID provided → inferred one that sounded plausible&lt;/li&gt;
&lt;li&gt;No activity found → filled in a navigation code that seemed valid&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Same thing every time: &lt;strong&gt;gap detected, gap filled.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This isn't a capability failure. It's default behavior. The training objective rewards "give a useful answer" over "admit I don't know." The model won't express uncertainty unless you explicitly tell it: in this situation, say you don't know.&lt;/p&gt;




&lt;h2&gt;
  
  
  Four rules I wrote afterward
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Mark absences explicitly.&lt;/strong&gt; The KB shouldn't only document what exists. Document what doesn't. Empty space is not a constraint.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Lock parameter sources.&lt;/strong&gt; For any parameter requiring a precise value, specify it can only come from context injection. Inference is not allowed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hard-anchor identity.&lt;/strong&gt; All of the Agent's own identity fields go in a dedicated System Prompt block that cannot be overridden by other instructions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Whitelist executable parameters.&lt;/strong&gt; For any action that triggers navigation, writes, or other side effects — only values from an approved list are allowed. No list match, no execution.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I'm building next
&lt;/h2&gt;

&lt;p&gt;These four rules address one problem: the model filling gaps where it shouldn't. That's Harness layer one — defense.&lt;/p&gt;

&lt;p&gt;Layer two is what I want to build next: &lt;strong&gt;pre-execution validation — the Agent checks whether a parameter it's about to output is actually valid before outputting it.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The current approach is reactive: something breaks, add a rule. The goal is for the Agent to check the whitelist before emitting a jump code, confirm the source before outputting a player ID. A Prompt instruction to "please check" isn't enough. It needs a validation step built into the Agent Loop — think pre-commit hook.&lt;/p&gt;

&lt;p&gt;The other thing is scaling Harness from single-Agent to a layered model architecture. My current setup runs three layers: a lightweight model for simple Q&amp;amp;A, a fine-tuned domain model for game-specific judgment, a large model as fallback for complex cases. Hallucination behavior differs by layer — smaller models tend to fabricate at knowledge boundaries, larger models tend to over-infer. The same Harness rules don't port cleanly across layers. They need per-layer tuning.&lt;/p&gt;

&lt;p&gt;When those are done, there may be a second post.&lt;/p&gt;




&lt;p&gt;None of these four rules were in my original design. All of them came from something breaking in testing.&lt;/p&gt;

&lt;p&gt;A senior engineer probably knows all this on day one. But I think the "forced to learn it" path is worth writing down — it shows that if you actually ship something, a lot of engineering judgment develops on its own.&lt;/p&gt;

&lt;p&gt;That's what makes the FDE direction interesting to me.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Full write-up and GitHub: &lt;a href="https://github.com/littlerabbit668/My-agent-harness-lesson" rel="noopener noreferrer"&gt;littlerabbit668/My-agent-harness-lesson&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>llm</category>
      <category>agents</category>
      <category>promptengineering</category>
      <category>webdev</category>
    </item>
  </channel>
</rss>
