briancampbell8

Multi-Agent Forensics: Rescuing SAS Zombie Assault TD from the Flash Graveyard

The "Hello World" Era of AI is Over

This is not a story about asking a chatbot to write a sorting algorithm. This is a report from the front lines of a real-world debugging operation on a classic "abandonware" title: SAS Zombie Assault Tower Defense.

The game was originally a staple of the Flash era. Modernizing its standalone Windows version meant facing catastrophic bgfx interop crashes (AccessViolationException) and broken rendering pipelines. To solve it, I didn't just "use AI"; I built a coordinated forensic engineering team.

The Key Takeaway

Stop treating AI as a search engine. Start treating it as a unit. By assigning specific roles to different agents, we turned a "black screen" nightmare into a deterministic restoration mission.


1. The Team: Roles and Responsibilities

To move fast without breaking things, I established a strict hierarchy. Each agent had a specific "lane" to prevent hallucinations and overlapping edits.

  • 🧠 The Human Architect (BDC): Sets the mission and maintains the architectural intent. The "Commander" who provides the reality check.
  • 🔍 Copilot (Forensic Reasoner): Operates at the theoretical and ABI (Application Binary Interface) level. It creates the instructions, but never touches the code.
  • 🛠️ Windsurf (The Executor): Operates inside the repository with full access to the build system. It executes surgical code changes and generates forensic artifacts.

The Gauntlet: Finding the Right Team

This unit wasn't the first one I tried. I went through four different agents before landing on Windsurf.

  • The Talkers: Agents like DeepSeek failed the "Noise Test." They would ignore the rules of engagement, burying the code under 3 to 4 pages of "tech-speak white noise" and making unrequested "helpful" changes that derailed the architecture.
  • The Quitters: GitHub Copilot was a strong reasoner, but it had a "stamina" problem. It would provide brilliant insights right up until it hit its unpaid limits, effectively walking off the job mid-audit.

The Surgical Turning Point

What changed everything for the team was the implementation of a new doctrine. I instructed the agents to view our unit as the Development Version of a Surgical Team. Our approach was to be:

  1. Deterministic
  2. Precise
  3. Deliberate

Windsurf was the first agent to truly adapt to this. It became aware that it was part of something larger—shifting from a "tool" to a team member that understood the gravity and context of every incision we made in the code.


2. Solving the ABI Ghost (The Crash)

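The engine was throwing an AccessViolationException at bgfx_init. In managed/unmanaged interop, this usually means the C# "handshake" with the native C++ DLL is broken: the managed declaration of a struct or function no longer matches what the native side expects, for example in total size, field offsets, or packing.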

The Forensic Probe

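Instead of guessing, we used Windsurf to build a "Truth Probe": a tiny C program that prints the actual size of bgfx_init_t as compiled from the source headers. For the number to be meaningful, the probe has to be built for the same architecture and from the same headers as the DLL being loaded: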

#include <stdio.h>
#include <bgfx/c99/bgfx.h>

int main(void) {
    printf("TRUE_SIZE:bgfx_init_t:%zu\n", sizeof(bgfx_init_t));
    return 0;
}
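
A matching total size does not rule out a layout mismatch on its own, because two structs can share a size while individual fields sit at different offsets. The following is a minimal sketch of extending the probe with `offsetof`. It uses a stand-in struct so that it compiles without the bgfx headers: `demo_init_t`, its field list, `format_offset_line`, and `print_demo_layout` are all hypothetical names for illustration and are not the real `bgfx_init_t` definition.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in struct; NOT the real bgfx_init_t definition. */
typedef struct demo_init_t {
    int32_t  type;
    uint16_t vendorId;
    uint16_t deviceId;
    uint64_t capabilities;
} demo_init_t;

/* Writes "TRUE_OFFSET:<field>:<offset>" into out.
   Returns the number of characters written, or -1 if out is too small. */
int format_offset_line(char *out, size_t cap, const char *field, size_t offset) {
    int n = snprintf(out, cap, "TRUE_OFFSET:%s:%zu", field, offset);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}

/* Prints one offset line for the given struct type and field. */
#define PRINT_OFFSET(type_name, field)                              \
    do {                                                            \
        char line_[96];                                             \
        if (format_offset_line(line_, sizeof line_, #field,         \
                               offsetof(type_name, field)) > 0) {   \
            puts(line_);                                            \
        }                                                           \
    } while (0)

/* Prints the total size and every field offset of the stand-in struct. */
void print_demo_layout(void) {
    printf("TRUE_SIZE:demo_init_t:%zu\n", sizeof(demo_init_t));
    PRINT_OFFSET(demo_init_t, type);
    PRINT_OFFSET(demo_init_t, vendorId);
    PRINT_OFFSET(demo_init_t, deviceId);
    PRINT_OFFSET(demo_init_t, capabilities);
}
```

Calling `print_demo_layout()` from `main` yields one line per field. In the real probe, the stand-in would be replaced by the header's own struct, and the managed side could be checked field by field with `Marshal.OffsetOf<T>(fieldName)`.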

The Alignment

Copilot compared the C results against our C# code. We applied a surgical fix to ensure a byte-for-byte match:

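using System.Runtime.InteropServices; // StructLayout, LayoutKind

// Sequential layout keeps fields in declaration order, as the C struct does.
[StructLayout(LayoutKind.Sequential, Pack = 8)]
public struct bgfx_init_t {
    public bgfx_type_t type;
    public ushort vendorId;
    // ... Surgical alignment ensures the crash stops here.
}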
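
Packing is one common way the two sides of an interop boundary can disagree. The sketch below shows the same field list compiling to different sizes under different packing, which is the property the `Pack` setting controls on the C# side. The two structs and both functions are made-up names for illustration, not bgfx types. `#pragma pack` is accepted by MSVC, GCC, and Clang.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Default (natural) alignment: the compiler pads after `tag`
   so that `value` starts on an aligned address. */
typedef struct natural_pair_t {
    uint8_t  tag;
    uint64_t value;
} natural_pair_t;

/* Same fields with 1-byte packing: no padding is inserted. */
#pragma pack(push, 1)
typedef struct packed_pair_t {
    uint8_t  tag;
    uint64_t value;
} packed_pair_t;
#pragma pack(pop)

/* Returns how many padding bytes the natural layout adds over the packed one. */
size_t padding_bytes(void) {
    return sizeof(natural_pair_t) - sizeof(packed_pair_t);
}

/* Prints both layouts side by side for comparison. */
void print_packing_report(void) {
    printf("natural: size=%zu value@%zu\n",
           sizeof(natural_pair_t), offsetof(natural_pair_t, value));
    printf("packed:  size=%zu value@%zu\n",
           sizeof(packed_pair_t), offsetof(packed_pair_t, value));
}
```

If the native library was compiled with one layout and the managed declaration assumes the other, `value` is read from the wrong address even though both sides agree on the field names and types.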

3. From "Black Screen" to Measurable Truth

Once the crash was fixed, the screen was still black. We moved to Forensic Verification.

Step 1: Context Integrity

We injected logging to confirm that the StaticLayoutRenderer was drawing with the expected context type, and therefore into the correct buffer:

Logger.Info($"[Forensics] Draw called with context: {context.GetType().FullName}");

Step 2: Pixel Statistics

To prove the pixels existed before hitting the GPU, we added a "Statistician":

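// Requires: using System.Linq;
// Counts non-zero elements; if buffer is a byte[], this is a byte count rather than a pixel count.
int nonZeroCount = buffer.Count(b => b != 0);
Logger.Info($"[Forensics] Buffer Stats: {nonZeroCount} non-zero pixels.");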
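
The same statistic can be taken per pixel instead of per element. This is a sketch in C that assumes a 32-bit-per-pixel buffer. The post does not state the buffer's actual format, so the pixel width and both function names are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Counts pixels whose 32-bit value is non-zero (at least one channel set). */
size_t count_nonzero_pixels(const uint32_t *pixels, size_t pixel_count) {
    assert(pixels != NULL || pixel_count == 0);
    size_t n = 0;
    for (size_t i = 0; i < pixel_count; ++i) {
        if (pixels[i] != 0) {
            ++n;
        }
    }
    return n;
}

/* Counts non-zero bytes in a raw buffer, for comparison with a byte-level count. */
size_t count_nonzero_bytes(const uint8_t *bytes, size_t byte_count) {
    assert(bytes != NULL || byte_count == 0);
    size_t n = 0;
    for (size_t i = 0; i < byte_count; ++i) {
        if (bytes[i] != 0) {
            ++n;
        }
    }
    return n;
}
```

The two counts answer different questions: a single opaque white pixel contributes one to the pixel count and four to the byte count, so log lines should name the unit being counted.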

4. The Power of Consensus

There is an old principle that says when two or more agree on a matter, it is established. In software engineering, we often call this "consensus," but the root is the same: truth is found in the agreement of witnesses.

By separating the Reasoning (Copilot) from the Execution (Windsurf), I created a system of checks and balances. When the Architect’s intent, the Reasoner’s logic, and the Executor’s data all converged on the same byte-count, the "unsolvable" crash disappeared.
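
The byte-count agreement can also be checked mechanically rather than by eye. The sketch below parses a probe line in the `TRUE_SIZE:<name>:<bytes>` format shown earlier and compares it with the size reported by the managed side, which could be obtained with `Marshal.SizeOf<T>()` in C#. The function names are hypothetical, and the author's actual comparison step is not described in the post.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Parses a probe line of the form "TRUE_SIZE:<name>:<bytes>".
   Returns true and stores the size only if the name matches expected_name. */
bool parse_probe_line(const char *line, const char *expected_name, size_t *size_out) {
    char name[64];
    size_t size = 0;
    if (sscanf(line, "TRUE_SIZE:%63[^:]:%zu", name, &size) != 2) {
        return false; /* not a well-formed probe line */
    }
    if (strcmp(name, expected_name) != 0) {
        return false; /* probe line is for a different struct */
    }
    *size_out = size;
    return true;
}

/* True only when the native probe and the managed side report the same size. */
bool sizes_agree(const char *probe_line, const char *name, size_t managed_size) {
    size_t native_size = 0;
    return parse_probe_line(probe_line, name, &native_size)
        && native_size == managed_size;
}
```

A check like this can run in the build, so that a header update which changes the native struct fails loudly instead of surfacing later as an access violation.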

The Trainable Partner

Perhaps the most surprising discovery was that even non-human partners are trainable. By consistently enforcing the Surgical Doctrine, Windsurf didn't just stay consistent; it improved. It began to anticipate the need for precision and understood the "No White Noise" constraint without being reminded. We proved that with the right leadership, an AI agent can move from a simple tool to a disciplined, context-aware collaborator.


Disclaimer: This project is a technical exercise in forensic engineering and preservation. I do not own the rights to SAS Zombie Assault TD; all credit for the original game goes to Ninja Kiwi. This post focuses strictly on the methodology of modernizing legacy rendering pipelines.
