Intern Project at Mantis Software: Turning an SRS into a Gothic RPG with AI Coding Agents

#agents #ai #gamedev #softwareengineering

A few weeks ago I wrapped up a project called Karanlığın Sesi (Voice of Darkness), a Turkish-language gothic RPG that runs in the browser and uses Mistral AI to generate the story. This is a writeup of the workflow behind it: what prompt engineering actually means in practice, why we wrote a formal SRS document before touching any code, how we converted that into a build plan for an AI coding agent, and what the experience of building with OpenCode Desktop was like.

What Prompt Engineering Is, And What Problem It Actually Solves
Prompt engineering is how you communicate intent to a language model in a way that produces reliable output. That sounds abstract, but the practical version is simple: vague inputs produce vague results.
If you ask an AI to "make a game," you'll get something. It won't be wrong, exactly, but it probably won't match what you had in mind. The more precisely you define what you want, what format you expect, and what constraints apply, the more useful the output becomes.
This matters at two levels when building AI-powered software. First, when you're using AI to help write code, prompt quality determines code quality. Second, when your application itself calls an AI model, the system prompts you write for that model shape the entire user experience. For our game, the Mistral system prompt is what determines whether the gothic atmosphere holds up across 10 story turns or falls apart.
A few things that actually matter when writing prompts:

Specify the output format explicitly. If you want JSON with exactly three elements in a list, say so. Don't assume the model will guess.
Provide context upfront. Don't make the model infer what kind of project it's in. Tell it directly.
Think about failure modes. What should the model return when it can't answer? This matters especially when you're parsing structured output programmatically.
Don't assume memory between requests. Each API call is stateless. For a narrative game, this means sending the full story history every time, not just the latest action.

Why We Wrote An SRS Before Touching Code
Before writing a line of code, we wrote a Software Requirements Specification: a formal document that defines what the system does, what it doesn't do, who uses it, and what the success criteria are. We used IEEE 830-1998 as a reference.
The main reason to do this is that it forces you to answer questions you haven't thought about yet. What are the four character types? What does the death screen say? What happens if the Mistral API returns an error mid-game? If you skip this step, you end up answering these questions during implementation, which is slower and messier.
The SRS also let us define the Pydantic schemas (BaslangicCiktisi for character creation and DevamCiktisi for story continuation) before touching Flask. These schemas are the contract between the AI's output and the application's behavior. Getting them right early saved debugging time later.
Things we paid attention to when writing it:
Scope boundaries. We explicitly listed what's out of scope: no authentication, no multiplayer, no mobile app. Writing it down prevents scope creep.
Non-functional requirements. API response time (5-15 seconds), rate limit handling (a one-second delay between requests), page load under two seconds excluding the API call. These constraints drove specific implementation decisions.
Assumptions and known limitations. We noted that Python 3.14 had dependency conflicts and that we'd target 3.10-3.11. We noted that Mistral's free tier has rate limits that could affect response times. Writing assumptions down means fewer surprises mid-build.
The SRS isn't the most exciting document to write. But going into a coding session with one is a noticeably different experience from going in without one.

Turning The SRS Into A CLAUDE.md
Once the SRS was done, we converted it into a CLAUDE.md, a project context document that AI coding tools like Claude Code read at the start of a session. It tells the agent what the project is, how it's structured, and step by step what to build.
The key difference between an SRS and a CLAUDE.md is the audience. An SRS is written for humans (developers, testers, project managers). A CLAUDE.md is written for an agent that will execute instructions literally and sequentially.
So we reorganized the SRS into a sequential build plan:

Project skeleton and dependencies
Pydantic schemas and Mistral connection
Flask API endpoints
HTML structure
JavaScript game logic
CSS gothic theme
End-to-end testing
Final polish and deployment prep

Each step has a self-contained prompt, specific enough that the agent knows exactly what to build and scoped enough that it doesn't try to do everything at once. A few lessons from writing these:
Break it into steps, not tasks. "Build the whole backend" is too broad. "Add the /api/start endpoint that validates input, calls Mistral with a character-specific prompt, and returns a BaslangicCiktisi JSON" is something an agent can act on.
Specify what not to do. Several steps end with "don't add X yet." This keeps the agent from getting ahead and creating inconsistencies.
Include the error cases. We specified what each endpoint should return on a 400 (invalid character type) and 500 (Mistral failure). Without this, agents tend to skip error handling.
Encode your technical decisions in the document. We decided early that narrative history would be capped at 4,000 characters before sending to Mistral, to avoid token limit issues. That decision lives in the CLAUDE.md, not just in someone's head.

Building With OpenCode Desktop
With the CLAUDE.md ready, we used OpenCode Desktop to build the project. OpenCode reads the project context, understands the file structure, and executes prompts against the actual filesystem.
The workflow: write a step prompt, paste it into OpenCode, review the output, fix what needed fixing, move to the next step.
The agent handled boilerplate well. File structure, requirements.txt, Flask route definitions, all clean on the first pass. Where it needed more guidance was in the prompt engineering for the game itself: how to phrase the Mistral system prompt so the AI would produce valid JSON consistently, and how to handle the case where Mistral sometimes wraps its JSON in markdown code blocks.
That second issue is worth naming. Models sometimes return

`{ "karakterAdi": "..." }`

instead of raw JSON despite structured output. If you don't strip those backticks before parsing, your Pydantic model throws an error. We added a small utility function to handle this, and it went into the CLAUDE.md testing step so the agent would include it.
The CSS step was the most pleasant surprise. The agent produced a genuinely atmospheric design: deep blacks, blood-red hover states, a gothic serif font pairing. It committed to the aesthetic in a way I'd have spent longer on by hand.
The actual split of work looked like this:
MeAgentPydantic schema designBoilerplate Flask setupMistral system prompt per character typeHTML structureStory truncation strategyCSS themingRate limit handling logicError handling patternsScope of each build stepFile organization
That split felt right. Decisions that require understanding the product stayed with me. Implementation that follows from those decisions went to the agent.

About The Game
Karanlığın Sesi (Voice of Darkness) is a single-player gothic RPG that runs in the browser. You pick one of four characters (Vampire, Detective, Witch, or Ghost) and Mistral generates a personalized opening: your character's name, backstory, birth year, the opening scene, and three actions to choose from.
Each action you pick sends the full story history back to Mistral, which continues the narrative consistently with everything that happened before. The game ends when the story reaches an irreversible conclusion, and the death screen appears.
The whole thing runs on Flask, uses no frontend framework, and has no user accounts. Deliberately simple infrastructure so the actual experience can be interesting.
The CLAUDE.md build plan we used is available if you want to try something similar. The SRS document was written by Nazife Durmaz.

Tools used: Claude (SRS review, CLAUDE.md authoring), OpenCode Desktop (code generation), Mistral AI (story generation runtime), Flask, Pydantic.

You can find the source code at Mantis Intern's Github.