Doraking

Posted on May 23

NovelPilot: A Novel Writing Agent Powered by Gemma 4

#devchallenge #gemmachallenge #gemma


This is a submission for the Gemma 4 Challenge: Build with Gemma 4.

Most AI story generators work like this:

prompt in → wall of text out

That is useful, but it does not feel like a real writing process.

When people write fiction, they do more than generate paragraphs. They plan the premise, design characters, build the world, structure the plot, manage foreshadowing, write scenes, edit style, check continuity, and prepare the final piece for readers.

So I built NovelPilot.

NovelPilot is a Gemma 4-powered AI writing room that turns one prompt into a complete story creation pipeline.

One prompt goes in.

Nine agents start working.

A finished story comes out.

What I built

NovelPilot is a web app that helps users create short fiction through a structured multi-agent workflow.

The user starts with a simple prompt, such as:

Write a melancholic sci-fi mystery set in modern Tokyo. A graduate student who lost his memory investigates a disappearance in a quantum computing lab.

Then NovelPilot launches a sequence of specialized AI agents:

Premise Architect
Character Director
World Builder
Plot Strategist
Chapter Architect
Prose Writer
Style Editor
Continuity Detective
Publisher Agent

Each agent performs a specific part of the writing process.

The result is not just a generated story. It is a full creative package:

Story concept
Character profiles
Worldbuilding notes
Plot structure
Chapter outline
Chapter 1 draft
Style editor report
Foreshadowing tracker
Continuity detective report
Title ideas
Publication summary
Browser reading mode
Polished PDF export

NovelPilot is designed to demonstrate Gemma 4 as a multi-agent creative reasoning engine, not just a text completion model.

Demo

Live demo: https://novelpilot.vercel.app

How to try it:

Open the live demo.
Click Run Judge Demo.
Watch the nine-agent pipeline complete.
Read the finished novel in the browser.
Review the Foreshadowing Tracker and Continuity Detective.
Download the final story as a polished PDF.

The Judge Demo works without an API key, so reviewers can test the full experience immediately.

For live generation, NovelPilot supports Gemma 4 through a provider abstraction, with OpenRouter as the recommended provider.

Sample prompt and output

Here is the sample prompt I used to test NovelPilot.

The protagonist is Ren Kanzaki, a 24-year-old graduate student working in a quantum computing laboratory. A few days ago, he lost part of his memory. He cannot remember what he was researching, why his professor suddenly disappeared, or why his own name appears in an old experimental log.

The story begins on a rainy night in Tokyo. Ren enters the university research building after midnight and finds an old experiment log hidden inside a locked drawer. On the final page, he sees the sentence:

“Ren Kanzaki will be removed from the observation target as of today.”

The story should focus on quiet tension, memory gaps, emotional unease, and the unsettling atmosphere of the laboratory. Avoid flashy action. Let the mystery emerge through scenery, silence, dialogue, and small contradictions.

Main theme:
If memories disappear, can a person still remain the same self?

Main characters:
- Ren Kanzaki: A graduate student who lost part of his memory. Calm and intelligent, but emotionally repressed.
- Mio Shiraishi: Ren’s labmate. She knows something about Ren’s memory loss but refuses to tell him the truth.
- Professor Kuon: The missing professor. He was researching quantum memory transfer.
- Associate Professor Kurosaki: The person currently managing the laboratory. He seems helpful, but some of his statements contradict the records.

Tone:
Intellectual, quiet, melancholic, slightly literary, and mysterious.

Language: en
Genre: sci-fi
Tone: melancholic
Target Length: short-story

I also exported the generated story as a polished PDF.

Sample output PDF: Download the generated novel PDF

This PDF was generated directly from NovelPilot’s finished reader view.

Code

GitHub repo: https://github.com/dorakingx/novelpilot

Tech stack:

Next.js App Router
TypeScript
Tailwind CSS
shadcn/ui-style components
Gemma 4 provider abstraction
OpenRouter-compatible live mode
Mock mode for the zero-setup judge demo
Browser-based polished PDF export
Vercel deployment

The app has two main modes:

Mode	Purpose
Demo / Mock Mode	Lets judges try the full workflow without an API key
Live Mode	Uses Gemma 4 through the configured provider

The provider layer is intentionally isolated in lib/gemma.ts, so the model provider can be changed without rewriting the app.
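To illustrate the idea, here is a minimal sketch of what such a provider layer could look like. It is not the actual contents of lib/gemma.ts: the names `GemmaProvider`, `AgentRequest`, `createMockProvider`, `createOpenRouterProvider`, and `selectProvider` are hypothetical, and the model ID is left as a caller-supplied placeholder. The live branch assumes OpenRouter's OpenAI-compatible chat completions endpoint.

```typescript
// Hypothetical provider layer sketch; names and shapes are assumptions.
interface AgentRequest {
  agent: string; // assumed agent ID, e.g. "premise"
  system: string; // system prompt for this agent
  user: string; // user prompt plus accumulated context
}

interface GemmaProvider {
  name: "mock" | "openrouter";
  complete(request: AgentRequest): Promise<string>;
}

// Mock mode: returns canned text per agent, so no API key is needed.
function createMockProvider(canned: Record<string, string>): GemmaProvider {
  return {
    name: "mock",
    async complete(request) {
      return canned[request.agent] ?? "{}";
    },
  };
}

// Live mode: one chat completion call per agent request.
// `model` is a placeholder; look up the real Gemma model ID on OpenRouter.
function createOpenRouterProvider(apiKey: string, model: string): GemmaProvider {
  return {
    name: "openrouter",
    async complete(request) {
      const res = await fetch("https://openrouter.ai/api/v1/chat/completions", {
        method: "POST",
        headers: {
          Authorization: `Bearer ${apiKey}`,
          "Content-Type": "application/json",
        },
        body: JSON.stringify({
          model,
          messages: [
            { role: "system", content: request.system },
            { role: "user", content: request.user },
          ],
        }),
      });
      if (!res.ok) throw new Error(`OpenRouter request failed: ${res.status}`);
      const data = (await res.json()) as {
        choices: { message: { content: string } }[];
      };
      return data.choices[0].message.content;
    },
  };
}

// Fall back to mock mode whenever no API key is configured.
function selectProvider(
  apiKey: string | undefined,
  model: string,
  canned: Record<string, string>,
): GemmaProvider {
  return apiKey ? createOpenRouterProvider(apiKey, model) : createMockProvider(canned);
}
```

With this shape, the rest of the app only ever calls `complete`, so swapping providers means adding one more factory function.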

How I used Gemma 4

Gemma 4 is the reasoning engine behind the multi-agent writing pipeline.

NovelPilot uses Gemma 4 for:

structured story concept generation
character design
worldbuilding
plot planning
chapter outlining
prose drafting
style editing
foreshadowing tracking
continuity auditing
publisher copy generation

Each agent receives the accumulated story bible and previous structured outputs.

This means Gemma 4 is not just generating paragraphs. It acts as the structural memory and reasoning layer for the whole novel creation process.

The important design decision was to make every agent return structured data whenever possible. That allows the UI to render the model output as real product features: timelines, cards, reports, trackers, reader views, and exports.
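Structured output only helps if the app can parse it reliably, and model replies sometimes wrap JSON in extra prose. The sketch below shows one defensive way to handle that. The helper names `extractJsonObject` and `hasStringFields` are hypothetical and are not taken from the NovelPilot repo.

```typescript
// Hypothetical helpers for turning a raw model reply into usable data.

// Take the span from the first "{" to the last "}" and try to parse it.
// Returns null instead of throwing, so the caller can retry or fall back.
function extractJsonObject(reply: string): Record<string, unknown> | null {
  const start = reply.indexOf("{");
  const end = reply.lastIndexOf("}");
  if (start === -1 || end <= start) return null;
  try {
    const parsed: unknown = JSON.parse(reply.slice(start, end + 1));
    if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
      return null;
    }
    return parsed as Record<string, unknown>;
  } catch {
    return null;
  }
}

// Check that every required field exists and is a non-empty string.
function hasStringFields(obj: Record<string, unknown>, fields: string[]): boolean {
  return fields.every((field) => {
    const value = obj[field];
    return typeof value === "string" && value.trim().length > 0;
  });
}

// Example: a reply with surrounding prose still yields a usable object.
const reply = 'Here is the concept: {"title": "The Rain Log", "theme": "identity"} Done.';
const concept = extractJsonObject(reply);
const usable = concept !== null && hasStringFields(concept, ["title", "theme"]);
console.log(usable);
```

A check like this lets the UI refuse to render a card or tracker entry from a malformed reply rather than showing broken data.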

Why I chose this Gemma 4 model

For the live version, NovelPilot is designed to use a Gemma 4 model through OpenRouter.

I chose this approach because the app needs strong reasoning and structured generation across multiple steps. The model must follow JSON schemas, preserve context from earlier agents, and reason about story structure, character consistency, and foreshadowing.

NovelPilot focuses especially on:

long-context creative reasoning
structured JSON generation
story memory across multiple steps
continuity checking
literary planning and drafting

Gemma 4 is a good fit because the project is not only asking the model to write a paragraph. It asks the model to behave as a coordinated writing room.

What makes NovelPilot different

Most AI writing tools generate text.

NovelPilot generates a writing process.

The user does not only receive a draft. They see how the story is built:

Prompt
  ↓
Premise
  ↓
Characters
  ↓
World
  ↓
Plot
  ↓
Chapter outline
  ↓
Draft
  ↓
Style edit
  ↓
Continuity audit
  ↓
Publisher package
  ↓
Reader view
  ↓
PDF export

This makes the output easier to inspect, revise, and trust.

Key feature: Foreshadowing Tracker

One of my favorite parts is the Foreshadowing Tracker.

Instead of only writing a draft, NovelPilot tracks story threads like this:

{
  "item": "The cracked silver watch",
  "introducedIn": "Chapter 1",
  "status": "unresolved",
  "suggestedPayoff": "It reveals the exact time the protagonist's memory was overwritten.",
  "payoffChapter": "Chapter 3",
  "emotionalPurpose": "Connects guilt, identity, and lost time."
}

This makes the output more useful for writers.

It also shows why a structured model workflow matters. The app is not only asking Gemma 4 to write prose. It is asking Gemma 4 to reason about narrative structure.
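Because each thread is plain data, the UI can query it. The sketch below types the example entry and groups open threads by their planned payoff chapter. The field names come from the JSON above; the `"resolved"` status value and the function name `openThreadsByPayoffChapter` are assumptions for illustration.

```typescript
// Field names mirror the tracker entry shown in the post.
// Only "unresolved" appears in the post; "resolved" is an assumed counterpart.
type ForeshadowingStatus = "unresolved" | "resolved";

interface ForeshadowingItem {
  item: string;
  introducedIn: string;
  status: ForeshadowingStatus;
  suggestedPayoff: string;
  payoffChapter: string;
  emotionalPurpose: string;
}

// Group the names of unresolved threads by the chapter where they should pay off.
function openThreadsByPayoffChapter(items: ForeshadowingItem[]): Map<string, string[]> {
  const byChapter = new Map<string, string[]>();
  for (const entry of items) {
    if (entry.status !== "unresolved") continue;
    const names = byChapter.get(entry.payoffChapter) ?? [];
    names.push(entry.item);
    byChapter.set(entry.payoffChapter, names);
  }
  return byChapter;
}

const tracker: ForeshadowingItem[] = [
  {
    item: "The cracked silver watch",
    introducedIn: "Chapter 1",
    status: "unresolved",
    suggestedPayoff: "It reveals the exact time the protagonist's memory was overwritten.",
    payoffChapter: "Chapter 3",
    emotionalPurpose: "Connects guilt, identity, and lost time.",
  },
];

console.log(openThreadsByPayoffChapter(tracker).get("Chapter 3"));
```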

Key feature: Continuity Detective

The Continuity Detective checks the generated story for structural problems.

It returns issues with:

category
severity
evidence
suggested fix

Example structure:

{
  "category": "foreshadowing",
  "severity": "high",
  "issue": "The experiment log is introduced as important but has no planned payoff.",
  "evidence": "The log appears in Chapter 1 and is referenced in the outline, but no chapter resolves its origin.",
  "suggestedFix": "Reveal in the final chapter that the log was written by an earlier version of the protagonist."
}

This was important to me because many AI writing tools can generate plausible fiction, but fewer tools help the user understand whether the story actually holds together.
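A report in this shape is easy to triage in code. The sketch below types the example issue and orders a report so the most severe problems come first. The field names come from the JSON above; the `"medium"` and `"low"` severity values and the helper name `sortBySeverity` are assumptions, since the post only shows `"high"`.

```typescript
// Only "high" appears in the post; "medium" and "low" are assumed levels.
type Severity = "high" | "medium" | "low";

interface ContinuityIssue {
  category: string;
  severity: Severity;
  issue: string;
  evidence: string;
  suggestedFix: string;
}

const severityRank: Record<Severity, number> = { high: 0, medium: 1, low: 2 };

// Return a new array with the most severe issues first.
// Array.prototype.sort is stable, so equal severities keep their original order.
function sortBySeverity(issues: ContinuityIssue[]): ContinuityIssue[] {
  return [...issues].sort((a, b) => severityRank[a.severity] - severityRank[b.severity]);
}

const report: ContinuityIssue[] = [
  {
    category: "timeline",
    severity: "low",
    issue: "Illustrative placeholder issue.",
    evidence: "Illustrative placeholder evidence.",
    suggestedFix: "Illustrative placeholder fix.",
  },
  {
    category: "foreshadowing",
    severity: "high",
    issue: "The experiment log is introduced as important but has no planned payoff.",
    evidence: "The log appears in Chapter 1 and is referenced in the outline, but no chapter resolves its origin.",
    suggestedFix: "Reveal in the final chapter that the log was written by an earlier version of the protagonist.",
  },
];

console.log(sortBySeverity(report).map((entry) => entry.category));
```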

Final reader experience

After all agents finish, NovelPilot automatically transitions into a Completed Novel Reader.

The user can read the finished story directly in the browser.

They can also go back to the Agent Workspace to inspect:

agent outputs
story bible
foreshadowing tracker
continuity report
publisher package

The final reader is not a one-way screen. Users can freely move between the production workflow and the finished novel.

PDF export

I also added polished PDF export.

Instead of relying on the browser’s default print layout, NovelPilot generates a designed A4-style manuscript PDF.

The PDF includes:

cover page
novel title
metadata
chapter title
formatted manuscript body
optional story notes

This makes the app feel closer to a complete writing product, not just a demo.

UI/UX design

I wanted the app to feel like an AI creative studio.

The flow has three stages:

1. Prompt Launcher

The first screen is deliberately minimal.

The user only sees:

prompt input
language
genre
tone
target length
Generate Story
Run Judge Demo

This keeps the experience simple.

2. Agent Workspace

After generation starts, the app transitions into the agent workspace.

This screen shows:

active agent timeline
story bible
foreshadowing tracker
manuscript preview
continuity detective
export tools

3. Completed Novel Reader

When all agents finish, the app opens the final reading screen.

The user can read the story, download a PDF, or go back to review the agent outputs.

Technical architecture

The core architecture is simple:

app/page.tsx
  Main app phase control:
  launcher → workspace → reader

lib/useStoryProject.ts
  Client-side orchestration of the pipeline

app/api/generate-agent/route.ts
  Runs one agent per request

lib/gemma.ts
  Provider abstraction for Gemma 4 / OpenRouter / mock mode

lib/prompts.ts
  Prompt templates for each writing agent

lib/agents.ts
  Merges structured agent outputs into the Story Bible

lib/types.ts
  Shared TypeScript types

components/
  Prompt launcher, agent workspace, reader, trackers, reports, export panels

The app uses a state-first architecture: project state lives in the client while the pipeline runs, with no server-side persistence. Because this is a hackathon project, I intentionally avoided authentication, databases, and user accounts so the core experience stays fast and easy to judge.
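The launcher → workspace → reader phase control can be modeled as a small pure reducer. The sketch below is one way to do it under stated assumptions: the type names, event names, and the `storyComplete` flag are hypothetical and are not taken from app/page.tsx.

```typescript
// Hypothetical phase model for the three screens described in the post.
type Phase = "launcher" | "workspace" | "reader";

type PhaseEvent =
  | "generationStarted" // user clicked Generate Story or Run Judge Demo
  | "allAgentsFinished" // the last agent in the pipeline returned
  | "backToWorkspace" // reader → workspace, to inspect agent outputs
  | "backToReader"; // workspace → reader, only once a story exists

interface AppState {
  phase: Phase;
  storyComplete: boolean;
}

const initialState: AppState = { phase: "launcher", storyComplete: false };

// Pure transition function: events that do not apply leave the state unchanged.
function reduce(state: AppState, event: PhaseEvent): AppState {
  switch (event) {
    case "generationStarted":
      return state.phase === "launcher"
        ? { phase: "workspace", storyComplete: false }
        : state;
    case "allAgentsFinished":
      return state.phase === "workspace"
        ? { phase: "reader", storyComplete: true }
        : state;
    case "backToWorkspace":
      return state.phase === "reader" ? { ...state, phase: "workspace" } : state;
    case "backToReader":
      return state.phase === "workspace" && state.storyComplete
        ? { ...state, phase: "reader" }
        : state;
  }
}

// Walk through the happy path, then move back and forth.
const events: PhaseEvent[] = [
  "generationStarted",
  "allAgentsFinished",
  "backToWorkspace",
  "backToReader",
];
const finalState = events.reduce(reduce, initialState);
console.log(finalState.phase);
```

Keeping the transitions in one pure function makes the flow easy to test without rendering any components.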

Agent workflow

Here is the high-level pipeline:

User Prompt
  ↓
Premise Architect
  ↓
Character Director
  ↓
World Builder
  ↓
Plot Strategist
  ↓
Chapter Architect
  ↓
Prose Writer
  ↓
Style Editor
  ↓
Continuity Detective
  ↓
Publisher Agent
  ↓
Completed Novel Reader + PDF Export

Each step builds on the previous one.

For example, the Character Director does not work from the original prompt alone. It receives the premise and theme created by the Premise Architect.

The Plot Strategist receives the concept, characters, and worldbuilding.

The Continuity Detective receives the story bible, chapter outline, draft, and previous reports.

This makes the app feel like an actual production pipeline rather than a single model call.
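The sequential hand-off can be sketched as a loop that runs one agent at a time and merges each result into the story bible before the next agent starts. This is an illustration, not the code in lib/useStoryProject.ts or lib/agents.ts: the agent IDs, the `StoryBible` shape, and the injected `runAgent` function are all assumptions. In the real app, `runAgent` would presumably wrap a request to the one-agent-per-request API route.

```typescript
// Hypothetical agent IDs, in the order the post lists the nine agents.
const AGENT_ORDER = [
  "premiseArchitect",
  "characterDirector",
  "worldBuilder",
  "plotStrategist",
  "chapterArchitect",
  "proseWriter",
  "styleEditor",
  "continuityDetective",
  "publisherAgent",
] as const;

type AgentId = (typeof AGENT_ORDER)[number];

// Assumed shape: one structured output per agent, filled in as the pipeline runs.
type StoryBible = Partial<Record<AgentId, unknown>>;

// Injected so the loop works the same in mock mode, live mode, and tests.
type AgentRunner = (agent: AgentId, prompt: string, bible: StoryBible) => Promise<unknown>;

async function runPipeline(
  prompt: string,
  runAgent: AgentRunner,
  onAgentStart?: (agent: AgentId) => void,
): Promise<StoryBible> {
  let bible: StoryBible = {};
  for (const agent of AGENT_ORDER) {
    onAgentStart?.(agent); // e.g. update the active agent timeline in the UI
    const output = await runAgent(agent, prompt, bible);
    // Copy instead of mutating, so each agent sees a stable snapshot.
    const next: StoryBible = { ...bible };
    next[agent] = output;
    bible = next;
  }
  return bible;
}

// Example with a stub runner that records what each agent could see.
const visibleKeys: string[][] = [];
const stubRunner: AgentRunner = async (agent, _prompt, bible) => {
  visibleKeys.push(Object.keys(bible));
  return { producedBy: agent };
};

const pipelineResult = runPipeline("A melancholic sci-fi mystery", stubRunner);
pipelineResult.then((bible) => console.log(Object.keys(bible).length));
```

Passing the runner in as a parameter keeps the orchestration independent of the provider, which matches the post's goal of isolating the model layer.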

What I learned

The biggest lesson was that structured outputs are more useful than plain prose outputs for creative tools, because the app can render and inspect them field by field.

A single prose response is hard to inspect.

But structured outputs can become:

timelines
cards
story bibles
trackers
reports
reader views
exports

I also learned that judge experience matters.

That is why I added Run Judge Demo. Reviewers can experience the full product without configuring an API key.

Another lesson was that a creative AI product should not end at “generation complete.” It should end with something the user can actually consume. That is why I added the final reader and PDF export.

Challenges

The biggest challenge was balancing autonomy and control.

If the app is too automatic, it feels like the user has no creative role.

If the app asks for too much input, it stops feeling agentic.

So I designed NovelPilot around this principle:

The AI agents do the heavy lifting, but the user can always review, regenerate, edit, read, and export.

Another challenge was making the final output feel complete. The Completed Novel Reader and PDF export helped turn the generated draft into something closer to a finished product.

What’s next

I would like to add:

full multi-chapter generation
persistent projects
local storage
streaming agent output
genre-specific prompt packs
vertical Japanese reading mode
richer PDF themes
user-editable story bible
side-by-side draft revision

Final thoughts

NovelPilot is my attempt to make AI fiction generation feel less like a chatbot and more like a writing room.

The core idea is simple:

One prompt. Nine agents. A complete story pipeline.

Gemma 4 is the reasoning engine behind the process. It plans, writes, edits, tracks foreshadowing, checks continuity, and packages the final story.

That is what makes NovelPilot more than a story generator.

It is an AI-powered novel production studio.

DEV Community