The App Store keyword field is exactly 100 characters. Commas only, no spaces, no duplicates. You need to pack 15–20 keywords inside.
I tried writing those by hand for a dozen apps. Every time I'd leave characters on the table — a rogue space after a comma, a singular/plural duplicate Apple would auto-match anyway. Manual packing is tedious enough that most indie developers just don't iterate on ASO.
So I built an AI that does it. This post is the actual implementation — prompts, JSON schemas, validation, and the gotchas that killed my first three attempts. I ship this in my ASO tool for iOS developers (Apsity), but the approach works for any tight-constraint text-generation problem.
The Constraints That Break Generic LLMs
When you ask any LLM "generate App Store keywords for my budget app," you get something like:
```
budget tracker, expense manager, spending analysis,
money manager, personal finance, bill tracker
```
Readable. Useless. One character wasted after every comma (the space). More wasted on personal finance, since Apple auto-matches personal + finance as separate tokens anyway. Total wasted: roughly 30% of your 100.
The rules that matter:
- At most 100 characters total (including commas)
- Single comma separators, no spaces
- No duplicate tokens (Apple ignores them anyway)
- No singular+plural pairs (Apple auto-matches)
- Shorter tokens > compound words (Apple combines them for you)
- No competitor brand names (trademark rejection)
- No category names, and no app,free,new,best,iPhone,iPad (Apple auto-indexes all of these)
- Mix function + situation + alternative keywords
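Rules 1–4 are mechanical enough to enforce in code before any model gets involved. A minimal sketch of what that looks like (`packKeywords` is a hypothetical helper, not part of the tool):

```typescript
// Hypothetical helper (not from the post's codebase): enforce rules 1-4
// deterministically on a naive keyword string.
function packKeywords(raw: string, limit = 100): string {
  const seen = new Set<string>();
  const tokens: string[] = [];
  // Split on commas AND whitespace: compounds become atomic tokens (rule 5)
  for (const t of raw.split(/[\s,]+/).map((s) => s.toLowerCase())) {
    if (!t || seen.has(t)) continue; // rule 3: no duplicate tokens
    // rule 4: skip naive singular/plural pairs
    if (seen.has(t + "s") || (t.endsWith("s") && seen.has(t.slice(0, -1)))) continue;
    seen.add(t);
    tokens.push(t);
  }
  // rules 1-2: greedy packing with bare commas, never exceeding the limit
  const packed: string[] = [];
  let length = 0;
  for (const t of tokens) {
    const cost = t.length + (packed.length > 0 ? 1 : 0);
    if (length + cost > limit) continue;
    packed.push(t);
    length += cost;
  }
  return packed.join(",");
}
```

Deterministic code can enforce the hard constraints; what it can't do is pick good keywords, which is the part the model is for.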
An LLM without these constraints spelled out won't enforce them. Generic "write keywords" prompts fail rules 1–4 consistently.
Why Claude Sonnet
I tested GPT-5, Gemini 2.0 Pro, and Claude Sonnet 4.6 on the same task. Three metrics:
- Character compliance — stays under 100 chars without excess whitespace
- JSON schema adherence — returns exactly the structured output I asked for
- Edge case handling — catches duplicates, plural forms, category name leaks
Claude Sonnet won on all three, but the meaningful gap was edge case handling. When I explicitly said "no duplicates including singular/plural pairs," Claude filtered them out. The others listed budget and budgets and called it done — which is wrong, because Apple's algorithm auto-indexes plurals from the singular form anyway. A keyword duplicated across singular/plural just wastes characters.
I'm also passing a lot of context — competitor review snippets, current rankings, market-specific search trends. Sonnet 4.6's 1M-token context window handles it without trimming.
The Prompt Structure
The prompt is in three layers: system prompt (the rules), user prompt (the app context), and a JSON schema Claude must match.
```ts
// lib/keyword-generator.ts
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const SYSTEM_PROMPT = `
You are an ASO (App Store Optimization) keyword specialist.
Generate keywords for the app store "Keywords" field, which has
a STRICT 100-character limit. Characters include commas.

Rules (apply in order):
1. Total output length MUST be ≤100 characters
2. Use ONLY commas as separators, no spaces after commas
3. No duplicate tokens
4. No singular+plural pairs (Apple auto-matches both)
5. Prefer short atomic tokens over compound words
   (Apple combines A + B into "A B" automatically)
6. No competitor brand names (trademark violation)
7. No category names and no words Apple already indexes automatically:
   app, free, new, best, iPhone, iPad, or any category label
8. Blend three keyword types:
   - Function (what the app does)
   - Situation (when users need it)
   - Alternative (different names for the same thing)

Return JSON with this schema:
{
  "keywords": string[],      // individual tokens, no commas inside
  "joined": string,          // comma-joined, must be ≤100 chars
  "char_count": number,      // .length of "joined"
  "coverage_notes": string[] // which search queries this covers
}
`;

type KeywordOutput = {
  keywords: string[];
  joined: string;
  char_count: number;
  coverage_notes: string[];
};
```
The JSON schema isn't just for structure. char_count forces Claude to count the output itself — models aren't great at counting, but self-reporting forces a pass where the model checks its own work.
Generating Keywords
```ts
async function generateKeywords(context: {
  app_name: string;
  description: string;
  competitors: string[];
  existing_keywords?: string[];
  target_market: string;
}): Promise<KeywordOutput> {
  const userPrompt = `
App: ${context.app_name}
Description: ${context.description}
Target market: ${context.target_market}
Competitor apps (do NOT use these names): ${context.competitors.join(", ")}
${context.existing_keywords ? `Currently underperforming keywords to replace: ${context.existing_keywords.join(", ")}` : ""}

Generate an optimal 100-character keyword field.
Before finalizing, count your characters and confirm it fits.
`;

  const response = await client.messages.create({
    model: "claude-sonnet-4-6",
    max_tokens: 1024,
    system: SYSTEM_PROMPT,
    messages: [{ role: "user", content: userPrompt }],
  });

  const text = response.content[0].type === "text"
    ? response.content[0].text
    : "";

  const match = text.match(/\{[\s\S]*\}/);
  if (!match) throw new Error("No JSON in response");

  return JSON.parse(match[0]) as KeywordOutput;
}
```
Straightforward Anthropic SDK call. Two things worth noting:
- max_tokens: 1024 — keywords are short, so we don't need more. Capping reduces cost.
- JSON extraction via regex — Claude sometimes wraps JSON in explanation text. Grabbing the first {...} block is more reliable than asking for raw JSON.
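A slightly more defensive version of that extraction also strips the markdown fences models sometimes add around JSON. A sketch, with `extractJson` as a hypothetical helper name:

```typescript
// Hypothetical helper: pull the first JSON object out of a model reply,
// tolerating markdown code fences and surrounding prose. Assumes at most
// one JSON object in the text (the regex is greedy).
function extractJson<T>(text: string): T {
  const unfenced = text.replace(/`{3}(?:json)?/g, ""); // drop fence markers
  const match = unfenced.match(/\{[\s\S]*\}/);
  if (!match) throw new Error("No JSON object found in response");
  return JSON.parse(match[0]) as T;
}
```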
Validation Is Where Production Code Lives
Claude gets the constraints right ~85% of the time. Production code has to handle the other 15%.
```ts
// lib/validate-keywords.ts
import { z } from "zod";

const KeywordSchema = z.object({
  keywords: z.array(z.string()),
  joined: z.string(),
  char_count: z.number(),
  coverage_notes: z.array(z.string()),
});

type KeywordOutput = z.infer<typeof KeywordSchema>;

export function validateKeywords(output: unknown): {
  ok: boolean;
  issues: string[];
  data?: KeywordOutput;
} {
  const parsed = KeywordSchema.safeParse(output);
  if (!parsed.success) {
    return { ok: false, issues: ["invalid JSON shape"] };
  }

  const issues: string[] = [];
  const { keywords, joined, char_count } = parsed.data;

  // 1. Length check
  if (joined.length > 100) {
    issues.push(`joined is ${joined.length} chars, exceeds 100`);
  }

  // 2. Trust but verify char_count
  if (joined.length !== char_count) {
    issues.push(`char_count mismatch: claimed ${char_count}, actual ${joined.length}`);
  }

  // 3. Commas only, no spaces
  if (joined.includes(", ")) {
    issues.push("contains ', ' — spaces after commas waste characters");
  }

  // 4. Reconstruct and compare
  const reconstructed = keywords.join(",");
  if (reconstructed !== joined) {
    issues.push("keywords array doesn't match joined string");
  }

  // 5. Duplicate detection (case-insensitive)
  const seen = new Set<string>();
  for (const k of keywords) {
    const lower = k.toLowerCase();
    if (seen.has(lower)) {
      issues.push(`duplicate token: ${k}`);
    }
    seen.add(lower);
  }

  // 6. Singular/plural detection (basic: naive "s" pluralization only)
  for (const k of keywords) {
    const lower = k.toLowerCase();
    if (!lower.endsWith("s") && seen.has(lower + "s")) {
      issues.push(`singular/plural pair: ${k} / ${k}s`);
    }
  }

  return { ok: issues.length === 0, issues, data: parsed.data };
}
```
When validation fails, I retry with the specific issue appended to the prompt:
```ts
// Derive the context type from generateKeywords so the two stay in sync
type KeywordContext = Parameters<typeof generateKeywords>[0];

async function generateWithRetry(
  context: KeywordContext,
  attempt = 1,
): Promise<KeywordOutput> {
  if (attempt > 3) throw new Error("Failed after 3 attempts");

  const result = await generateKeywords(context);
  const check = validateKeywords(result);
  if (check.ok) return check.data!;

  // Feed issues back to Claude for a targeted retry
  return generateWithRetry(
    {
      ...context,
      existing_keywords: result.keywords,
      // Add validation issues into a correction prompt here
    },
    attempt + 1,
  );
}
```
In practice, 94% succeed on the first attempt, 5% on the second, 1% fall through (usually when the concept genuinely can't fit in 100 chars — time to simplify the app description, not the prompt).
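For that last 1%, a deterministic fallback can still ship something valid: treat the model's keyword ordering as a priority ranking and drop tokens from the tail until the field fits. A sketch (`trimToFit` is hypothetical, not part of the shipped flow):

```typescript
// Hypothetical fallback: treat the model's keyword order as priority and
// drop trailing tokens until the joined string fits the character budget.
function trimToFit(keywords: string[], limit = 100): string[] {
  const kept = [...keywords];
  while (kept.length > 0 && kept.join(",").length > limit) {
    kept.pop(); // lowest-priority token goes first
  }
  return kept;
}
```

You lose coverage, but you never ship an over-limit field.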
The Output Nobody Asks For But Everyone Needs
The coverage_notes field in the schema looks optional. It's the most useful part.
```json
{
  "keywords": ["budget","expense","payday","wallet","debt","bills","money","savings"],
  "joined": "budget,expense,payday,wallet,debt,bills,money,savings",
  "char_count": 53,
  "coverage_notes": [
    "Matches: 'budget', 'expense tracker', 'payday planner', 'wallet app'",
    "Covers 'money management' via money + bills combo",
    "Skipped 'finance' because it's the category — App Store auto-indexes that",
    "Skipped 'mint' (Mint.com trademark)"
  ]
}
```
Now the app developer can audit why each keyword was picked. When someone asks "why isn't my app showing up for X?" you have a record. Without coverage_notes, the output is a black box.
Prompt Failures I Hit Along the Way
Attempt 1: "Generate 15-20 keywords under 100 characters." Result: the model wrote a nice list, counted wrong, and delivered 112 characters. No self-verification step.
Attempt 2: Added "Do not exceed 100 characters" — model now refused to output more than 10 keywords to stay safe. Under-coverage.
Attempt 3: JSON schema with char_count field. Model started counting. Characters dropped into range but duplicates appeared.
Attempt 4 (shipped): Enumerated every rule with "apply in order," asked for coverage_notes to force reasoning, and added validation with retry.
Each failure mode came from underspecifying the rules. The LLM isn't "wrong" — it's doing exactly what the prompt asked. Getting production-grade output means writing the prompt like a spec, not a request.
Where This Lives Now
I packaged this into Apsity's AI Growth Agent — it runs on every keyword field update across the apps it tracks, compares against real-time search rankings, and flags underperforming tokens for replacement. Free tier covers 1 app and 5 keywords if you want to poke at it.
More importantly, the pattern generalizes. Any time you have "generate text inside tight constraints" — tweet drafts with character limits, SMS messages, ad headlines, product names — the structure is the same:
- Enumerate every constraint as a numbered rule
- Force a JSON schema with self-reported metrics
- Ask for a reasoning field so you can audit
- Validate in code, feed failures back for retry
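To make step 4 concrete in another domain: the same self-report-then-verify shape for, say, a 160-character SMS draft looks nearly identical (hypothetical sketch, names are illustrative):

```typescript
// Hypothetical: the same self-report + verify pattern for an SMS draft.
type SmsOutput = { text: string; char_count: number };

function validateSms(out: SmsOutput, limit = 160): string[] {
  const issues: string[] = [];
  if (out.text.length > limit) {
    issues.push(`text is ${out.text.length} chars, exceeds ${limit}`);
  }
  if (out.text.length !== out.char_count) {
    issues.push(`char_count mismatch: claimed ${out.char_count}, actual ${out.text.length}`);
  }
  return issues; // feed non-empty issues back into the retry prompt
}
```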
Writing the spec as a prompt beats writing it as docs — because you can actually run it.
Originally written for GoCodeLab. Deeper writeups on building indie SaaS with Claude are in the Lazy Developer series.