
Prateek Mohan

I Built an AI That Teaches You DSA Through Faded Parsons Problems and Spaced Repetition


So I've been deep in this project for a while now and I keep getting asked "what exactly does it do?" by friends, and I keep giving this rambling five-minute answer that never quite lands. So here's the written version. Maybe this one will stick.

The short version: I built a learning platform called PatternMaster that generates DSA lessons on demand, turns them into flashcards, and then makes you actually recall the material through faded Parsons problems, spaced repetition, and games. It also does visualizations. And it has a zombie shooter. I'll explain all of that.

But first — why?

The Problem With How I Was Learning DSA

I spent a lot of time watching NeetCode videos. Great videos. Genuinely. But I started noticing a pattern: I'd watch the explanation, follow along, understand every step perfectly, and then open a blank editor and... nothing. The information was there but it wasn't mine. I hadn't really built anything in my head — I'd just borrowed his mental model for 20 minutes and then returned it.

Reading editorial solutions was even worse. You read it line by line, you go "oh yes that makes sense, very clever," you close the tab, and three days later you can't reconstruct it from scratch.

The research calls this the illusion of knowing. You feel like you learned because understanding someone else's solution is easy. The hard part — and the part that actually builds skill — is construction. Building the solution yourself, from parts, under some form of retrieval pressure.

That's what I wanted to force. Not passive reading. Active construction.

The other piece: spacing. I've used Anki and it genuinely works, but it's boring and tedious to build decks for. I wanted the cards to just appear automatically, seeded from whatever I was studying, and for the scheduling to handle itself.

So I started building. Six months later here we are with something I'm actually proud of.


Section 1: The AI Unit Generation Pipeline

When you open the Learn page and type in a topic — "Two Sum", "Dijkstra's algorithm", "sliding window", whatever — the system needs to figure out what kind of content to generate before it even starts writing prompts.

Learn page with unit generation

This is handled by guessTopicType(), a function that is honestly kind of funny to look at. It's a 30+ regex monster that tries to figure out if you typed a LeetCode problem, a generic algorithm topic, a system design question, or — and yes, this is real — a history topic.

export function guessTopicType(topic: string): "leetcode" | "algorithm" | "system_design" | "history" | "general" | "coding" {
  const t = topic.toLowerCase().replace(/[^a-z0-9 ]/g, "");
  if (/\bleetcode\b/.test(t) || /\btwo.?sum\b/.test(t) || /\b3.?sum\b/.test(t)) {
    return "leetcode";
  }
  if (
    /\bsystem.?design\b/.test(t) ||
    /\bdesign\s+(a\s+|the\s+)?(twitter|uber|youtube|...|url\s*shortener|rate\s*limiter)\b/.test(t)
  ) {
    return "system_design";
  }
  if (
    /\b(history|historical|biography|ancient|medieval|renaissance|revolution)\b/.test(t) ||
    /\b(tokugawa|sengoku|napoleon|caesar|cleopatra|genghis)\b/.test(t)
  ) {
    return "history";
  }
  const codingPatterns = /\b(bfs|dfs|dijkstra|sort|search|tree|graph|heap|stack|queue|array|linked.?list|hash|dp|dynamic.programming|greedy|backtrack|recursion|trie|sliding.window|two.pointer)\b/;
  if (codingPatterns.test(t)) return "algorithm";
  // ...
  return "general";
}

The history path is not a joke. The "tokugawa" and "sengoku" entries are real, and yes, you can learn about the Tokugawa shogunate in PatternMaster. The non-coding lesson path generates concept slides, factual flashcards, and even logical ordering puzzles instead of code. It was a happy accident — I built the general path mostly as a fallback and it ended up working surprisingly well.

Once the topic type is known, the system dispatches to the right prompt builder, which fires a single unified AI call asking for everything at once:

  • Metadata: title, description, tags, difficulty
  • Lesson slides: 5 slides (concept → concept → analogy → concept → code)
  • Flashcards: 5-8 Q&A pairs
  • Main problem: LeetCode-style problem statement with constraints
  • Code: verified Python solution + test cases
  • Image prompts: 2 prompts for generating visual thumbnails

One call, one JSON blob, everything you need to render a full unit. It sounds like it'd be unreliable — and yeah, sometimes the JSON comes back slightly mangled, which is why jsonrepair from npm is doing a lot of quiet work behind the scenes. More on that later.
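The actual repair layer here is the jsonrepair package, which handles far more failure modes, but as a rough illustration of the kind of fix it makes, here's a minimal parse-with-fallback sketch that only strips trailing commas before retrying:

```typescript
// Simplified stand-in for the kind of cleanup jsonrepair performs.
// (The real package also handles unescaped quotes, truncated output,
// single quotes, and more; this only fixes trailing commas.)
function parseWithRepair(raw: string): unknown {
  try {
    return JSON.parse(raw);
  } catch {
    // Remove trailing commas like {"a": 1,} or [1, 2,].
    // Note: a naive regex like this can corrupt strings that happen
    // to contain ",}" — one reason to use a real repair library.
    const repaired = raw.replace(/,\s*([}\]])/g, "$1");
    return JSON.parse(repaired); // throws again if still broken -> caller retries the model
  }
}
```

When even the repaired parse throws, the only sane move is a retry with a stricter prompt, which is exactly the pipeline described later in this post.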

Unit detail showing slides and exercises

The unit detail view shows the slides, the generated code problem, and the exercises (Parsons). Clicking through the slides is actually kind of fun — the analogies the model comes up with are hit or miss but when they hit, they really help.


Section 2: The 5-Provider AI Broker

One thing I didn't want was to be locked to a single AI provider. Different models have different strengths, different pricing, different rate limits. So I built a broker layer.

The broker (server/ai/broker.ts) routes requests to: Gemini, OpenAI (GPT-4/GPT-5), Claude, Groq, and local Ollama. Each "phase" of the app (unit generation, tutor chat, image generation, review card generation) can be configured to use a different provider.

A few things I'm genuinely proud of in this module:

In-flight dedup. Early on I noticed the network tab showing 5 identical POST requests going out simultaneously when someone opened a page. Classic React useEffect + StrictMode double-firing + impatient users = many duplicate requests. The fix was a simple in-flight Map:

const inFlight = new Map<string, Promise<BrokerResponse>>();

// ...inside handleBrokerComplete:
const dedupKey = dedupeKey(mode, phase, subPhase, promptHash, imageHash);

if (inFlight.has(dedupKey)) {
  console.info(`[broker] Dedup hit for ${dedupKey}`);
  return inFlight.get(dedupKey)!;
}

// ...after building the promise:
inFlight.set(dedupKey, responsePromise);
try {
  const response = await responsePromise;
  return response;
} finally {
  inFlight.delete(dedupKey);
}

Identical concurrent requests with the same prompt hash get collapsed into one actual API call. Everyone waiting on that key gets the same response. This cut API costs by something like 30-40% in testing.
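The post doesn't show `dedupeKey` or how `promptHash` is computed, so here's a hypothetical sketch of the shape they likely take: the key just needs to be a stable string that is identical for two requests exactly when they would trigger the same provider call.

```typescript
import { createHash } from "node:crypto";

// Hypothetical dedup key: identical iff two requests would produce
// the same provider call. Undefined parts get a placeholder so the
// joined string stays unambiguous.
function dedupeKey(
  mode: string,
  phase: string,
  subPhase: string | undefined,
  promptHash: string,
  imageHash?: string,
): string {
  return [mode, phase, subPhase ?? "-", promptHash, imageHash ?? "-"].join("|");
}

// The prompt hash only needs enough collision resistance for dedup,
// so a truncated SHA-256 of the full prompt is plenty.
function hashPrompt(prompt: string): string {
  return createHash("sha256").update(prompt).digest("hex").slice(0, 16);
}
```

Hashing the prompt rather than storing it whole keeps the Map keys small even when prompts run to thousands of tokens.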

Free tier support. No API key? No problem. The server has a fallback:

const FREE_TIER_MODEL = "gemini-3.1-flash-lite-preview";

export function isFreeTier(aiSettings: AISettings): boolean {
  const hasKey = (k: unknown) => typeof k === "string" && k.trim().length > 0;
  return !(
    hasKey(aiSettings.geminiKey) ||
    hasKey(aiSettings.openaiKey) ||
    hasKey(aiSettings.claudeKey) ||
    hasKey(aiSettings.groqKey)
  );
}

Free tier users get routed to gemini-3.1-flash-lite-preview via the server's key. It's slower and less capable but it works. Building for free tier users genuinely changed the architecture — you have to trim prompts, be careful about token limits, and set different timeout expectations. It's a different class of user and they deserve a working experience, not just an error page.

AI provider settings

You can configure your own keys for any provider, set which provider handles which phase, and even point it at a local Ollama instance if you're running Llama locally and don't want any API calls leaving your machine.


Section 3: Faded Parsons Problems

This is the feature I'm most proud of and the one that required the most grinding to get right.

A Parsons problem is a code comprehension exercise where you're given the correct lines of code, scrambled, and have to put them back in the right order. The research backing this is solid — constructing a solution from parts engages retrieval in a way that reading doesn't. Faded Parsons takes it further: some blocks have blanks that you have to fill in. You're not just ordering — you're also recalling specific syntax and logic.

Here's the prompt that asks the AI to generate blocks from verified code:

export function buildParsonsPrompt(verifiedCode: string, topic: string): string {
  return `Split this code into Parsons blocks. Topic: "${topic}". Do NOT change logic.

\`\`\`python
${verifiedCode}
\`\`\`

Return ONLY JSON: {"blocks":[...],"solution_order":["p1","p2",...]}
Block shape: {"id":"p1","text":"code line","indent":0,"isBlank":false,"groupTag":"g1"}
Blank blocks add: "placeholder":"hint","answer":"exact code"

Rules: 4-6 groupTags, 2-3 blank blocks, 1 distractor (isDistractor:true, NOT in solution_order). IDs: p1,p2,... Indent must match original.`;
}

The key constraints: indent must be preserved exactly (Python), there must be a distractor block that isn't part of the solution, and 2-3 blocks must be "faded" with a hint placeholder and the exact answer stored separately for verification.
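One plausible way to grade a filled blank against the stored answer (a sketch, not necessarily what PatternMaster does) is to normalize whitespace before comparing, so cosmetic spacing differences don't fail the check:

```typescript
// Grade a faded blank: trim and collapse runs of whitespace so
// "return  left + right" matches "return left + right". Token-level
// differences still fail, which is the point — the faded blank is a
// recall check, and the assembled code gets run anyway.
function blankMatches(userInput: string, answer: string): boolean {
  const normalize = (s: string) => s.trim().replace(/\s+/g, " ");
  return normalize(userInput) === normalize(answer);
}
```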

Once you assemble the blocks in the right order and fill in the blanks, the assembled code gets run in Pyodide — a full CPython interpreter compiled to WebAssembly running in your browser. Zero server calls for code execution. It's genuinely magical. It also has about a 4-second startup cost the first time, which I handle by pre-warming it in the background when the page loads. Worth it.
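The pre-warming trick boils down to memoizing the load promise: kick it off once when the page mounts, and every later caller awaits the same promise instead of paying the startup cost again. A minimal sketch of the pattern (where `loadPyodide` stands in for the real loader from the pyodide package):

```typescript
// Memoize an async initializer so it runs at most once and all
// callers share the same in-flight (or resolved) promise.
function memoizeAsync<T>(init: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | null = null;
  return () => (cached ??= init());
}

// Usage sketch (assumes the pyodide package's loadPyodide):
// const getPyodide = memoizeAsync(() => loadPyodide());
// getPyodide();        // fire on page mount to pre-warm in the background
// await getPyodide();  // later: resolves instantly once warmed
```

Caching the promise rather than the value matters: two callers arriving during the ~4-second load still share one load instead of starting a second.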

When the AI Parsons generation fails (and it does, sometimes — the JSON comes back malformed or the block count is off), there's a mechanical fallback that generates blocks deterministically from the code:

function pickMechanicalBlankIndexes(lines: string[]): Set<number> {
  const result = new Set<number>();
  const maxBlanks = lines.length >= 6 ? 3 : lines.length >= 3 ? 2 : 1;
  const blocked = /^\s*(def|class|from|import|@)\b/;

  const collect = (matcher: RegExp) => {
    for (let i = 0; i < lines.length && result.size < maxBlanks; i += 1) {
      if (result.has(i)) continue;
      const line = lines[i];
      if (blocked.test(line)) continue;
      if (matcher.test(line)) result.add(i);
    }
  };

  collect(/^\s*return\b/);
  collect(/^\s*(if|elif|for|while)\b.*:\s*$/);
  collect(/^\s*[a-zA-Z_][a-zA-Z0-9_]*\s*=\s*.+/);
  collect(/\S/);

  return result;
}

The priority order: return statements first (high recall value), then control flow, then assignments, then anything. It sounds simple but getting the indent handling right when you're reconstructing Python code from string manipulation was genuinely annoying. Python cares about indentation in a way most languages don't and the mechanical assembler has to get it exactly right or Pyodide will throw a syntax error.

For non-coding topics, the Parsons problem becomes a logical ordering puzzle of factual statements. "Put these events in chronological order" or "arrange these steps of the Krebs cycle" — same UI, different content.

Exercise settings


Section 4: SM-2 Spaced Repetition

The review system uses SM-2 — the algorithm behind Anki — via the supermemo npm package.

Cards live in two phases:

Learn phase: You haven't seen it enough yet. The card keeps showing up until you've reviewed it 10 times. During this phase the SM-2 algorithm isn't really active — the card just cycles back immediately regardless of how you grade it. Think of it as the initial memorization pass.

Review phase: After 10 reviews, the card graduates. Now SM-2 takes over. Your grade affects the ease factor and interval:

import { supermemo, type SuperMemoItem, type SuperMemoGrade } from 'supermemo';

function gradeToSuperMemo(grade: ReviewGrade): SuperMemoGrade {
  if (grade === 'again') return 1;
  if (grade === 'hard') return 2;
  if (grade === 'good') return 4;
  return 5; // easy
}

function applySM2(item: SuperMemoItem, grade: SuperMemoGrade): SuperMemoItem {
  const result = supermemo(item, grade);
  const MAX_INTERVAL = 730;
  return {
    ...result,
    interval: Number.isFinite(result.interval) ? Math.min(result.interval, MAX_INTERVAL) : 1,
    efactor: Number.isFinite(result.efactor) ? Math.max(result.efactor, 1.3) : 2.5,
  };
}

// In learn phase:
if (prev.phase === 'learn') {
  const seenCount = prev.seenCount + 1;
  const hasGraduated = seenCount >= 10;
  return {
    reviewCardState: {
      ...state.reviewCardState,
      [cardId]: {
        ...prev,
        phase: hasGraduated ? 'review' : 'learn',
        seenCount,
        dueAt: hasGraduated ? now + 24 * 60 * 60 * 1000 : now,
        interval: hasGraduated ? 1 : 0,
        repetition: hasGraduated ? 1 : 0,
        lastReviewedAt: now,
      }
    }
  };
}

There's a MAX_INTERVAL = 730 cap — two years. Without it, very easy cards would eventually schedule themselves out 5+ years and you'd never see them again. Which sounds fine until you realize you forgot the thing and the next review is in 2031.

The efactor floor of 1.3 prevents cards from becoming so hard they're practically daily — if something is that difficult, the intervals shrink but they don't collapse to zero.
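To make the scheduling concrete, here's a worked example using a sketch of the SM-2 update as commonly implemented (the supermemo package follows this shape; `Sm2Item` and `sm2` are names I'm inventing for the sketch). Grading "good" (q=4) leaves the ease factor at 2.5, so intervals grow 1 → 6 → 15 → 38 days:

```typescript
interface Sm2Item { interval: number; repetition: number; efactor: number; }

// Classic SM-2: grades below 3 reset repetitions to interval 1; the
// ease factor is adjusted on every review and floored at 1.3.
function sm2(item: Sm2Item, q: number): Sm2Item {
  const efactor = Math.max(
    1.3,
    item.efactor + (0.1 - (5 - q) * (0.08 + (5 - q) * 0.02)),
  );
  if (q < 3) return { interval: 1, repetition: 0, efactor };
  const interval =
    item.repetition === 0 ? 1 :
    item.repetition === 1 ? 6 :
    Math.round(item.interval * item.efactor); // grow by the *old* ease
  return { interval, repetition: item.repetition + 1, efactor };
}

let card: Sm2Item = { interval: 0, repetition: 0, efactor: 2.5 };
for (let i = 0; i < 4; i++) card = sm2(card, 4); // four "good" reviews
// card.interval is now 38: the sequence went 1 -> 6 -> 15 -> 38
```

A single "again" (q=1) after that streak drops the card back to a 1-day interval and knocks the ease factor down, which is why lapsed cards come back fast.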

Cards auto-seed from the flashcards generated in each unit. Complete a unit, get flashcards in your review queue, no manual deck building required. They sync bidirectionally to the server so your progress persists across devices.

Review page with cards

Review card management

The card management view lets you see what's in your queue, which phase cards are in, and manually adjust due dates if something is wildly off.


Section 5: Review Generation at 3 Cognitive Depths

When you ask the AI to generate additional review cards beyond the ones seeded from your units, it uses a prompt that explicitly targets three cognitive levels:

CARD QUALITY RULES:
3) Vary cognitive depth across cards:
   - ~40% recall: "What is X?" / "Define Y"
   - ~30% comprehension: "Why does X work?" / "How does X differ from Y?"
   - ~30% application: "When would you use X?" / "What happens if you apply X to Z?"

This maps roughly to the lower three levels of Bloom's taxonomy — remember, understand, apply. The recall cards are fast and easy. The comprehension cards require you to actually understand the mechanism, not just the label. The application cards are the ones that actually prepare you for interviews, because interviews ask "when would you use a min-heap here?" not "define a min-heap."

It sounds like a small prompt tweak but it genuinely changes the character of a generated deck. Without the constraint, the model defaults to 90% "What is X? / Define Y" cards, which are fine but don't build the depth you need.

Lab review cards view


Section 6: Remotion Visualizations

The visualize page is probably the most visually impressive part of the project and the one that people ask about most.

The idea: type "explain merge sort" and get an animated video showing merge sort running step by step, with elements moving, colors changing, and labels appearing. No video editor. No manual keyframing. Just AI-generated JSON rendered by a React-based animation engine.

The AI generates a RemotionSceneSpec — a JSON object with:

  • elements[]: 13 element types: box, circle, label, bar, gear, svg-path, ring, hexagon, triangle, diamond, wave, particle-burst, progress-arc
  • connections[]: arrows between elements (straight or curved, with optional animated flow)
  • steps[]: discrete steps that specify exactly what changes — moves, color changes, highlights, scale changes, rotations, elements being added or removed

DynamicComposition.tsx renders this spec as a smooth animation, interpolating positions and colors between steps over 18 transition frames. The interpolation is the piece that makes it feel like a real animation rather than a slideshow.
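The interpolation idea can be sketched in a few lines. This is illustrative, not the actual DynamicComposition internals: for a transition starting at `startFrame` and lasting 18 frames, linearly blend an element's position between its previous and next step positions.

```typescript
const TRANSITION_FRAMES = 18;

function lerp(from: number, to: number, t: number): number {
  return from + (to - from) * t;
}

// Position of an element at a given frame during a step transition.
// t is clamped to [0, 1] so frames before/after the transition hold
// the start/end positions instead of overshooting.
function positionAt(
  frame: number,
  startFrame: number,
  from: { x: number; y: number },
  to: { x: number; y: number },
): { x: number; y: number } {
  const t = Math.min(1, Math.max(0, (frame - startFrame) / TRANSITION_FRAMES));
  return { x: lerp(from.x, to.x, t), y: lerp(from.y, to.y, t) };
}
```

In a real Remotion composition you'd likely reach for Remotion's own `interpolate()` helper, which adds easing curves on top of this; linear blending is the bare minimum that makes it feel like animation rather than a slideshow.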

Visualize page

Visualization gallery

The gallery has both CS/DSA visualizations (BubbleSort, LinkedList, HashTable, QueueViz, BinarySearch) and "real world" ones like CPUPipeline, DNAReplication, SolarSystem, and TrafficLight. The real world ones were built to test the robustness of the schema — can it handle non-CS domains? Mostly yes.

Storyboard view

The storyboard view shows each step as a frame, which is useful for debugging when the AI generates a step that moves elements off-screen or produces an animation that makes no visual sense. Which, in my experience, happens maybe 20% of the time. The other 80% is genuinely impressive.


Section 7: What I Actually Learned Building This

JSON repair is a legitimate engineering problem. The jsonrepair package from npm is doing quiet heroic work throughout this codebase. AI models produce almost valid JSON constantly. Trailing commas, unescaped quotes in string values, truncated responses that cut off mid-array — all of it. jsonrepair handles maybe 70% of these cases silently. The remaining 30% require a retry with a stricter prompt. I have a whole retry + repair pipeline that I initially thought was overkill and now think is the bare minimum.

Pyodide is genuinely magical but demanding. Running CPython in the browser via WebAssembly is one of those things that shouldn't work but does. The startup cost (~4 seconds on first load, ~200ms after that) is real but manageable if you pre-warm. The thing that surprised me is how complete it is — you can import collections, itertools, heapq, basically anything from the standard library, and it just works. Made the code verification story actually viable.

The dedup map was added after watching the network tab. I didn't design it upfront. I saw the problem happening in production (5 identical requests going out in 100ms) and fixed it. The lesson: instrument your network calls early. You will be surprised what you find.

Mechanical Parsons was more work than expected. I thought "just split by newline and pick some to blank out" would take an afternoon. Python indentation meant the assembled code had to have exact whitespace or Pyodide would reject it. Then there's the distractor generation — you can't just copy a line and call it a distractor, it has to be plausible but wrong. Eventually I settled on shallow distractors (a close variant of an existing line) rather than trying to generate genuinely misleading ones.

Non-coding topics work better than I expected. The history and general paths were built as fallbacks. I honestly expected them to produce mediocre content. But the pipeline generalizes better than I thought — the flashcard generation is just as good for "explain the Meiji Restoration" as for "explain BFS." The main difference is no code execution, which actually simplifies things.

Free tier users force you to be honest about your product. When you're building with your own API key, you don't feel the token costs or the latency. The moment you have to serve users on a free tier model with smaller prompts and slower responses, you find out which parts of your product are genuinely valuable and which are just impressive demos. It was humbling in a useful way.


What's Next

The games are next. Review cards aren't just for the review page — they feed into three different games:

  • Zombie FPS (/play/hyper): You're in a 3D graveyard. Questions appear as floating text. Shoot the zombie whose body zone maps to the correct answer (zones A-E). The physics of urgently trying to recall "which one is O(n log n)" while a zombie runs at you works surprisingly well as a recall forcing function.
  • Endless Runner (/play/runner): Side-scrolling runner where question gates block your path. Answer correctly to pass through, wrong answer costs health.
  • Tetris (/play/tetris): Pieces are labeled with algorithm names. Clearing lines requires putting the right pieces in the right spots.

Multiplayer review sessions are coming too — race a friend through your shared flashcard deck, see who answers faster. The infrastructure is there (the WebSocket room system is already built); I just need to wire the review UI to it.

More visualization types: I want to add network graphs (for system design), timing diagrams (for concurrency), and memory layout visualizations (stack vs heap). The current schema handles most of these but not all.

If you want to try it, you can run it locally — it's a standard React + Express monorepo, npm run dev and you're up. I'll have a hosted version up soon.

The core loop — AI Tutor → Lessons → Review Cards → Games → Progress — actually works. I've been using it myself to fill in gaps in my DSA knowledge and it's genuinely more effective than watching videos. Which is the whole point.


PatternMaster is open source. Check the repo if you want to dig into any of the pieces above in more detail.
