Alarinel

Posted on Nov 17

Kiro Development Experience #Kiro

#kiro #ai #java #react

Building Frankenstein: How I Stitched Together 44 Technologies Into an AI Story Generator

A technical deep-dive into building a full-stack application that transforms user inputs into immersive, multimedia children's stories

The Origin Story

I used to write a lot of books. Horror novels, mostly. I'd spend sleepless nights at my mechanical keyboard (the ones without that satisfying clang give me the chills!) pulling stories from my brain onto the page before they vanished into the ether.

Then life happened. The creative writing slowed down, replaced by corporate repositories and JSP-to-React migrations. But I never stopped missing that creative outlet.

This project solved both problems: I got to design something ambitious and crazy while scratching that storytelling itch. Win-win.

What I Built

Frankenstein is a full-stack AI-powered children's story generator that creates complete multimedia storybooks in under 3 minutes. Here's what makes it special:

AI-Generated Content: Claude Sonnet 4.5 writes 10-15 page stories with moral themes
Custom Illustrations: Stability AI generates unique images for each page
Professional Narration: ElevenLabs creates synchronized audio with word-level text highlighting
3D Book Interface: Three.js renders a realistic page-turning experience
Real-Time Updates: WebSocket progress tracking with literary quotes and jokes
Full Accessibility: WCAG 2.1 Level AA compliant, with keyboard navigation and screen reader support

The entire experience feels like watching a movie, but you're creating it.

The Tech Stack: A True Chimera

This project lives up to its "Frankenstein" name by stitching together 44 different technologies:

Backend (10 Technologies)

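// Spring Boot 3.5.0 with Java 21
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class FrankensteinApplication {
    public static void main(String[] args) {
        SpringApplication.run(FrankensteinApplication.class, args);
    }
}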

Spring Boot 3.5.0 + Spring AI 1.0.0-M4
Spring WebSocket (STOMP protocol)
Spring Boot Actuator
Maven, Jackson, Lombok, SLF4J
Custom DotenvConfig for environment variables
RestClient with 3-minute timeouts for AI APIs
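
The 3-minute timeout matters because image and audio generation can take far longer than a typical HTTP call. As a rough, dependency-free illustration of the same idea, here is how a generous response timeout could be set with the JDK's built-in java.net.http.HttpClient. This is not the project's RestClient configuration: the class name, the 10-second connect timeout, and the JSON content type are assumptions made for the sketch.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;

// Illustrative only: uses the JDK HttpClient instead of Spring's RestClient.
final class AiHttpTimeouts {
    // The 3-minute figure comes from the article; the connect timeout is an assumption.
    static final Duration RESPONSE_TIMEOUT = Duration.ofMinutes(3);
    static final Duration CONNECT_TIMEOUT = Duration.ofSeconds(10);

    // Client that gives up quickly if the connection itself cannot be established.
    static HttpClient newClient() {
        return HttpClient.newBuilder()
                .connectTimeout(CONNECT_TIMEOUT)
                .build();
    }

    // POST request that allows up to three minutes for the AI service to respond.
    static HttpRequest newRequest(String url, String jsonBody) {
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(RESPONSE_TIMEOUT)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }
}
```

Keeping the connect timeout short and the response timeout long means an unreachable service fails fast, while a slow but healthy generation call is allowed to finish.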

Frontend (19 Technologies)

// React 18 with TypeScript and Vite
import { StrictMode } from 'react'
import { createRoot } from 'react-dom/client'
import App from './App'
import './index.css'

createRoot(document.getElementById('root')!).render(
  <StrictMode>
    <App />
  </StrictMode>,
)

React 18, TypeScript, Vite 5, Tailwind CSS 3.4
Framer Motion, GSAP, React Spring, Lottie (animations)
Three.js + React Three Fiber (3D graphics)
Howler.js (audio), tsParticles (effects)
Zustand (state), React Router (navigation)
React Hook Form + Zod (validation)
Radix UI, Aceternity UI, React Hot Toast

AI Services (3 Paid APIs)

Anthropic Claude: Story generation (~$0.015/story)
Stability AI: Image generation (~$0.08/story)
ElevenLabs: Audio narration (~$0.30/story)
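
Adding up the three estimates gives roughly $0.395 per story, with narration accounting for about three quarters of it. A small sketch of that arithmetic, using a hypothetical StoryCostEstimate class (the project's real cost tracker is not shown in this post):

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; the per-story figures are the estimates quoted above.
final class StoryCostEstimate {
    static final Map<String, BigDecimal> COST_PER_STORY = new LinkedHashMap<>();
    static {
        COST_PER_STORY.put("Anthropic Claude", new BigDecimal("0.015"));
        COST_PER_STORY.put("Stability AI", new BigDecimal("0.08"));
        COST_PER_STORY.put("ElevenLabs", new BigDecimal("0.30"));
    }

    // Sum of all three services for a single story, in US dollars.
    static BigDecimal totalPerStory() {
        return COST_PER_STORY.values().stream()
                .reduce(BigDecimal.ZERO, BigDecimal::add);
    }

    // Share of the total spent on one service, as a percentage with one decimal place.
    static BigDecimal sharePercent(String service) {
        return COST_PER_STORY.get(service)
                .multiply(BigDecimal.valueOf(100))
                .divide(totalPerStory(), 1, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        System.out.println("Total per story: $" + totalPerStory());                 // 0.395
        System.out.println("ElevenLabs share: " + sharePercent("ElevenLabs") + "%"); // 75.9
    }
}
```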

Enhancement APIs (7 Free)

ZenQuotes, Sunrise-Sunset, Advice Slip, JokeAPI
Random User, Bored API, Agify (ready to integrate)

Development Tools (5)

ESLint, Prettier, Vitest, React Testing Library, Playwright

Total: 44 technologies working in harmony

Architecture Overview

The application follows a clean separation of concerns with async processing and real-time updates:

┌─────────────────────────────────────────────────┐
│  React Frontend (TypeScript + Vite)             │
│  • Form input with suggestions                  │
│  • Real-time progress display                   │
│  • 3D book with synchronized audio              │
│  • Story library with accessibility             │
└──────────────────┬──────────────────────────────┘
                   │ REST + WebSocket (STOMP)
┌──────────────────┴──────────────────────────────┐
│  Spring Boot Backend (Java 21)                  │
│  • Story orchestration service                  │
│  • Parallel image generation                    │
│  • Batched audio generation (3 concurrent max)  │
│  • File-based storage                           │
│  • API cost tracking                            │
└──────────────────┬──────────────────────────────┘
                   │ External API Calls
┌──────────────────┴──────────────────────────────┐
│  AI Services                                     │
│  • Anthropic Claude (story text)                │
│  • Stability AI (images)                        │
│  • ElevenLabs (narration)                       │
└──────────────────────────────────────────────────┘

Key Design Decisions

1. Async Processing with CompletableFuture

Story generation runs on a background thread via Spring's @Async, so the thread handling the HTTP request is not blocked for the two to three minutes a story takes:

@Service
@RequiredArgsConstructor
public class StoryOrchestrationService {
    private final StoryGenerationService storyGenerationService;
    private final ImageOrchestrationService imageOrchestrationService;
    private final AudioOrchestrationService audioOrchestrationService;
    private final ProgressNotificationService progressNotificationService;

    // Assumption: the caller creates storyId up front, so the client can
    // subscribe to progress updates before generation starts.
    @Async
    public CompletableFuture<Story> generateStoryAsync(String storyId, StoryInput input) {
        // Phase 1: Generate outline (5%)
        StoryStructure outline = storyGenerationService.generateOutline(input);
        progressNotificationService.sendProgress(storyId, 5, "GENERATING_OUTLINE");

        // Phase 2: Generate full story (30%)
        Story story = storyGenerationService.generateFullStory(input, outline);
        progressNotificationService.sendProgress(storyId, 30, "GENERATING_STORY");

        // Phase 3: Generate images in parallel (80%)
        CompletableFuture<Void> imagesFuture = imageOrchestrationService
            .generateAllImages(story);

        // Phase 4: Generate audio in batches (100%)
        CompletableFuture<Void> audioFuture = audioOrchestrationService
            .generateAllAudio(story);

        // Wait for all assets; blocking is acceptable here because
        // @Async already moved this method off the request thread
        CompletableFuture.allOf(imagesFuture, audioFuture).join();

        return CompletableFuture.completedFuture(story);
    }
}
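
To see how the phases fit together without the Spring wiring, here is a minimal, self-contained sketch that uses only java.util.concurrent. The class name, the stage labels GENERATING_IMAGES and COMPLETE, and the simulated work are assumptions. Unlike the snippet above, it also reports the 80% and 100% milestones: 80% once the images are done, and 100% only after both asset futures have finished.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.BiConsumer;

// Hypothetical stand-in for the orchestration flow; strings replace the real Story types.
final class StoryPipelineSketch {

    // Runs the four phases and reports (percent, stage) through the callback.
    static CompletableFuture<String> generate(String prompt, BiConsumer<Integer, String> progress) {
        return CompletableFuture
            .supplyAsync(() -> "outline(" + prompt + ")")            // Phase 1: outline
            .thenApply(outline -> {
                progress.accept(5, "GENERATING_OUTLINE");
                return "story(" + outline + ")";                     // Phase 2: full text
            })
            .thenCompose(story -> {
                progress.accept(30, "GENERATING_STORY");
                // Phases 3 and 4 are started back to back and run concurrently.
                CompletableFuture<Void> images = CompletableFuture
                    .runAsync(() -> simulateWork(50))
                    .thenRun(() -> progress.accept(80, "GENERATING_IMAGES"));
                CompletableFuture<Void> audio = CompletableFuture
                    .runAsync(() -> simulateWork(50));
                // Completes only when both assets are ready, so 100% always follows 80%.
                return CompletableFuture.allOf(images, audio).thenApply(done -> {
                    progress.accept(100, "COMPLETE");
                    return story;
                });
            });
    }

    // Stands in for a slow external API call.
    private static void simulateWork(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Chaining with thenCompose instead of calling join() keeps every step non-blocking, which is optional when the whole method already runs on an @Async thread but makes the ordering of progress events explicit.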

2. Rate Limiting for ElevenLabs

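The audio service processes pages in batches of three and waits for each batch to finish before starting the next, so no more than three ElevenLabs requests are in flight at once: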

@Service
@RequiredArgsConstructor
public class AudioOrchestrationServiceImpl implements AudioOrchestrationService {
    private static final int MAX_CONCURRENT_REQUESTS = 3;

    private final AudioGenerationService audioGenerationService;

    @Override
    public CompletableFuture<Void> generateAllAudio(Story story) {
        List<StoryPage> pages = story.getPages();

        // Process in batches of 3
        for (int i = 0; i < pages.size(); i += MAX_CONCURRENT_REQUESTS) {
            int end = Math.min(i + MAX_CONCURRENT_REQUESTS, pages.size());
            List<StoryPage> batch = pages.subList(i, end);

            // Generate audio for every page in the batch concurrently
            List<CompletableFuture<Void>> batchFutures = batch.stream()
                .map(page -> CompletableFuture.runAsync(() ->
                    audioGenerationService.generateNarration(story.getId(), page)))
                .toList();

            // Wait for batch to complete before starting next
            CompletableFuture.allOf(batchFutures.toArray(new CompletableFuture[0])).join();
        }

        // All batches have finished by this point, so the future is already complete
        return CompletableFuture.completedFuture(null);
    }
}
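
One trade-off of fixed batches is that each batch only moves as fast as its slowest page. If that ever became a bottleneck, one alternative (not what the project uses) would be a fixed-size thread pool, which also keeps at most three calls in flight but starts the next page as soon as any slot frees up. A minimal sketch, where BoundedNarration and the narrate callback are hypothetical names:

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

// Hypothetical alternative to batch-and-wait: a pool with three worker threads.
final class BoundedNarration {
    static final int MAX_CONCURRENT_REQUESTS = 3;

    // Calls narrate for every page with at most three calls running at once,
    // and blocks until all pages have been processed.
    static <T> void narrateAll(List<T> pages, Consumer<T> narrate) {
        ExecutorService pool = Executors.newFixedThreadPool(MAX_CONCURRENT_REQUESTS);
        try {
            CompletableFuture<?>[] futures = pages.stream()
                .map(page -> CompletableFuture.runAsync(() -> narrate.accept(page), pool))
                .toArray(CompletableFuture[]::new);
            CompletableFuture.allOf(futures).join();
        } finally {
            // Release the worker threads once every page is done (or a call failed).
            pool.shutdown();
        }
    }
}
```

Neither approach is a true rate limiter: both cap concurrency rather than requests per second, which is enough when the API's limit is expressed as concurrent requests.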

3. WebSocket for Real-Time Progress

Users see live updates during the 2-3 minute generation process:

@Service
@RequiredArgsConstructor
public class ProgressNotificationService {
    private final SimpMessagingTemplate messagingTemplate;

    public void sendProgress(String storyId, int progress, String stage) {
        GenerationProgress progressUpdate = GenerationProgress.builder()
            .storyId(storyId)
            .progress(progress)
            .stage(stage)
            .timestamp(LocalDateTime.now())
            .build();

        messagingTemplate.convertAndSend(
            "/topic/story-progress/" + storyId, 
            progressUpdate
        );
    }
}
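
Every update goes to a per-story topic, so each client only receives events for the story it requested. The sketch below gathers the milestones mentioned in this post into one place. It is an assumption about how they could be organized rather than the project's actual types: GENERATING_IMAGES and GENERATING_AUDIO are hypothetical stage names, and only the two earlier names, the percentages, and the topic prefix come from the code above.

```java
// Hypothetical consolidation of the progress milestones; not the project's actual types.
enum GenerationStage {
    GENERATING_OUTLINE(5),
    GENERATING_STORY(30),
    GENERATING_IMAGES(80),   // assumed stage name
    GENERATING_AUDIO(100);   // assumed stage name

    private final int percent;

    GenerationStage(int percent) {
        this.percent = percent;
    }

    // Progress value reported once this stage has finished.
    int percent() {
        return percent;
    }

    // STOMP destination for one story, matching the prefix used by sendProgress.
    static String topicFor(String storyId) {
        return "/topic/story-progress/" + storyId;
    }
}
```

With something like this in place, a call such as sendProgress(storyId, stage.percent(), stage.name()) would keep the percentage and the label from drifting apart.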

Frontend subscription:

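import { Client } from '@stomp/stompjs';

// Assumes the @stomp/stompjs client; storyId, setProgress, and setStage
// come from the surrounding React component.
const client = new Client({
  brokerURL: 'ws://localhost:8083/ws/story-progress',
  onConnect: () => {
    client.subscribe(`/topic/story-progress/${storyId}`, (message) => {
      const progress = JSON.parse(message.body);
      setProgress(progress.progress);
      setStage(progress.stage);
    });
  },
});

// Open the connection; onConnect never fires without this call
client.activate();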
