Building Frankenstein: How I Stitched Together 44 Technologies Into an AI Story Generator
A technical deep-dive into building a full-stack application that transforms user inputs into immersive, multimedia children's stories
The Origin Story
I used to write a lot of books. Horror novels, mostly. I'd spend sleepless nights at my mechanical keyboard (the ones without that satisfying clang give me the chills!) pulling stories from my brain onto the page before they vanished into the ether.
Then life happened. The creative writing slowed down, replaced by corporate repositories and JSP-to-React migrations. But I never stopped missing that creative outlet.
This project gave me both at once: I got to design something ambitious and crazy while scratching that storytelling itch. Win-win.
What I Built
Frankenstein is a full-stack AI-powered children's story generator that creates complete multimedia storybooks in under 3 minutes. Here's what makes it special:
- AI-Generated Content: Claude Sonnet 4.5 writes 10-15 page stories with moral themes
- Custom Illustrations: Stability AI generates unique images for each page
- Professional Narration: ElevenLabs creates synchronized audio with word-level text highlighting
- 3D Book Interface: Three.js renders a realistic page-turning experience
- Real-Time Updates: WebSocket progress tracking with literary quotes and jokes
- Full Accessibility: WCAG 2.1 Level AA compliant, with keyboard navigation and screen reader support
The entire experience feels like watching a movie, but you're creating it.
The Tech Stack: A True Chimera
This project lives up to its "Frankenstein" name by stitching together 44 different technologies:
Backend (10 Technologies)
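// Spring Boot 3.5.0 with Java 21
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class FrankensteinApplication {
    public static void main(String[] args) {
        SpringApplication.run(FrankensteinApplication.class, args);
    }
}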
- Spring Boot 3.5.0 + Spring AI 1.0.0-M4
- Spring WebSocket (STOMP protocol)
- Spring Boot Actuator
- Maven, Jackson, Lombok, SLF4J
- Custom DotenvConfig for environment variables
- RestClient with 3-minute timeouts for AI APIs
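The 3-minute timeout in the last bullet exists because AI calls can be slow. The project configures this on Spring's RestClient; the sketch below shows the same idea using only the JDK's `java.net.http` client so it runs without Spring. The class name `AiHttpTimeouts`, the 10-second connect timeout, and the example URL are assumptions for illustration, not values from the project.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;

final class AiHttpTimeouts {
    // The 3-minute response timeout is the figure quoted in the list above
    static final Duration AI_TIMEOUT = Duration.ofMinutes(3);
    // Assumed value: fail fast if the TCP connection itself cannot be opened
    static final Duration CONNECT_TIMEOUT = Duration.ofSeconds(10);

    /** Builds a client whose connection attempts give up after CONNECT_TIMEOUT. */
    static HttpClient newClient() {
        return HttpClient.newBuilder()
                .connectTimeout(CONNECT_TIMEOUT)
                .build();
    }

    /** Builds a JSON POST request that waits up to AI_TIMEOUT for a response. */
    static HttpRequest newRequest(String url, String jsonBody) {
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(AI_TIMEOUT)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }
}
```

Keeping the connect timeout short and the response timeout long means a dead host fails quickly while a slow generation still gets its full three minutes.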
Frontend (19 Technologies)
// React 18 with TypeScript and Vite
import { StrictMode } from 'react'
import { createRoot } from 'react-dom/client'
import App from './App'
import './index.css'

createRoot(document.getElementById('root')!).render(
  <StrictMode>
    <App />
  </StrictMode>,
)
- React 18, TypeScript, Vite 5, Tailwind CSS 3.4
- Framer Motion, GSAP, React Spring, Lottie (animations)
- Three.js + React Three Fiber (3D graphics)
- Howler.js (audio), tsParticles (effects)
- Zustand (state), React Router (navigation)
- React Hook Form + Zod (validation)
- Radix UI, Aceternity UI, React Hot Toast
AI Services (3 Paid APIs)
- Anthropic Claude: Story generation (~$0.015/story)
- Stability AI: Image generation (~$0.08/story)
- ElevenLabs: Audio narration (~$0.30/story)
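Adding up the three approximate prices above gives a rough per-story cost. Here is a small sketch of that arithmetic; the class name `StoryCostEstimate` and its methods are hypothetical and are not the project's cost-tracking code. `BigDecimal` is used so the cents add up exactly.

```java
import java.math.BigDecimal;
import java.util.LinkedHashMap;
import java.util.Map;

final class StoryCostEstimate {
    // Approximate per-story prices in USD, as quoted in the list above
    static final Map<String, BigDecimal> COST_PER_STORY = costs();

    private static Map<String, BigDecimal> costs() {
        Map<String, BigDecimal> m = new LinkedHashMap<>();
        m.put("Anthropic Claude", new BigDecimal("0.015"));
        m.put("Stability AI", new BigDecimal("0.08"));
        m.put("ElevenLabs", new BigDecimal("0.30"));
        return m;
    }

    /** Sum of all three services for a single story. */
    static BigDecimal totalPerStory() {
        return COST_PER_STORY.values().stream()
                .reduce(BigDecimal.ZERO, BigDecimal::add);
    }

    /** Estimated spend for a given number of stories. */
    static BigDecimal totalFor(int stories) {
        return totalPerStory().multiply(BigDecimal.valueOf(stories));
    }
}
```

With these figures a story comes to roughly $0.395, and narration accounts for about three quarters of that.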
Enhancement APIs (7 Free)
- ZenQuotes, Sunrise-Sunset, Advice Slip, JokeAPI
- Random User, Bored API, Agify (ready to integrate)
Development Tools (5)
- ESLint, Prettier, Vitest, React Testing Library, Playwright
Total: 44 technologies working in harmony
Architecture Overview
The application follows a clean separation of concerns with async processing and real-time updates:
┌─────────────────────────────────────────────────┐
│ React Frontend (TypeScript + Vite)              │
│ • Form input with suggestions                   │
│ • Real-time progress display                    │
│ • 3D book with synchronized audio               │
│ • Story library with accessibility              │
└──────────────────┬──────────────────────────────┘
                   │ REST + WebSocket (STOMP)
┌──────────────────┴──────────────────────────────┐
│ Spring Boot Backend (Java 21)                   │
│ • Story orchestration service                   │
│ • Parallel image generation                     │
│ • Batched audio generation (3 concurrent max)   │
│ • File-based storage                            │
│ • API cost tracking                             │
└──────────────────┬──────────────────────────────┘
                   │ External API Calls
┌──────────────────┴──────────────────────────────┐
│ AI Services                                     │
│ • Anthropic Claude (story text)                 │
│ • Stability AI (images)                         │
│ • ElevenLabs (narration)                        │
└─────────────────────────────────────────────────┘
Key Design Decisions
1. Async Processing with CompletableFuture
Story generation runs asynchronously so the HTTP request thread is not blocked:
import java.util.concurrent.CompletableFuture;
import lombok.RequiredArgsConstructor;
import org.springframework.scheduling.annotation.Async;
import org.springframework.stereotype.Service;

@Service
@RequiredArgsConstructor
public class StoryOrchestrationService {

    private final StoryGenerationService storyGenerationService;
    private final ImageOrchestrationService imageOrchestrationService;
    private final AudioOrchestrationService audioOrchestrationService;
    // Needed by the sendProgress calls below
    private final ProgressNotificationService progressNotificationService;

    // Assumption: the caller creates storyId up front so the client can
    // subscribe to progress updates before generation starts
    @Async
    public CompletableFuture<Story> generateStoryAsync(String storyId, StoryInput input) {
        // Phase 1: Generate outline (5%)
        StoryStructure outline = storyGenerationService.generateOutline(input);
        progressNotificationService.sendProgress(storyId, 5, "GENERATING_OUTLINE");

        // Phase 2: Generate full story (30%)
        Story story = storyGenerationService.generateFullStory(input, outline);
        progressNotificationService.sendProgress(storyId, 30, "GENERATING_STORY");

        // Phase 3: Generate images in parallel (80%)
        CompletableFuture<Void> imagesFuture = imageOrchestrationService
                .generateAllImages(story);

        // Phase 4: Generate audio in batches (100%)
        CompletableFuture<Void> audioFuture = audioOrchestrationService
                .generateAllAudio(story);

        // Wait for all assets; join() blocks this @Async worker thread, not the caller
        CompletableFuture.allOf(imagesFuture, audioFuture).join();
        return CompletableFuture.completedFuture(story);
    }
}
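To see the fan-out and `allOf(...).join()` pattern in isolation, here is a minimal sketch that runs without Spring. `ParallelAssetsDemo` and its two-thread pool are assumptions for illustration; the real services are replaced by plain `Runnable` tasks.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ParallelAssetsDemo {
    /** Runs both asset tasks concurrently and returns the stage names in completion order. */
    static List<String> generateAssets(Runnable imageTask, Runnable audioTask) {
        List<String> completed = new CopyOnWriteArrayList<>();
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // Each task starts immediately on the pool; thenRun records when it finishes
            CompletableFuture<Void> images = CompletableFuture
                    .runAsync(imageTask, pool)
                    .thenRun(() -> completed.add("images"));
            CompletableFuture<Void> audio = CompletableFuture
                    .runAsync(audioTask, pool)
                    .thenRun(() -> completed.add("audio"));

            // Block until both stages are done; a failure in either task
            // surfaces here as a CompletionException
            CompletableFuture.allOf(images, audio).join();
            return List.copyOf(completed);
        } finally {
            pool.shutdown();
        }
    }
}
```

The two tasks only overlap if each one really does its work on another thread. A method that blocks internally and then hands back an already-completed future runs sequentially, whatever its return type says.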
2. Rate Limiting for ElevenLabs
The audio service implements batch processing to respect API rate limits:
import java.util.List;
import java.util.concurrent.CompletableFuture;
import lombok.RequiredArgsConstructor;
import org.springframework.stereotype.Service;

@Service
@RequiredArgsConstructor
public class AudioOrchestrationServiceImpl implements AudioOrchestrationService {

    private static final int MAX_CONCURRENT_REQUESTS = 3;

    // Needed by the generateNarration call below
    private final AudioGenerationService audioGenerationService;

    @Override
    public CompletableFuture<Void> generateAllAudio(Story story) {
        List<StoryPage> pages = story.getPages();

        // Process in batches of 3
        for (int i = 0; i < pages.size(); i += MAX_CONCURRENT_REQUESTS) {
            int end = Math.min(i + MAX_CONCURRENT_REQUESTS, pages.size());
            List<StoryPage> batch = pages.subList(i, end);

            // Generate audio for batch
            List<CompletableFuture<Void>> batchFutures = batch.stream()
                    .map(page -> CompletableFuture.runAsync(() ->
                            audioGenerationService.generateNarration(story.getId(), page)))
                    .toList();

            // Wait for batch to complete before starting next
            CompletableFuture.allOf(batchFutures.toArray(new CompletableFuture[0])).join();
        }

        // Every batch has finished by now, so the returned future is already complete
        return CompletableFuture.completedFuture(null);
    }
}
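The batching idea can be checked on its own with a small generic helper. This is a sketch under assumptions: `BatchRunner` is a hypothetical name, and the in-flight counter exists only to show that no more than `batchSize` tasks ever run at once.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

final class BatchRunner {
    /**
     * Runs task for every item, at most batchSize at a time,
     * and returns the highest number of tasks observed running concurrently.
     */
    static <T> int runInBatches(List<T> items, int batchSize, Consumer<T> task) {
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();

        for (int i = 0; i < items.size(); i += batchSize) {
            List<T> batch = items.subList(i, Math.min(i + batchSize, items.size()));
            List<CompletableFuture<Void>> futures = new ArrayList<>();

            for (T item : batch) {
                futures.add(CompletableFuture.runAsync(() -> {
                    // Track concurrency so the limit can be verified
                    int now = inFlight.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    try {
                        task.accept(item);
                    } finally {
                        inFlight.decrementAndGet();
                    }
                }));
            }

            // Do not start the next batch until this one has fully finished
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        }
        return peak.get();
    }
}
```

The trade-off of strict batches is that one slow page holds up its whole batch, so for part of the time fewer than three requests are in flight.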
3. WebSocket for Real-Time Progress
Users see live updates during the 2-3 minute generation process:
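import java.time.LocalDateTime;
import lombok.RequiredArgsConstructor;
import org.springframework.messaging.simp.SimpMessagingTemplate;
import org.springframework.stereotype.Service;

@Service
@RequiredArgsConstructor
public class ProgressNotificationService {

    private final SimpMessagingTemplate messagingTemplate;

    public void sendProgress(String storyId, int progress, String stage) {
        GenerationProgress progressUpdate = GenerationProgress.builder()
                .storyId(storyId)
                .progress(progress)
                .stage(stage)
                .timestamp(LocalDateTime.now())
                .build();

        // Each story gets its own topic, so clients only receive their own updates
        messagingTemplate.convertAndSend(
                "/topic/story-progress/" + storyId,
                progressUpdate
        );
    }
}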
Frontend subscription:
import { Client } from '@stomp/stompjs';

const client = new Client({
  brokerURL: 'ws://localhost:8083/ws/story-progress',
  onConnect: () => {
    client.subscribe(`/topic/story-progress/${storyId}`, (message) => {
      const progress = JSON.parse(message.body);
      setProgress(progress.progress);
      setStage(progress.stage);
    });
  },
});

// Open the connection; onConnect fires once the STOMP session is established
client.activate();
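Both ends of this channel depend on the same small contract: a topic name derived from the story ID, and a percentage tied to each stage. Here is a sketch of that contract in plain Java, using the percentages from the orchestration comments (5, 30, 80, 100). The `ProgressContract` class and the stage names `GENERATING_IMAGES` and `GENERATING_AUDIO` are assumptions; only the first two stage names appear in the project's code.

```java
final class ProgressContract {
    /** Generation stages and the progress percentage reported for each. */
    enum Stage {
        GENERATING_OUTLINE(5),
        GENERATING_STORY(30),
        GENERATING_IMAGES(80),   // assumed name
        GENERATING_AUDIO(100);   // assumed name

        final int percent;

        Stage(int percent) {
            this.percent = percent;
        }
    }

    /** Topic that the backend publishes to and the frontend subscribes to. */
    static String topicFor(String storyId) {
        return "/topic/story-progress/" + storyId;
    }

    /** Payload with the fields the frontend reads (progress and stage). */
    record Progress(String storyId, int progress, String stage) {
        static Progress of(String storyId, Stage stage) {
            return new Progress(storyId, stage.percent, stage.name());
        }
    }
}
```

Keeping the percentages next to the stage names avoids magic numbers scattered across `sendProgress` calls.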