FlipRead

Posted on Jun 28

How I Built a Realistic Page-Flip Engine in the Browser — and Wired It to an AI API

#ai #webdev #showdev #javascript

A solo developer's technical deep-dive into rendering, multi-format document parsing, and AI storybook generation — the full stack behind FlipFlow, a free online flipbook maker.

TL;DR

I built FlipFlow — a web app that converts PDF, PPT, Word, and images into interactive flipbooks with realistic page-turn animations. This post covers the three hardest technical problems I solved along the way:

Page-flip physics in FlipFlow that actually feel real
Multi-format document pipeline (PDF, PPT, DOCX, and images → unified page sequence)
AI generation via API to create illustrated flipbooks from a text prompt

If you're building anything involving document rendering, canvas animation, or AI-generated content pipelines, there's something here for you.

🔗 Live demo: flippingbooks.org/share/d8ca4875-919b-4d4d-ada9-1b08699006e8

The Problem I Was Actually Solving

Every week, millions of people export beautiful documents — product catalogs, brand lookbooks, digital magazines, course materials — as flat PDFs.

PDFs are read-only, static, and dead on arrival as a sharing format.

I wanted to build a free online flipbook maker that could:

Accept any common document format
Render it as a real, page-turning interactive publication
Add an AI layer so users could generate flipbooks without uploading anything

What I didn't expect was how hard the rendering layer would be.

Part 1: Building the Page-Flip Physics Engine

The core interaction — a page that folds, curves, and turns with realistic shadow and perspective — is harder than it looks.

The Math Behind the Curl

A page flip isn't a simple CSS transform. A real page curves as it turns. The visual deformation is a Bézier curve projection applied to a rectangular canvas layer.

Here's the simplified version of the core geometry:

function getFlipProgress(mouseX, bookCenterX, pageWidth) {
  // Normalize mouse position to -1 (left edge) → +1 (right edge)
  return Math.max(-1, Math.min(1, (mouseX - bookCenterX) / pageWidth));
}

function drawFlippingPage(ctx, progress, pageImageData) {
  const foldX = bookCenterX + progress * pageWidth;
  const foldAngle = progress * Math.PI;

  // Control points for the page curl Bézier
  const cp1x = foldX - pageWidth * 0.3;
  const cp1y = pageHeight * 0.1;
  const cp2x = foldX;
  const cp2y = pageHeight * 0.5;

  ctx.save();
  ctx.beginPath();
  ctx.moveTo(bookCenterX, 0);
  ctx.bezierCurveTo(cp1x, cp1y, cp2x, cp2y, foldX, pageHeight);
  ctx.clip();

  // Apply perspective distortion
  ctx.transform(
    Math.cos(foldAngle), 0,
    Math.sin(foldAngle) * 0.3, 1,
    foldX, 0
  );

  ctx.drawImage(pageImageData, 0, 0);
  ctx.restore();
}

The key insight is that the fold axis doesn't stay at a fixed X — it follows the cursor horizontally, and the page "behind" the fold uses a mirrored, slightly desaturated copy of the same image to simulate the back face of the paper.

The Shadow Problem

Realistic shadows are what separate a convincing page flip from a fake-looking one.

I ended up using a linear gradient shadow centered on the fold axis that is strongest at mid-flip (progress ≈ 0) and fades out as the page completes its turn:

function drawPageShadow(ctx, foldX, progress) {
  const shadowIntensity = 1 - Math.abs(progress); // max at center
  const gradient = ctx.createLinearGradient(
    foldX - 60, 0,
    foldX + 60, 0
  );
  gradient.addColorStop(0, `rgba(0,0,0,0)`);
  gradient.addColorStop(0.5, `rgba(0,0,0,${shadowIntensity * 0.35})`);
  gradient.addColorStop(1, `rgba(0,0,0,0)`);

  ctx.fillStyle = gradient;
  ctx.fillRect(foldX - 60, 0, 120, pageHeight);
}

Performance: `requestAnimationFrame` vs `setInterval`

Early versions used setInterval for the animation loop. This is a mistake — setInterval doesn't synchronize with the browser's paint cycle. The fix is requestAnimationFrame, which gives you:

Frame rates locked to the display refresh (typically 60fps)
Automatic pausing when the tab is backgrounded
Drawing scheduled right before the browser paints, so no frames are rendered and then discarded

function animationLoop() {
  if (isFlipping) {
    ctx.clearRect(0, 0, canvas.width, canvas.height);
    drawStaticPages();
    drawFlippingPage(ctx, currentProgress, activePageImage);
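    // Same fold position that drawFlippingPage derives from progress
    const foldX = bookCenterX + currentProgress * pageWidth;
    drawPageShadow(ctx, foldX, currentProgress);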

    // Ease toward target
    currentProgress += (targetProgress - currentProgress) * 0.12;
  }
  requestAnimationFrame(animationLoop);
}
animationLoop();

The 0.12 easing factor is what creates the "physical deceleration" feel. Too high and the flip feels mechanical; too low and it feels laggy.
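
One caveat: because the 0.12 step is applied once per frame, the flip eases faster on a 120Hz display than on a 60Hz one. Below is a minimal sketch of a frame-rate-independent version, assuming the 0.12 factor was tuned at 60fps. `easeToward`, `BASE_FRAME_MS`, and `BASE_EASING` are hypothetical names, not part of FlipFlow's code:

```javascript
const BASE_FRAME_MS = 1000 / 60; // assumption: 0.12 was tuned on a 60Hz display
const BASE_EASING = 0.12;

// Returns the new progress after `dtMs` milliseconds of easing toward `target`.
function easeToward(current, target, dtMs) {
  // Fraction of the remaining distance that is still left after dtMs
  const remaining = Math.pow(1 - BASE_EASING, dtMs / BASE_FRAME_MS);
  return target + (current - target) * remaining;
}
```

In the animation loop, `dtMs` would be the difference between the timestamps that requestAnimationFrame passes to its callback on consecutive frames. Two 120Hz steps then cover the same distance as one 60Hz step.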

Mobile Touch Support

Getting drag gestures right on mobile required handling touchstart, touchmove, and touchend separately — and adding velocity tracking so a quick swipe completes the flip even if the finger lifts early:

let touchStartX = 0;
let velocity = 0;
let lastX = 0;

canvas.addEventListener('touchstart', e => {
  touchStartX = e.touches[0].clientX;
  lastX = touchStartX;
  velocity = 0;
});

canvas.addEventListener('touchmove', e => {
  const x = e.touches[0].clientX;
  velocity = x - lastX;      // pixels moved since the previous touchmove event
  lastX = x;
  updateFlipProgress(x);
  e.preventDefault();
}, { passive: false });

canvas.addEventListener('touchend', () => {
  // If swipe velocity > threshold, complete the flip
  if (Math.abs(velocity) > 8) {
    targetProgress = velocity > 0 ? 1 : -1;
  } else {
    // Snap back or forward based on current position
    targetProgress = currentProgress > 0 ? 1 : -1;
  }
});
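
The per-event delta above depends on how often the browser fires touchmove, which varies between devices. Below is a sketch of a time-based alternative, assuming you record `{ x, t }` samples inside the touchmove handler. `swipeVelocity` and the 100 ms window are illustrative choices, not FlipFlow's actual code:

```javascript
// Average horizontal velocity in px/ms over the most recent `windowMs` of samples.
// Each sample is { x, t } with t in milliseconds (for example event.timeStamp),
// and samples are assumed to be in chronological order.
function swipeVelocity(samples, windowMs = 100) {
  if (samples.length < 2) return 0;
  const last = samples[samples.length - 1];
  // Oldest sample that still falls inside the time window
  const first = samples.find(s => last.t - s.t <= windowMs);
  const dt = last.t - first.t;
  return dt > 0 ? (last.x - first.x) / dt : 0;
}
```

With this approach the threshold in the touchend handler would need retuning, since the units change from pixels per event to pixels per millisecond.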

Part 2: The Multi-Format Document Pipeline

Supporting PDF, PPTX, DOCX, JPG, and PNG as inputs means five completely different parsing pipelines. This was the most tedious engineering problem in the whole project.

The Architecture Decision: Normalize to Page Images First

The key architectural decision was to normalize everything to a sequence of page images before the flipbook renderer ever touches the content.

Input formats:
  PDF  ──┐
  PPTX ──┤  → Preprocessing Layer → [ page_001.png, page_002.png, ... ] → Flipbook Renderer
  DOCX ──┤
  JPG  ──┤
  PNG  ──┘

This means the renderer only ever deals with images. It doesn't know or care what the source format was. Adding a new input format later only requires writing a new preprocessor — the renderer stays unchanged.
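
One way to picture that decoupling is a registry that maps file extensions to preprocessors, each returning an array of page images. This is a sketch under assumptions: `preprocessors`, `registerPreprocessor`, and `toPageImages` are hypothetical names, and the stub converters stand in for the real pipelines.

```javascript
// Maps a lowercase file extension to an async function: file -> array of page image URLs
const preprocessors = new Map();

function registerPreprocessor(extensions, convert) {
  for (const ext of extensions) preprocessors.set(ext.toLowerCase(), convert);
}

async function toPageImages(file) {
  const ext = file.name.split('.').pop().toLowerCase();
  const convert = preprocessors.get(ext);
  if (!convert) throw new Error(`Unsupported format: .${ext}`);
  return convert(file); // the renderer only ever sees this array
}

// Stub registrations; real ones would call the PDF, Office, and image pipelines
registerPreprocessor(['pdf'], async file => [`${file.name}#page-1`]);
registerPreprocessor(['jpg', 'jpeg', 'png'], async file => [file.name]);
```

Supporting a new format then comes down to one more `registerPreprocessor` call, with no change to the renderer.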

PDF Rendering: `pdf.js` + Canvas

For PDF files, I used Mozilla's pdf.js library to render each page to a canvas at a target resolution:

async function pdfToPageImages(pdfUrl, targetDPI = 150) {
  const pdf = await pdfjsLib.getDocument(pdfUrl).promise;
  const pages = [];
  const scale = targetDPI / 72; // pdf.js scale 1.0 maps 1 PDF point (1/72 inch) to 1 pixel

  for (let i = 1; i <= pdf.numPages; i++) {
    const page = await pdf.getPage(i);
    const viewport = page.getViewport({ scale });
    const canvas = document.createElement('canvas');
    canvas.width = viewport.width;
    canvas.height = viewport.height;

    await page.render({
      canvasContext: canvas.getContext('2d'),
      viewport
    }).promise;

    pages.push(canvas.toDataURL('image/jpeg', 0.92));
  }
  return pages;
}

Key lesson: 150 DPI is the sweet spot. 96 DPI looks blurry on retina screens; 300 DPI produces files that are too large to stream quickly. At 150 DPI, a 20-page PDF typically generates around 8–12MB of image data — fast enough to stream page-by-page as the user reads.
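
To see why resolution matters so much, it helps to work out the pixel counts. Below is a rough sketch, assuming an A4 page (595 × 842 PDF points, at 72 points per inch). `pagePixelSize` is a hypothetical helper, and the byte figure is for the uncompressed RGBA canvas, not the JPEG that gets streamed:

```javascript
const POINTS_PER_INCH = 72; // PDF user-space units per inch

// Pixel dimensions and uncompressed RGBA size of a page rendered at `dpi`.
function pagePixelSize(widthPt, heightPt, dpi) {
  const width = Math.round(widthPt * dpi / POINTS_PER_INCH);
  const height = Math.round(heightPt * dpi / POINTS_PER_INCH);
  return { width, height, rgbaBytes: width * height * 4 };
}

const a4At150 = pagePixelSize(595, 842, 150); // 1240 x 1754
const a4At300 = pagePixelSize(595, 842, 300); // 2479 x 3508
```

Doubling the DPI roughly quadruples the pixel count, which is why 300 DPI gets heavy quickly.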

PPTX / DOCX: Server-Side Conversion

Browser-side conversion of Office formats is unreliable. For PPTX and DOCX, I moved the conversion server-side using LibreOffice in headless mode:

# Convert PPTX to PDF first, then use pdf.js for page images
libreoffice --headless --convert-to pdf input.pptx --outdir /tmp/converted/

This gives consistent output across all Office document variants (older .ppt, .doc formats included).

Image Inputs: Aspect Ratio Normalization

When users upload a batch of JPG/PNG images, they often have inconsistent dimensions. Before building the flipbook, I normalize all images to a consistent aspect ratio with white padding:

// targetRatio is width / height: 1.414 is A4 landscape, 1 / 1.414 is A4 portrait
function normalizeImageAspect(img, targetRatio = 1.414) {
  const canvas = document.createElement('canvas');
  const imgRatio = img.width / img.height;

  if (imgRatio > targetRatio) {
    canvas.width = img.width;
    canvas.height = Math.round(img.width / targetRatio);
  } else {
    canvas.height = img.height;
    canvas.width = Math.round(img.height * targetRatio);
  }

  const ctx = canvas.getContext('2d');
  ctx.fillStyle = '#ffffff';
  ctx.fillRect(0, 0, canvas.width, canvas.height);

  // Center the image
  const offsetX = (canvas.width - img.width) / 2;
  const offsetY = (canvas.height - img.height) / 2;
  ctx.drawImage(img, offsetX, offsetY);

  return canvas.toDataURL('image/jpeg', 0.92);
}

Part 3: Wiring in the AI Generation Layer

The AI storybook feature — where a user types a prompt and gets a complete illustrated flipbook — was the most surprising part of the product to build.

The Generation Pipeline

User prompt
    ↓
Text generation (chapter/scene breakdown)
    ↓
Per-scene image generation
    ↓
Layout: text + image → page canvas
    ↓
Flipbook renderer

Prompt → Story Structure

I use a structured JSON output from the text model to define the page layout before any images are generated:

const storyPrompt = `
You are a children's book author. Given the following prompt, return a JSON array
of exactly 8 pages. Each page has:
  - "text": 1-2 sentences of story text (max 25 words)
  - "scene": a detailed image generation prompt for this page's illustration
  - "mood": one of ["wonder", "adventure", "calm", "excitement", "mystery"]

Prompt: "${userPrompt}"

Return ONLY valid JSON, no markdown, no preamble.
`;

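const response = await fetch('/api/generate-story', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ prompt: storyPrompt })
});
if (!response.ok) {
  throw new Error(`Story generation failed: ${response.status}`);
}
const pages = await response.json();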
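
Even with the "ONLY valid JSON" instruction, text models occasionally wrap their output in a code fence or return the wrong shape, so it is worth validating before spending money on image generation. Below is a defensive sketch, assuming the raw model text is available as a string on whichever side receives it. `parseStoryPages` is a hypothetical helper, not part of any API:

```javascript
const MOODS = ['wonder', 'adventure', 'calm', 'excitement', 'mystery'];
const FENCE = '`'.repeat(3); // a markdown code fence, built here to avoid writing it literally

// Parses the model's raw text into validated page objects, or throws.
function parseStoryPages(raw, expectedCount = 8) {
  // Strip a leading/trailing markdown code fence if the model added one
  const cleaned = raw.trim()
    .replace(new RegExp('^' + FENCE + '(?:json)?\\s*', 'i'), '')
    .replace(new RegExp('\\s*' + FENCE + '$'), '');
  const pages = JSON.parse(cleaned);

  if (!Array.isArray(pages) || pages.length !== expectedCount) {
    throw new Error(`Expected an array of ${expectedCount} pages`);
  }
  pages.forEach((page, i) => {
    if (!page || typeof page.text !== 'string' || typeof page.scene !== 'string') {
      throw new Error(`Page ${i + 1} is missing "text" or "scene"`);
    }
    if (!MOODS.includes(page.mood)) {
      throw new Error(`Page ${i + 1} has an unknown mood: ${page.mood}`);
    }
  });
  return pages;
}
```

Failing early here means a malformed response can be retried before any images are requested.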

Scene → Image Generation

Each scene description gets passed to an image generation API. The key implementation detail is parallelizing the image generation calls with a concurrency limit — generating all 8 images sequentially would be too slow:

async function generatePageImages(pages, concurrency = 3) {
  const results = new Array(pages.length);

  // Process in chunks of `concurrency`
  for (let i = 0; i < pages.length; i += concurrency) {
    const chunk = pages.slice(i, i + concurrency);
    const chunkResults = await Promise.all(
      chunk.map((page, idx) =>
        generateImage(page.scene, page.mood)
          .then(imgUrl => ({ index: i + idx, url: imgUrl }))
      )
    );
    chunkResults.forEach(({ index, url }) => {
      results[index] = url;
    });
    // Update progress indicator (clamped so the last chunk reports 100, not more)
    const done = Math.min(i + concurrency, pages.length);
    onProgress(Math.round(done / pages.length * 100));
  }

  return results;
}

Why concurrency = 3? Most image generation APIs have rate limits. 3 parallel requests is fast enough to keep the UX responsive (total generation time ~15–20 seconds for 8 pages) while staying well within rate limits.
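
The chunked loop waits for the slowest request in each chunk before starting the next one. A worker-pool variant keeps up to `limit` requests in flight at all times. This is a different technique from the chunked version, sketched here with a hypothetical `mapWithConcurrency` helper; the task function stands in for `generateImage`:

```javascript
// Runs `task` over `items` with at most `limit` calls in flight; results keep input order.
async function mapWithConcurrency(items, limit, task, onProgress = () => {}) {
  const results = new Array(items.length);
  let next = 0;
  let done = 0;

  async function worker() {
    while (next < items.length) {
      const index = next++; // claim an index synchronously, before awaiting
      results[index] = await task(items[index], index);
      done += 1;
      onProgress(Math.round(done / items.length * 100));
    }
  }

  const workerCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

Usage would look like `await mapWithConcurrency(pages, 3, page => generateImage(page.scene, page.mood))`. Progress here is reported per finished image, so it cannot exceed 100.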

Compositing Text + Image onto a Page Canvas

Once you have the text and the generated image, you need to composite them onto a consistent page layout:

function compositePageCanvas(text, imageUrl, style = 'storybook') {
  const canvas = document.createElement('canvas');
  canvas.width = 800;
  canvas.height = 1130; // A4-ish

  const ctx = canvas.getContext('2d');
  const img = new Image();
  // Needed for toDataURL when the image is served cross-origin with CORS headers
  img.crossOrigin = 'anonymous';

  return new Promise((resolve, reject) => {
    img.onerror = () => reject(new Error(`Failed to load image: ${imageUrl}`));
    img.onload = () => {
      // Background
      ctx.fillStyle = '#fffef5';
      ctx.fillRect(0, 0, canvas.width, canvas.height);

      // Illustration (top 70% of page)
      ctx.drawImage(img, 40, 40, 720, 760);

      // Text area (bottom 30%)
      ctx.fillStyle = '#f5f0e8';
      ctx.fillRect(0, 830, 800, 300);

      // Render text with word wrap
      ctx.fillStyle = '#2c1810';
      ctx.font = 'bold 28px Georgia, serif';
      ctx.textAlign = 'center';
      wrapText(ctx, text, 400, 890, 700, 36);

      resolve(canvas.toDataURL('image/jpeg', 0.92));
    };
    // Start loading only after both handlers are attached
    img.src = imageUrl;
  });
}
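
The compositing code calls a `wrapText` helper that isn't shown. Below is a minimal sketch of what such a helper might look like, matching the call signature used above (context, text, x, y, max width, line height). This is an assumed implementation using greedy word wrapping, not FlipFlow's actual one:

```javascript
// Draws `text` starting at (x, y), breaking lines so none exceeds maxWidth.
// Returns the lines it drew. A single word wider than maxWidth is not split.
function wrapText(ctx, text, x, y, maxWidth, lineHeight) {
  const words = text.split(/\s+/).filter(Boolean);
  const lines = [];
  let line = '';

  for (const word of words) {
    const candidate = line ? `${line} ${word}` : word;
    if (line && ctx.measureText(candidate).width > maxWidth) {
      lines.push(line); // current line is full; start a new one
      line = word;
    } else {
      line = candidate;
    }
  }
  if (line) lines.push(line);

  lines.forEach((l, i) => ctx.fillText(l, x, y + i * lineHeight));
  return lines;
}
```

Because it only relies on `measureText` and `fillText`, the helper respects whatever `font` and `textAlign` were set on the context before the call.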

The SEO Architecture That Actually Drives Users to the Tool

Here's something I didn't fully appreciate until three months after launch: the content architecture around your tool matters as much as the tool itself.

Search engines crawl and index your entire content footprint — not just your landing page. Google evaluates topical authority, link depth, and whether your pages answer the real questions users are searching for.

For FlipFlow, the high-intent queries that drive signups are:

Query	Intent
`convert PDF to flipbook online free`	High conversion — user is ready to use a tool
`online flipbook maker`	Navigational — looking for the category
`free digital flipbook creator`	Price-sensitive, high-volume
`page flip effect online`	Feature-seeking
`digital magazine maker`	Vertical use case
`HTML5 flipbook maker`	Technical audience (you, probably)

Each of these deserves its own dedicated page with matching content. A single landing page trying to rank for all of them won't outcompete domain-specific pages that answer each query precisely.

Lessons for Other Indie Devs

After building FlipFlow from scratch as a solo developer, here's what I'd tell myself 12 months ago:

1. Solve the core interaction in complete isolation first.
Don't wire up auth, billing, or the database until the page-flip physics feel right. The core UX is your product. Everything else is infrastructure.

2. Canvas content is invisible to search engines.
Anything rendered in <canvas> doesn't get indexed. Put your text content in the DOM and use canvas only for the visual layer. This matters a lot for SEO.

3. Normalize your data model early.
The "normalize everything to page images" decision was the best architectural call I made. It decoupled the rendering layer from the ingestion layer completely.

4. AI generation shifts your user profile in ways you can't predict.
When I added the AI storybook generator, the user base shifted overnight from designers and marketers toward authors, teachers, and content creators. Build for your initial audience, but don't be surprised when the AI features open up entirely new ones.

5. A genuine free tier is the best distribution mechanism.
Every flipbook a free user creates and shares is a live demo of your product. It's a backlink, a marketing asset, and a proof of concept — all at once.

Try It Yourself

If you made it to the end of this post, you clearly care about browser rendering, document conversion, or AI-powered content tools — which means FlipFlow was probably built for someone like you.

Upload a PDF, convert a presentation, or generate an AI storybook from scratch:

👉 flippingbooks.org — free, no account required to try

Or flip through a live demo right now: flippingbooks.org/share/d8ca4875-919b-4d4d-ada9-1b08699006e8

This flipbook was generated with FlipFlow. Open it to try the different page-turning effects firsthand.

Have questions about the Canvas rendering, the document pipeline, or the AI generation layer? Drop a comment — I read every single one and usually reply within a day.

Tags: #indiehacking #javascript #webdev #ai #buildInPublic #seo #pdf #canvas #saas


Top comments (6)

Nazar Boyko • Jun 29

That easing line, currentProgress += (target - current) * 0.12, has a subtle catch worth flagging. Since it runs once per frame, the flip eases about twice as fast on a 120Hz display as on a 60Hz one, so the "physical deceleration" feel you carefully tuned shifts with whatever monitor the reader has. Scaling the step by delta time instead of a fixed 0.12 keeps the motion identical everywhere. Tiny thing, but the flip feel is basically the whole product, so it's the sort of inconsistency people sense without being able to name it. Normalizing everything to page images first is the call I'd happily borrow for projects that have nothing to do with flipbooks.