Sugam Panthi

Posted on May 27

Part 2: Replacing a 3.4 MB video with 40 KB of scripted GSAP animations: adding a camera

#webdev #design #animation #claude

Part one reached 69,000 views on r/webdev in two days. I did not expect that. I wrote it because I thought replacing a 3.4 MB video with 40 KB of DOM animation was interesting enough to share. Turns out a lot of people are thinking about the same problem.

The comments were better than the post. People asked about SEO indexing, screen reader behavior, prefers-reduced-motion fallbacks, and whether GSAP is even necessary. Several also pointed out that the demo itself was missing something. They were right.

Part one had a cursor clicking through scenes on a flat stage. Everything happened at the same zoom level, the same distance from the viewer. Watch a real product demo video and you notice something different: the camera moves. It zooms into the action when something important happens, follows the cursor through a workflow, and pulls back when the scene changes. That is the difference between a slide deck and a directed film.

This post adds the camera. Same 40 KB budget. No video files.

See the full post with 4 interactive demos, side-by-side comparisons, and a live FPS stress test

The zoom wrapper problem

The first thing you try is gsap.to(frame, { scale: 1.4 }). It works for half a second, then you notice your responsive scaling is gone. The film frame uses transform: scale(filmScale) to fit its container width, and GSAP just overwrote it. Both write to the same CSS transform property. They cannot coexist on the same element.

The fix is a wrapper inside the frame that GSAP owns:

<div data-film-frame style={{ transform: `scale(${filmScale})` }}>
  <div data-film-zoom className="origin-center">
    {/* All scene content */}
  </div>
  <Cursor />  {/* Outside the zoom wrapper */}
</div>

GSAP animates data-film-zoom. The responsive scale on the outer frame is untouched. The cursor lives outside the zoom wrapper, so it stays the same size while content scales up around it. This took me an embarrassing amount of time to figure out. Two divs. That was the whole fix.

Pan math

Zooming into the center of the frame is easy. Zooming into a specific element, like a Pinterest pin the user is about to right-click, is the real problem. Scaling around the center pushes everything that is off-center further out, so you need to translate the zoom wrapper to cancel that drift and keep the target where the viewer is already looking:

const ZOOM = 1.45;
const panTo = (target) => ({
  x: (FILM_WIDTH / 2 - target.x) * (ZOOM - 1),
  y: (FILM_HEIGHT / 2 - target.y) * (ZOOM - 1),
});

At zoom 1.0, no translation is needed. At zoom 1.45, a target that sits d pixels from center would drift another 0.45 times d outward, so the wrapper translates by 0.45 times d in the opposite direction to compensate.
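A quick way to check the formula is to model where a point lands after the wrapper is scaled and translated. This is a sketch under assumptions: the wrapper scales around the frame center (the origin-center class above), and the pan is applied in frame pixels after the scale, which is how I understand GSAP's x, y, and scale to combine. The frame size, the pin coordinates, and the project helper are made up for illustration.

```javascript
// Illustrative frame size; the real demo's dimensions are not given in this post.
const FILM_WIDTH = 1280;
const FILM_HEIGHT = 720;
const ZOOM = 1.45;

// Same formula as above.
const panTo = (target) => ({
  x: (FILM_WIDTH / 2 - target.x) * (ZOOM - 1),
  y: (FILM_HEIGHT / 2 - target.y) * (ZOOM - 1),
});

// Hypothetical model: scale a point about the frame center, then add the pan.
const project = (point, zoom, pan) => ({
  x: FILM_WIDTH / 2 + (point.x - FILM_WIDTH / 2) * zoom + pan.x,
  y: FILM_HEIGHT / 2 + (point.y - FILM_HEIGHT / 2) * zoom + pan.y,
});

const pin = { x: 900, y: 200 };
const withoutPan = project(pin, ZOOM, { x: 0, y: 0 }); // zoom only
const withPan = project(pin, ZOOM, panTo(pin));        // zoom plus compensation
```

Under those assumptions, the pin would slide out to roughly (1017, 128) with zoom alone, and stays at (900, 200) once the pan is applied. If you would rather bring the target to the exact center of the frame, the same model says to multiply by ZOOM instead of ZOOM - 1.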

Stay zoomed, pan to follow

The biggest mistake in my first attempt was zooming in for one interaction, zooming out, zooming in for the next, zooming out again. It looked like a PowerPoint with a budget zoom transition.

The correct pattern: zoom in once, stay zoomed, pan to follow the cursor through the whole interaction sequence, zoom out once when changing scenes.

// Zoom in on right-click
tl.to(zoom, {
  scale: ZOOM, x: panPin.x, y: panPin.y,
  duration: 0.65, ease: "expo.out",
}, "zoomIn");

// Context menu while zoomed
tl.to(ctx, { autoAlpha: 1, duration: 0.16 }, "zoomIn+=0.35");

// Camera pans to follow cursor to the save button (still zoomed)
tl.to(zoom, {
  x: panSave.x, y: panSave.y,
  duration: 0.55, ease: "sine.inOut",
}, "panToSave");

// Only zoom out when done
tl.to(zoom, {
  scale: 1, x: 0, y: 0,
  duration: 0.45, ease: "sine.inOut",
}, "zoomOut");
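The pattern generalizes to any number of stops. Here is a rough sketch with plain objects standing in for the tween vars. cameraBeats is a hypothetical helper, not part of GSAP or the production code: it takes the pan offset for each stop and returns one zoom-in, one pan per remaining stop, and a single zoom-out. The durations and eases are copied from the snippet above, and the sample offsets are invented.

```javascript
const ZOOM = 1.45;

// Hypothetical helper: turn a list of pan offsets into camera beats.
// Each beat is { label, vars }, meant to be passed to tl.to(zoom, beat.vars, beat.label).
function cameraBeats(pans) {
  if (pans.length === 0) return [];
  const [first, ...rest] = pans;
  return [
    // Zoom in once, on the first stop.
    {
      label: "zoomIn",
      vars: { scale: ZOOM, x: first.x, y: first.y, duration: 0.65, ease: "expo.out" },
    },
    // Stay zoomed: later stops only change x and y, never scale.
    ...rest.map((pan, i) => ({
      label: `pan${i + 1}`,
      vars: { x: pan.x, y: pan.y, duration: 0.55, ease: "sine.inOut" },
    })),
    // Zoom out once, when the sequence is over.
    {
      label: "zoomOut",
      vars: { scale: 1, x: 0, y: 0, duration: 0.45, ease: "sine.inOut" },
    },
  ];
}

// Two stops: the pin, then the save button (offsets are made up).
const beats = cameraBeats([{ x: -117, y: 72 }, { x: -60, y: -40 }]);
```

The point of the shape is that scale appears in exactly two beats no matter how many stops you add, which is what keeps the camera from pumping in and out.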

See this running as a live interactive demo in the full post

Easing is not one-size-fits-all

Part one used power2.out for almost everything. Fine for a flat demo. But when you add camera movement, different motion types need different eases or the whole thing feels off.

Motion	Ease	Why
Camera pan	`sine.inOut`	Smoothest curve. Feels like a real camera on a dolly.
Dramatic zoom-in	`expo.out`	Fast start, long deceleration. Whoosh then settle.
Cursor movement	`sine.inOut`	Natural hand acceleration.
Fade in	`sine.out`	Decelerates into visibility.
Click squeeze	`power2.out`	Snappy press response.
Click release	`back.out(2.2)`	Slight overshoot. Feels physical.
Typed text	`none`	Real typing does not ease.

The single biggest upgrade was replacing power2.inOut with sine.inOut on camera moves. power2 has a noticeable acceleration kick that feels mechanical at large scale. sine has no perceptible kick at any point in the curve.
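One way to put a number on that kick is to compare how fast each curve moves at its steepest point. The sketch below uses the textbook cubic and sine in-out formulas, which as far as I know are what power2.inOut and sine.inOut compute; treat that mapping as an assumption and check it against GSAP's own ease visualizer. peakSpeed is an illustrative helper, not a GSAP API.

```javascript
// Textbook in-out curves; t is progress from 0 to 1.
const sineInOut = (t) => (1 - Math.cos(Math.PI * t)) / 2;
const cubicInOut = (t) => (t < 0.5 ? 4 * t ** 3 : 1 - ((-2 * t + 2) ** 3) / 2);

// Largest step-to-step speed, relative to a linear tween (whose speed is 1 throughout).
function peakSpeed(ease, steps = 1000) {
  let peak = 0;
  for (let i = 0; i < steps; i++) {
    const speed = (ease((i + 1) / steps) - ease(i / steps)) * steps;
    peak = Math.max(peak, speed);
  }
  return peak;
}

const sinePeak = peakSpeed(sineInOut);   // approaches PI / 2
const cubicPeak = peakSpeed(cubicInOut); // approaches 3
```

Under those formulas the cubic curve peaks at about 3 times linear speed and the sine curve at about 1.57 times. Same start, same end, much gentler middle, which is consistent with how the two feel on a full-frame camera move.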

The cursor is never dead

With camera movement, there are moments where the scene shifts but the cursor is frozen. Those moments break the illusion. Every camera move should have a simultaneous cursor movement at the same timeline label:

// BAD: cursor is dead during zoom
tl.to(zoom, { scale: 1.4, x: panX, y: panY, duration: 0.7 }, "zoomIn");

// GOOD: cursor drifts toward its next target during zoom
tl.to(zoom, {
  scale: 1.4, x: panX, y: panY,
  duration: 0.7, ease: "expo.out",
}, "zoomIn");
tl.to(cursor, {
  x: current.x + (next.x - current.x) * 0.3,
  y: current.y + (next.y - current.y) * 0.3,
  duration: 0.65, ease: "sine.out",
}, "zoomIn");

The cursor does not need to arrive. A 20-30% drift toward the next target is enough. Calm, but never dead.
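The drift math is just a partial interpolation, so it is easy to pull into a helper. A minimal sketch, where driftToward and the sample coordinates are my own illustration rather than code from the production demos:

```javascript
// Hypothetical helper: move part of the way from one point toward another.
// fraction = 0 stays put, fraction = 1 arrives at the target.
function driftToward(current, next, fraction = 0.3) {
  return {
    x: current.x + (next.x - current.x) * fraction,
    y: current.y + (next.y - current.y) * fraction,
  };
}

const current = { x: 200, y: 300 };
const next = { x: 600, y: 100 };
const drift = driftToward(current, next, 0.25); // a quarter of the way there

// On a real timeline this would feed the cursor tween, for example:
// tl.to(cursor, { ...drift, duration: 0.65, ease: "sine.out" }, "zoomIn");
```

Keeping the fraction as a parameter makes it cheap to try 0.2 against 0.3 and pick whichever reads as calm rather than idle.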

The full post has a side-by-side comparison of frozen vs drifting cursor during zoom

Graceful loops

Part one's loop was abrupt. With camera movement the problem is worse, because on restart the zoom snaps from 1.4x back to 1.0x in a single frame.

The fix is an outro:

tl.to(zoom, { scale: 1, x: 0, y: 0, duration: 0.8, ease: "sine.inOut" }, "outro");
tl.to(cursor, { x: offScreenX, y: offScreenY, duration: 0.7, ease: "sine.in" }, "outro+=0.1");
tl.to({}, { duration: 0.6 }); // breathing room

And a timing rule: first drafts are always too slow. Cut 20-30% from your timing after the first working version.
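If you want to preview a timing cut before rewriting every duration, GSAP timelines have a timeScale() method that plays the whole timeline faster. The conversion is simple: cutting a fraction of the running time means dividing by what remains. A small sketch, with timeScaleForCut as an illustrative helper of my own:

```javascript
// Illustrative helper: playback rate that shortens total running time by `cut`.
// `cut` is a fraction: 0.2 means "20% shorter".
function timeScaleForCut(cut) {
  if (!(cut >= 0 && cut < 1)) {
    throw new RangeError("cut must be at least 0 and less than 1");
  }
  return 1 / (1 - cut);
}

const rate = timeScaleForCut(0.2); // about 1.25

// Preview on a real timeline with: tl.timeScale(rate);
```

This speeds up everything uniformly, pauses included, so it is only a preview. Once a rate feels right, the durations still need to be tuned by hand.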

SEO, theming, and the stuff video cannot do

Point a crawler at a <video> tag and all it sees is <video src="demo.mp4">. A black box. Point it at a GSAP demo and every button label, menu item, and heading is real DOM text that can be indexed. I have not run a controlled ranking test, so treat this as reasoning rather than evidence: real DOM text gives a crawler more to work with than an opaque binary blob.

Beyond indexing, DOM demos change when the product does. Marketing changes a button label? Change a text string. Brand color update? One hex value. You want the demo buttons to actually convert? Wire them to a real signup modal. A video cannot do any of that.

Accessibility

When prefers-reduced-motion: reduce is enabled, skip the entire timeline and render the resting state. GSAP's autoAlpha animates opacity and, once the value reaches 0, also sets visibility: hidden, which takes the element out of the tab order and hides it from screen readers. Every hidden element uses autoAlpha, not opacity, so screen readers only see what is currently visible.
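A minimal sketch of the reduced-motion branch, using the standard matchMedia query. The startDemo function and its two callbacks are hypothetical placeholders for whatever builds the timeline and whatever renders the static resting frame; they are not names from the production code.

```javascript
// True when the visitor has asked the OS or browser for less motion.
// `win` defaults to the global object so the check can be stubbed in tests
// and does not crash where matchMedia is unavailable (for example, on the server).
function prefersReducedMotion(win = globalThis) {
  if (typeof win.matchMedia !== "function") return false;
  return win.matchMedia("(prefers-reduced-motion: reduce)").matches;
}

// Hypothetical entry point: choose between the animated demo and the resting state.
function startDemo({ playTimeline, renderRestingState }, win = globalThis) {
  if (prefersReducedMotion(win)) {
    renderRestingState(); // no timeline is ever created
    return "resting";
  }
  playTimeline();
  return "playing";
}
```

Returning early before the timeline is built matters more than pausing it afterwards: nothing flashes, and no tweens are left running in the background.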

Does it need GSAP?

For simple animations, no. But the web clipper demo has roughly 50 coordinated animations across two scenes with camera movement. GSAP gives you labels (named beats), repeat: -1 with reset logic, gsap.utils.selector(root) for data-attribute queries, a full easing library, and stagger. You could build all of this without GSAP. It would take longer and produce worse results.

GSAP core is 28 KB gzipped. If you are orchestrating a 50-animation cinematic demo, it is infrastructure.

The agent reality

Five demos across the site, roughly 4,000 lines of choreography code. I wrote maybe 60% of that. An AI agent wrote the other 40%.

My 60% was the directing: deciding what to zoom into, choosing eases after watching both options, noticing the cursor felt dead during zooms, saying "the zoom-out is 200ms too slow." The agent cannot watch the animation and feel that something is off.

The agent's 40% was the implementation: roughly 1,500 lines of timeline code, DOM elements, click helpers, reset blocks. Pattern-based work following rules I defined in a skill file. The patterns are stable enough that the output is consistent across different demos. But the agent does not know when something looks wrong. That part is still mine.

Performance

The web clipper demo has roughly 50 .to() calls, but only 3-5 are actively tweening on any given frame. Every .to() writes to transform or opacity, both compositor-friendly. The full post has a live stress test with an FPS counter where you can push from 8 to 48 concurrent elements and watch the frame rate.
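If you want to check the concurrency number for your own timeline, counting overlaps is enough. The sketch below works on plain { start, duration } objects; maxConcurrent is an illustrative helper, and the sample beats are loosely based on the zoom-in snippet earlier in this post, not measured from the real demo. On a live timeline you could likely collect the same numbers from each child tween's startTime() and duration().

```javascript
// Illustrative helper: the largest number of tweens running at the same moment.
// Each beat is { start, duration } in seconds.
function maxConcurrent(beats) {
  const events = [];
  for (const { start, duration } of beats) {
    events.push([start, 1], [start + duration, -1]);
  }
  // Sort by time; at the same instant, handle ends (-1) before starts (+1).
  events.sort((a, b) => a[0] - b[0] || a[1] - b[1]);

  let active = 0;
  let peak = 0;
  for (const [, delta] of events) {
    active += delta;
    peak = Math.max(peak, active);
  }
  return peak;
}

const sampleBeats = [
  { start: 0, duration: 0.65 },    // camera zoom-in
  { start: 0, duration: 0.65 },    // cursor drift
  { start: 0.35, duration: 0.16 }, // context menu fade
  { start: 0.65, duration: 0.55 }, // camera pan
];
const peakActive = maxConcurrent(sampleBeats);
```

For these sample beats the peak is three tweens at once, even though the list has four entries.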

The updated math

Approach	Compressed	Camera?	Indexable?	Accessible?	Themeable?
GSAP (flat, part one)	~37 KB	No	Yes	Yes	Yes
GSAP (pan-zoom, part two)	~40 KB	Yes	Yes	Yes	Yes
15s MP4 (H.264)	2-4 MB	Baked in	No	No	No
15s WebM (VP9)	1-2 MB	Baked in	No	No	No
15s GIF	8-15 MB	Baked in	No	No	No

Adding the camera added roughly 3 KB to the bundle. All five production demos across costumary.com ship under 80 KB gzipped combined. The equivalent in video would be 15-20 MB.

Part one was proof you could replace a video. Part two is the part where it starts to feel directed instead of recorded.

The upfront cost is real. The ongoing cost is near zero. Updates are code changes, not re-recordings.

Full post with interactive demos: spanthi.com/blog/gsap-choreography-part-2

Production examples: costumary.com | costumary.com/web-clipper

Open source agent skill: gsap-choreography