Yuuki Yamashita

Posted on Jun 2

Generating animated LINE-style chat slides with python-pptx + raw XML (and shipping it on Vercel)

#showdev #automation #python #productivity

I was putting together a talk and had one of those half-baked ideas that you can't shake off: what if I showed an iPhone with a LINE chat screen, and the messages popped in one by one, like a real conversation happening live? The problem is, building that by hand in PowerPoint sounds miserable — laying out every bubble, then setting up an entrance animation for each one, one at a time. So instead I built a little tool that takes a chunk of conversation text and spits out a .pptx, and then I turned it into a web app and put it on Vercel.

This post focuses on the three things that actually gave me trouble:

1) Building an iPhone / LINE-style UI out of nothing but shapes in python-pptx
2) Working around python-pptx's lack of animation support by hand-writing the timing XML and injecting it into the slide
3) Wiring up "conversation in, pptx out" with a static HTML front end and a Vercel serverless function (including the deploy that bit me)

Here's what came out of it:

And the slides it generates look like this:

The overall shape of it

It's nothing fancy. This is the whole file list:

.
├─ index.html        front end (input form + live preview)
├─ api/generate.py   conversation JSON → pptx (build_pptx) + handler
├─ requirements.txt  python-pptx
├─ vercel.json       builds (static + @vercel/python)
└─ dev_server.py     for local testing

All the real generation logic lives in Python (python-pptx), and the front end is plain HTML/CSS/JS. On Vercel, api/generate.py runs as a serverless function and index.html is served as a static file.

Building the iPhone / LINE UI out of shapes

Set the slide to a tall 9:16 aspect ratio, and from there it's just a matter of stacking rounded rectangles, ellipses, and text boxes.

SLIDE_W = Inches(7.5)
SLIDE_H = Inches(13.333)   # 7.5 : 13.333 = 9 : 16
prs.slide_width  = SLIDE_W
prs.slide_height = SLIDE_H

The stacking order goes: phone frame (black rounded rect) → screen (chat background) → green header → Dynamic Island → status bar (clock, signal, battery) → input bar. For something like the Dynamic Island, a black rectangle with its corner roundness (adjustment) cranked up to 0.5 already reads as the real thing.

isl = slide.shapes.add_shape(MSO_SHAPE.ROUNDED_RECTANGLE, ...)
isl.adjustments[0] = 0.5   # fully rounded into a "pill" shape

For the header and the input bar, I used ROUND_2_SAME_RECTANGLE, which rounds only the two corners on one side (the top two by default), so they don't fight with the rounded corners of the screen. The input bar is the same shape rotated 180° so its rounded side faces down.

One thing that quietly mattered: Japanese fonts. If you only set a:latin, Japanese text can fall back to some other font, so I push the same typeface into a:ea (East Asian) as well. That kept things consistent.

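from lxml import etree
from pptx.oxml.ns import qn

rPr = run._r.get_or_add_rPr()
for tag in ("a:latin", "a:ea", "a:cs"):
    el = rPr.find(qn(tag))
    if el is None:                            # test against None: a childless lxml element is falsy
        el = etree.SubElement(rPr, qn(tag))   # SubElement also attaches the new element to rPr
    el.set("typeface", "Hiragino Sans")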

Sizing the bubbles by guesswork

Each bubble gets a width and height estimated from its text, then placed. python-pptx can't measure rendered text, so I just approximate the display width as full-width characters = 1.0 and half-width = 0.5: short lines get a one-line width, longer ones wrap at a maximum width. Pretty crude, but it works.

def measure_bubble(text):
    char_w = BUBBLE_FONT_PT / 72.0           # one full-width char ≈ 0.208in
    cap = int((MAX_BUBBLE_W - PAD*2) / (char_w * 1.06))  # full-width chars per line
    ...
    if longest <= cap and "\n" not in text:
        width = longest * char_w + PAD*2 + 0.18   # slack so it doesn't wrap
        lines = 1
    else:
        width = MAX_BUBBLE_W
        lines = ceil(longest / cap)
    height = lines * LINE_H + PAD_Y*2 + 0.06
    return width, height, lines

At first I didn't add the slack (that trailing + 0.18) and used the exact width. The result: a short phrase like おつかれさま！ ("hey, nice work today!") would wrap onto two lines and look weirdly stretched. The actual rendered character width comes out a bit wider than my estimate, so giving it a little breathing room settled things down.
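The width rule (full-width = 1.0, half-width = 0.5) can be sketched with the standard library alone. This is an illustration rather than the tool's actual code: `display_units` is a hypothetical helper, and using `unicodedata.east_asian_width` to decide what counts as full-width is my assumption.

```python
import unicodedata


def display_units(text: str) -> float:
    """Return the width of the longest line, measured in full-width characters."""
    def line_units(line: str) -> float:
        # "F" (fullwidth) and "W" (wide) count as 1.0; everything else as 0.5.
        return sum(
            (1.0 if unicodedata.east_asian_width(ch) in ("F", "W") else 0.5
             for ch in line),
            0.0,
        )

    return max(line_units(line) for line in text.split("\n"))
```

Multiplying the result by the width of one full-width character gives an estimated text width in inches, to which padding and slack are then added.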

For the arrangement I copied real LINE and made "bottom-aligned" the default (newest message sits right above the input bar). I just sum up the height of the whole conversation first and set the start position to CHAT_BOTTOM - block_h.
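A minimal sketch of that bottom-alignment step, assuming a fixed vertical gap between bubbles. `bottom_aligned_tops` and the `gap` parameter are hypothetical names, and the units are whatever the caller uses (inches here).

```python
def bottom_aligned_tops(heights, chat_bottom, gap=0.08):
    """Return the top y of each bubble so the last bubble ends at chat_bottom."""
    if not heights:
        return []
    # Total height of the conversation block, including gaps between bubbles.
    block_h = sum(heights) + gap * (len(heights) - 1)
    y = chat_bottom - block_h
    tops = []
    for h in heights:
        tops.append(y)
        y += h + gap
    return tops
```

Because the start position is derived from the total height, adding or removing a message shifts every bubble while the newest one stays pinned above the input bar.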

The main event: no animation API, so write the XML yourself

This is where I got stuck the longest. python-pptx can create shapes and text just fine, but it gives you no way to touch animations (entrance effects and the like) through its API. So what do you do? Animations are stored as an OOXML element called <p:timing>, so I build that element myself and append it to the end of an already-generated slide.

PowerPoint animations are a tree of time nodes. Under mainSeq (the sequence that advances on click), you hang a <p:par> for each effect. What I wanted was "on click, bubbles appear one after another at 0.8s intervals," so:

The first effect is nodeType="clickEffect" (fires on click)
Every effect after that is nodeType="withEffect" with a delay, staggered in time
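For orientation, here is roughly the nesting PowerPoint writes around those effect nodes: a root node, the mainSeq, one click group, and inside it the individual effects. This is a sketch based on the OOXML structure, not the tool's code: `timing_skeleton` is a hypothetical helper, the ids are illustrative, and the optional `<p:bldLst>` that PowerPoint also writes is left out.

```python
import xml.etree.ElementTree as ET

P_NS = "http://schemas.openxmlformats.org/presentationml/2006/main"


def timing_skeleton(effects_xml: str) -> str:
    """Wrap already-built effect <p:par> nodes in the tmRoot / mainSeq / click-group nesting."""
    return f'''<p:timing xmlns:p="{P_NS}">
  <p:tnLst>
    <p:par>
      <p:cTn id="1" dur="indefinite" restart="never" nodeType="tmRoot">
        <p:childTnLst>
          <p:seq concurrent="1" nextAc="seek">
            <p:cTn id="2" dur="indefinite" nodeType="mainSeq">
              <p:childTnLst>
                <p:par>
                  <p:cTn id="3" fill="hold">
                    <p:stCondLst><p:cond delay="indefinite"/></p:stCondLst>
                    <p:childTnLst>
                      <p:par>
                        <p:cTn id="4" fill="hold">
                          <p:stCondLst><p:cond delay="0"/></p:stCondLst>
                          <p:childTnLst>{effects_xml}</p:childTnLst>
                        </p:cTn>
                      </p:par>
                    </p:childTnLst>
                  </p:cTn>
                </p:par>
              </p:childTnLst>
            </p:cTn>
            <p:prevCondLst><p:cond evt="onPrev" delay="0"><p:tgtEl><p:sldTgt/></p:tgtEl></p:cond></p:prevCondLst>
            <p:nextCondLst><p:cond evt="onNext" delay="0"><p:tgtEl><p:sldTgt/></p:tgtEl></p:cond></p:nextCondLst>
          </p:seq>
        </p:childTnLst>
      </p:cTn>
    </p:par>
  </p:tnLst>
</p:timing>'''
```

The `delay="indefinite"` on the click group is what makes it wait for a click. Declaring `xmlns:p` on the root matters too, since an XML parser rejects a fragment whose `p:` prefix is undeclared.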

There's also an "After Previous" option, but it works by waiting for the previous effect to finish, and the behavior felt flaky. So I leaned on absolute offsets of i × 800ms instead. That turned out to be both more predictable and, honestly, less work.
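The scheduling rule itself is small enough to sketch. `effect_schedule` and `STEP_MS` are hypothetical names. The point is that node types and delays are derived from the bubble count in one place, so they cannot drift apart from the bubbles.

```python
STEP_MS = 800  # gap between bubbles, matching the 0.8s interval


def effect_schedule(n_bubbles: int, step_ms: int = STEP_MS):
    """Return (node_type, delay_ms) per bubble: a click starts the first, the rest follow by offset."""
    return [
        ("clickEffect" if i == 0 else "withEffect", i * step_ms)
        for i in range(n_bubbles)
    ]
```

For three bubbles this yields delays of 0, 800, and 1600 ms, with only the first waiting on the click.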

The effect on each bubble is "Float In" — it fades in while drifting up slightly from below. I use <p:set> to make it visible, <p:anim> to move ppt_y (vertical position) from a touch below up to its real spot, and <p:animEffect filter="fade"> to fade it in, all running together.

def _effect_par(eid, spid, delay, node_type, grp):
    return f'''<p:par ...>
  <p:cTn id="{eid}" presetID="42" presetClass="entr" presetSubtype="8"
         fill="hold" grpId="{grp}" nodeType="{node_type}">
    <p:stCondLst><p:cond delay="{delay}"/></p:stCondLst>
    <p:childTnLst>
      <p:set> ... style.visibility=visible ... </p:set>
      <p:anim calcmode="lin" valueType="num">
        <p:cBhvr additive="base">
          <p:cTn id="{eid+2}" dur="{EFFECT_MS}" fill="hold"/>
          <p:tgtEl><p:spTgt spid="{spid}"/></p:tgtEl>
          <p:attrNameLst><p:attrName>ppt_y</p:attrName></p:attrNameLst>
        </p:cBhvr>
        <p:tavLst>
          <p:tav tm="0"><p:val><p:strVal val="ppt_y+0.04"/></p:val></p:tav>
          <p:tav tm="100000"><p:val><p:strVal val="ppt_y"/></p:val></p:tav>
        </p:tavLst>
      </p:anim>
      <p:animEffect transition="in" filter="fade"> ... </p:animEffect>
    </p:childTnLst>
  </p:cTn>
</p:par>'''

spid is shape.shape_id. For an incoming message I wanted the bubble and the avatar to appear together, so I give them the same grpId and the same delay. Once the <p:timing> string is assembled, I parse it with lxml and just append it to the slide.

from lxml import etree

timing = etree.fromstring(timing_xml.encode("utf-8"))   # the root element must declare xmlns:p
slide._element.append(timing)
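One caveat with a plain append, as far as I understand the schema: inside `<p:sld>`, `<p:timing>` is expected after `<p:clrMapOvr>` and `<p:transition>` but before `<p:extLst>`, so appending is only safe when the slide has no `<p:extLst>`. Below is a minimal sketch of an order-aware insert. It uses the standard library's ElementTree on a toy slide so it runs without python-pptx, and `insert_timing` is a hypothetical helper. With lxml the same idea would be `ext_lst.addprevious(timing)`.

```python
import xml.etree.ElementTree as ET

P_NS = "http://schemas.openxmlformats.org/presentationml/2006/main"


def insert_timing(sld: ET.Element, timing: ET.Element) -> None:
    """Place <p:timing> just before <p:extLst> if the slide has one, else at the end."""
    for index, child in enumerate(list(sld)):
        if child.tag == f"{{{P_NS}}}extLst":
            sld.insert(index, timing)
            return
    sld.append(timing)


# Toy slide containing only the children that matter for ordering.
sld = ET.fromstring(
    f'<p:sld xmlns:p="{P_NS}"><p:cSld/><p:clrMapOvr/><p:extLst/></p:sld>'
)
insert_timing(sld, ET.Element(f"{{{P_NS}}}timing"))
order = [child.tag.split("}")[1] for child in sld]
```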

A nice bonus: once a shape has an entrance effect, PowerPoint handles "keep it hidden until it fires" on its own. So I didn't have to write any code to hide the initial state myself.

One more note on checking my work. I was eyeballing layout by exporting to PDF with LibreOffice — but LibreOffice's PDF export only renders the final state of an animation. It's perfectly fine for catching layout problems, but to verify the motion itself I had to fall back to running an actual slideshow in PowerPoint/Keynote.

Turning it into a web app

Because I'd already factored the generation into a single build_pptx(data) -> bytes function, the web side only needed to POST the conversation as JSON. I wrote the serverless function with the standard-library BaseHTTPRequestHandler and return the bytes from build_pptx directly.

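import json
from http.server import BaseHTTPRequestHandler

class handler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly Content-Length bytes; an empty body is treated as {}
        length = int(self.headers.get("Content-Length") or 0)
        data = json.loads(self.rfile.read(length) or b"{}")
        pptx = build_pptx(data)   # the finished .pptx as bytes
        self.send_response(200)
        self.send_header("Content-Type",
            "application/vnd.openxmlformats-officedocument.presentationml.presentation")
        self.send_header("Content-Disposition", 'attachment; filename="line_chat.pptx"')
        self.send_header("Content-Length", str(len(pptx)))
        self.end_headers()
        self.wfile.write(pptx)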

Input is just one text area

I didn't get clever with the input UI — it's a single text area. The only rules are "L: is the other person, R: is you, a blank line starts a new slide." Keeping it this loose is actually what makes it pleasant: you just dump in whatever conversation pops into your head.

L: Hey, nice work today!
L: There's something I want to run by you
R: What's up?

L: This thing's been eating up my time lately
R: Oh, I totally get that

Parsing is a regex applied line by line. Any line without an L:/R: prefix gets joined onto the previous bubble with a newline (i.e. multi-line bubbles).

const m = /^([LRlr])[:：]\s?(.*)$/.exec(raw);
if (m) { last = {side: m[1].toUpperCase(), text: m[2]}; slide.push(last); }
else if (last) { last.text += "\n" + raw.trim(); }   // multi-line bubble
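For reference, here are the same rules as a Python sketch, which could be handy for a command-line entry point. `parse_conversation` is a hypothetical name, and the slide splitting on blank lines is my reading of the input rules, since the JavaScript excerpt does not show that part.

```python
import re

# L: / R: prefix (either case), ASCII or full-width colon, one optional space.
LINE_RE = re.compile(r"^([LRlr])[:：]\s?(.*)$")


def parse_conversation(text: str):
    """Turn conversation text into a list of slides, each a list of {side, text} dicts."""
    slides, current, last = [], [], None
    for raw in text.splitlines():
        if not raw.strip():
            # A blank line closes the current slide (if it has any bubbles).
            if current:
                slides.append(current)
            current, last = [], None
            continue
        m = LINE_RE.match(raw)
        if m:
            last = {"side": m.group(1).upper(), "text": m.group(2)}
            current.append(last)
        elif last is not None:
            # No prefix: continue the previous bubble on a new line.
            last["text"] += "\n" + raw.strip()
    if current:
        slides.append(current)
    return slides
```

Unprefixed text that appears before any bubble on a slide is ignored, mirroring the `else if (last)` guard in the JavaScript.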

A live preview in the browser

Having to download the file just to see the result is a miserable loop, so I rebuilt the same look in HTML/CSS and added a preview-and-play feature right in the browser. The colors and the bottom-alignment match the generated pptx, and for playback I just add CSS transitions (opacity and translateY) one after another with setTimeout. It ends up feeling about the same as the animation inside the pptx.

Two things that bit me on deploy

A requirements.txt makes Vercel think it's a "Python app"

My very first vercel --prod died immediately with this:

The pattern "api/generate.py" defined in `functions`
doesn't match any Serverless Functions inside the `api` directory.

The cause was the requirements.txt sitting at the project root. With that present, Vercel decides the whole project is a "Python app" and stops picking up api/generate.py as an individual function. I fixed it by spelling out builds in vercel.json, telling Vercel to build the static HTML and the Python function separately.

{
  "version": 2,
  "builds": [
    { "src": "index.html",      "use": "@vercel/static" },
    { "src": "api/generate.py", "use": "@vercel/python" }
  ],
  "routes": [
    { "src": "/api/generate", "dest": "/api/generate.py" },
    { "src": "/",             "dest": "/index.html" }
  ]
}

After that, python-pptx installed itself from requirements.txt, and both the function and the static serving worked exactly as I'd hoped.

You can't protect production on the free plan

Once it was live, I wanted to put it behind some access control, so I tried enabling ssoProtection (Vercel Authentication) via Vercel's REST API. This came back:

{ "code": "invalid_sso_protection",
  "message": "Vercel Authentication is not available on your plan for production deployments" }

Turns out the Hobby (free) plan doesn't let you put Vercel Authentication / Password Protection on production deployments (you need Pro). Preview deployments can be protected on the free plan. If you want to keep production hidden while staying free, you're looking at rolling your own gate — Edge Middleware, or Basic auth inside the Python function. For now I left it open.
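If I do add the Basic auth option later, the header check could look something like this sketch. `is_authorized` is a hypothetical helper, it covers a single shared user/password pair, and in practice the expected credentials would come from environment variables rather than source code.

```python
import base64
import hmac


def is_authorized(header_value, user: str, password: str) -> bool:
    """Check an HTTP Basic Authorization header value against one expected user/password."""
    if not header_value or not header_value.startswith("Basic "):
        return False
    try:
        token = header_value[len("Basic "):]
        decoded = base64.b64decode(token, validate=True).decode("utf-8")
    except ValueError:  # covers bad base64 and bad UTF-8
        return False
    got_user, sep, got_password = decoded.partition(":")
    if not sep:
        return False
    # compare_digest avoids leaking how many leading characters matched.
    user_ok = hmac.compare_digest(got_user.encode("utf-8"), user.encode("utf-8"))
    password_ok = hmac.compare_digest(got_password.encode("utf-8"), password.encode("utf-8"))
    return user_ok and password_ok
```

Inside do_POST this would be called with `self.headers.get("Authorization")`, and a failed check would return 401 with a `WWW-Authenticate: Basic` header instead of building the file.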

Wrapping up

After doing all this, the part I'm most glad I pushed through was writing the animation XML by hand. I'd nearly written off animations as impossible with python-pptx, but it turns out that hand-assembling a <p:timing> element and appending it is all it takes to get bubbles floating in, one by one, on a real PowerPoint. And the dead-simple withEffect + absolute delay approach was more than enough.

Factoring the generation into a single function also paid off more than I expected — moving from a CLI to a web app was basically copy-paste.

Being able to type out a conversation and get an animated chat slide back made prepping for talks noticeably less of a chore. The code is up on GitHub if you want to poke around — happy to hear what you'd build with it.

Top comments (1)

Echo • Jun 2

Great write-up — the python-pptx timing-XML workaround is the kind of trick worth saving. Three things from the "ship it on Vercel" half that I want to push on:

1) "Hand-write the timing XML and inject it" is right for the artifact, but the moment you change the count of bubbles, you have to recompute the timing offsets. The "happy path" is to generate both the slide and the timing in a single pass from the same source (the conversation text), so the offsets can never drift from the bubbles. If the offset-generation lives in a second file you'll get the classic "the animation is half a second off" bug a year from now.

2) Vercel serverless for "text in, .pptx out" has a quiet failure mode: the response is fully buffered, so a long conversation (~500 messages) can hit the function's response size limit before the user sees a download. Worth chunking the response (stream the .pptx) or capping the input length with a friendly error, not a 500. The article doesn't mention this — but the first user with a 200-message script will hit it.

3) "Conversation in, artifact out" is a much bigger category than chat slides. The cleanest way to extend this is to make the bubble-renderer pluggable (LINE → SMS → Discord → iMessage) and keep the timing + layout engine one level up. That way the next showdev post is a 30-line renderer swap, not another full rewrite.

The choice to lean on raw XML instead of "find a library that supports animations" is the right call. Animation libraries will rot faster than the Office XML.