DEV Community

liveavabot

How to Make a Telegram Video Avatar That Actually Uploads

I spent a weekend last month trying to set a video as my Telegram profile picture from my iPhone. Nothing worked. The Telegram app would let me pick the file, show a tiny progress bar, then silently do nothing. No error, no toast, no log. Just the old static avatar staring back at me.

It turns out Telegram has a very narrow spec for video avatars, and that spec is not documented anywhere obvious. When your file breaks any of the rules, the upload gets rejected without telling you why. iPhone's default HEVC codec already breaks the codec rule, before you even get to size or duration.

This is the story of how I worked it out and wrapped it in a bot.

What Telegram Actually Wants

After enough trial, error, and reading TDLib source, the real spec for a video avatar is:

  • Dimensions: exactly 800x800 pixels, square.
  • Codec: H.264 (libx264). No HEVC, no VP9, no AV1.
  • Pixel format: yuv420p.
  • Duration: at most 10.0 seconds.
  • File size: at most 2 MB.
  • Audio: no audio track at all.
  • Container: MP4 with the moov atom at the start (faststart).
  • Frame rate: anything reasonable, but 30 fps is safest.

Miss any one of these and the client refuses to upload. The error path is "do nothing". Great UX.
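
Since the client won't tell you which rule you broke, it helps to check a file yourself before uploading. Below is a rough sketch, not anything Telegram ships: it takes the parsed JSON that ffprobe prints and lists which of the rules above are violated. The function name check_avatar_spec and the constants are my own, and I'm assuming "2 MB" means 2 * 1024 * 1024 bytes, which matches the size check in the bot handler later in this post. It does not check frame rate or faststart.

```python
MAX_DURATION_S = 10.0
MAX_SIZE_BYTES = 2 * 1024 * 1024  # assumption: "2 MB" is treated as 2 MiB


def check_avatar_spec(probe: dict) -> list:
    """Return human-readable violations; an empty list means the file passes.

    `probe` is the dict parsed from ffprobe's JSON output
    (the "streams" and "format" sections).
    """
    problems = []
    streams = probe.get("streams", [])
    video = [s for s in streams if s.get("codec_type") == "video"]
    audio = [s for s in streams if s.get("codec_type") == "audio"]

    if len(video) != 1:
        problems.append(f"expected exactly one video stream, found {len(video)}")
    else:
        v = video[0]
        if (v.get("width"), v.get("height")) != (800, 800):
            problems.append(f"dimensions are {v.get('width')}x{v.get('height')}, not 800x800")
        if v.get("codec_name") != "h264":
            problems.append(f"codec is {v.get('codec_name')}, not h264")
        if v.get("pix_fmt") != "yuv420p":
            problems.append(f"pixel format is {v.get('pix_fmt')}, not yuv420p")

    if audio:
        problems.append("audio track present")

    fmt = probe.get("format", {})
    # ffprobe reports duration and size as strings, so convert before comparing.
    if float(fmt.get("duration", 0)) > MAX_DURATION_S:
        problems.append(f"duration is {fmt.get('duration')} s, over {MAX_DURATION_S} s")
    if int(fmt.get("size", 0)) > MAX_SIZE_BYTES:
        problems.append(f"size is {fmt.get('size')} bytes, over {MAX_SIZE_BYTES}")

    return problems
```

To feed it, run ffprobe -v error -show_streams -show_format -of json output.mp4 and pass the result of json.loads on its stdout. A raw iPhone clip typically trips several checks at once, which is exactly why the upload fails so quietly.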

Why iPhone Breaks This By Default

Modern iPhones record in HEVC (H.265) to save space. The .mov files they produce are gorgeous, but Telegram won't touch them as an avatar. You also get:

  • A landscape or portrait aspect ratio, not square.
  • An audio track you didn't ask for.
  • A file size that's often well above 2 MB.
  • An MOV container, not MP4.

So before anything else, you need a transcode step that handles codec, container, crop, scale, audio strip, and faststart in one pass.

The ffmpeg Command That Solves It

Here is the command I landed on. It takes any input video and produces a Telegram-legal MP4.

ffmpeg -i input.mov \
  -t 10 \
  -vf "crop=min(iw\,ih):min(iw\,ih),scale=800:800,fps=30" \
  -c:v libx264 \
  -preset medium \
  -pix_fmt yuv420p \
  -b:v 900k \
  -an \
  -movflags +faststart \
  output.mp4

A few notes on what each flag does:

  • -t 10 caps duration at 10 seconds. Anything longer gets truncated.
  • crop=min(iw,ih):min(iw,ih) center-crops the input to a square based on the smaller dimension. Vertical iPhone videos get the top and bottom chopped, horizontal videos get the sides chopped.
  • scale=800:800 resizes the square to exactly 800x800. Telegram is strict about this.
  • fps=30 normalizes frame rate. 60 fps clips work too, but 30 keeps bitrate predictable.
  • -c:v libx264 -preset medium -pix_fmt yuv420p gives you a Telegram-compatible H.264 stream. yuv420p is the most widely supported pixel format for H.264 playback, and it is the one Telegram accepts here.
  • -b:v 900k aims for around 900 kbit/s video bitrate. A 10-second clip lands near 1.1 MB, well under the 2 MB cap.
  • -an drops audio. Telegram refuses files with an audio track.
  • -movflags +faststart moves the moov atom to the front of the file, so a player can start decoding before the whole file has downloaded.
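
If you want to see which part of the frame survives the crop, the geometry is easy to reproduce. This is a small sketch that mirrors crop=min(iw,ih):min(iw,ih) using the crop filter's default centered offsets, (in_w - out_w) / 2 and (in_h - out_h) / 2. The function name is mine, and the integer division is a simplification of what ffmpeg does internally.

```python
def center_square_crop(width: int, height: int) -> tuple:
    """Return (out_w, out_h, x, y) for a centered square crop.

    The square side is the smaller input dimension; x and y are the
    offsets of the crop's top-left corner inside the source frame.
    """
    side = min(width, height)
    x = (width - side) // 2   # pixels trimmed from the left (and right)
    y = (height - side) // 2  # pixels trimmed from the top (and bottom)
    return side, side, x, y


# Portrait 1080x1920: x is 0, y is 420, so the top and bottom are trimmed.
print(center_square_crop(1080, 1920))
# Landscape 1920x1080: x is 420, y is 0, so the sides are trimmed.
print(center_square_crop(1920, 1080))
```

If your subject sits off-center, you can pass explicit x and y values to the crop filter instead of relying on the defaults.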

For longer source videos I sometimes do a cropdetect pre-pass to find the actual content rectangle, but for avatars the center crop is usually fine.
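
For the pre-pass, the cropdetect filter writes suggested crop=w:h:x:y values to ffmpeg's stderr as it analyses frames. Here is a minimal sketch of how I would pick one out of that output, assuming you have already captured stderr as text. The helper name and the "most frequent suggestion wins" rule are my own choices, not something ffmpeg prescribes.

```python
import re
from collections import Counter
from typing import Optional, Tuple

# Matches the crop=w:h:x:y suggestion that cropdetect appends to its log lines.
CROP_RE = re.compile(r"crop=(\d+):(\d+):(\d+):(\d+)")


def most_common_crop(stderr_text: str) -> Optional[Tuple[int, int, int, int]]:
    """Return the most frequently suggested (w, h, x, y), or None if absent."""
    matches = CROP_RE.findall(stderr_text)
    if not matches:
        return None
    (w, h, x, y), _count = Counter(matches).most_common(1)[0]
    return int(w), int(h), int(x), int(y)
```

The stderr text would come from something like ffmpeg -i input.mov -vf cropdetect -f null - run before the real encode. The result is usually not square, so you would still apply the square crop on top of it.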

Wrapping It in an aiogram 3 Bot

Once the ffmpeg side worked, I wrapped it as a Telegram bot so I could send any video and get a legal avatar file back. Here is the handler.

from aiogram import Router, F
from aiogram.types import Message, FSInputFile
import asyncio
import tempfile
from pathlib import Path

router = Router()

FFMPEG_CMD = [
    "ffmpeg", "-y", "-i", "{inp}",
    "-t", "10",
    "-vf", "crop=min(iw\\,ih):min(iw\\,ih),scale=800:800,fps=30",
    "-c:v", "libx264", "-preset", "medium",
    "-pix_fmt", "yuv420p", "-b:v", "900k",
    "-an", "-movflags", "+faststart",
    "{out}",
]

@router.message(F.video | F.animation | F.document)
async def handle_video(message: Message):
    # The same clip can arrive as a video, an animation (GIF), or a document.
    file = message.video or message.animation or message.document
    if not file:
        return

    # Work inside a temp dir so input and output are removed on exit.
    with tempfile.TemporaryDirectory() as td:
        inp = Path(td) / "in.bin"
        out = Path(td) / "out.mp4"
        await message.bot.download(file, destination=inp)

        # Fill the {inp}/{out} placeholders; the other tokens pass through unchanged.
        cmd = [c.format(inp=str(inp), out=str(out)) for c in FFMPEG_CMD]
        proc = await asyncio.create_subprocess_exec(
            *cmd,
            stdout=asyncio.subprocess.DEVNULL,
            stderr=asyncio.subprocess.PIPE,
        )
        _, err = await proc.communicate()
        if proc.returncode != 0:
            # ffmpeg stderr is not guaranteed to be valid UTF-8, so decode leniently.
            tail = err.decode(errors="replace")[-400:]
            await message.answer(f"ffmpeg failed: {tail}")
            return

        # -b:v is a target, not a cap, so verify the real size before replying.
        size_mb = out.stat().st_size / (1024 * 1024)
        if size_mb > 2.0:
            await message.answer(f"output is {size_mb:.2f} MB, over Telegram cap")
            return

        await message.answer_video(FSInputFile(out))

A few details worth pointing out:

  • The handler accepts F.video | F.animation | F.document because Telegram routes the same MP4 as different types depending on how the client uploaded it. GIFs come in as animations. Files sent with "send as file" come in as documents.
  • I download into a TemporaryDirectory and let the context manager clean up. No leftover files between requests.
  • I check size_mb after encoding because the 900k bitrate target is an aim, not a guarantee. A high-motion source can overshoot.
  • Errors get returned to the user with the tail of ffmpeg's stderr. When a clip is broken the message is usually enough to know why.
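
When the size check does fail, the obvious follow-up is to retry at a lower bitrate. The arithmetic is just bits per second times seconds, divided by eight. Here is a small sketch of the budget calculation. Both function names and the 10% headroom for container overhead and encoder overshoot are assumptions of mine, not measured values.

```python
def estimated_size_bytes(bitrate_kbps: float, duration_s: float) -> int:
    """Approximate video payload size for an average bitrate and duration."""
    return int(bitrate_kbps * 1000 * duration_s / 8)


def max_video_bitrate_kbps(
    duration_s: float,
    cap_bytes: int = 2 * 1024 * 1024,
    headroom: float = 0.9,
) -> int:
    """Largest average bitrate (kbit/s) expected to stay under cap_bytes.

    headroom < 1.0 reserves a slice of the cap for container overhead
    and for the encoder overshooting its target.
    """
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    usable_bits = cap_bytes * 8 * headroom
    return int(usable_bits / duration_s / 1000)


# 900 kbit/s over 10 s is 1,125,000 bytes, the "near 1.1 MB" figure above.
print(estimated_size_bytes(900, 10))
# A full 10-second clip has room for roughly 1500 kbit/s under these assumptions.
print(max_video_bitrate_kbps(10))
```

So 900k leaves a comfortable margin for a 10-second clip. If you need a harder ceiling than an average target, ffmpeg's -maxrate and -bufsize options are worth a look, though I have not needed them for this bot.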

Packaging It as @liveavabot

I rolled all of this into a small public bot so I would never have to think about it again. Send it any video or GIF from your phone, get back a Telegram-legal video avatar. iPhone HEVC, Android MP4, screen recordings, downloaded clips, all of them go through the same pipeline.

You can try it here: https://t.me/LiveAvaBot?start=devto_article_20260617

It runs on one small VPS, the queue is async, and average turnaround is two to three seconds for a typical 5-second iPhone clip.

What I Got Wrong The First Time

Some edge cases I hit during the build that might save you time:

  • Square crops on already-square sources. If the input is already square, crop=min(iw,ih):min(iw,ih) is a no-op and the scale step does the rest. Don't add a conditional; it isn't needed.
  • Audio tracks in silent GIFs. GIFs uploaded as Telegram animations sometimes carry an empty audio track that's been re-encoded somewhere. Always pass -an, even when you think the input is silent.
  • MOV vs MP4. Telegram is happy to accept MOV as input from a user, but the upload payload back has to be MP4. Always use .mp4 as your output extension.
  • The faststart flag matters. Without -movflags +faststart, some Telegram clients (especially older Android builds) refuse to even preview the result. The moov atom needs to be at the front of the file.
  • 4K input is fine but slow. ffmpeg will downscale 4K to 800x800 happily, but a 10-second 4K clip takes about 8 seconds to encode on my VPS. Worth showing a "processing" message to the user while it runs.
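
The faststart point is easy to verify without a player. An MP4 file is a sequence of boxes, each starting with a 4-byte big-endian size and a 4-byte type, so you can walk the top level and see whether moov comes before mdat. This is a minimal sketch under that assumption. The function names are mine, and it only looks at top-level boxes.

```python
import struct
from typing import BinaryIO, Iterator, Optional, Tuple


def top_level_boxes(f: BinaryIO) -> Iterator[Tuple[str, int]]:
    """Yield (box_type, offset) for each top-level box in an MP4 stream."""
    offset = 0
    while True:
        f.seek(offset)
        header = f.read(8)
        if len(header) < 8:
            return  # end of file
        size, box_type = struct.unpack(">I4s", header)
        if size == 1:
            # A size of 1 means a 64-bit size follows the type field.
            ext = f.read(8)
            if len(ext) < 8:
                return
            size = struct.unpack(">Q", ext)[0]
        if size != 0 and size < 8:
            return  # malformed box, stop rather than loop forever
        yield box_type.decode("latin-1"), offset
        if size == 0:
            return  # a size of 0 means the box runs to the end of the file
        offset += size


def is_faststart(f: BinaryIO) -> Optional[bool]:
    """True if moov precedes mdat, False if it follows, None if either is missing."""
    order = [box_type for box_type, _ in top_level_boxes(f)]
    if "moov" not in order or "mdat" not in order:
        return None
    return order.index("moov") < order.index("mdat")
```

Open the encoded file in binary mode and pass the handle in, for example with open("output.mp4", "rb") as f: print(is_faststart(f)). It makes a cheap sanity check to run in the bot before sending the result back.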

That's the whole pipeline. ffmpeg does the heavy lifting, aiogram routes the events, the bot just glues them together.

Built by me, @liveavabot.
