Telegram video avatars have a strict spec: 800x800 px, H.264 codec, max 10 seconds, max 2MB, no audio track. Most Android phones shoot H.264 by default, so their clips just work. iPhones have defaulted to HEVC (H.265) since iOS 11. When you upload an iPhone clip as your video avatar, Telegram shows a spinner, then nothing. No error message. The video just doesn't set. You try again, same thing.
That silent failure sent me down a rabbit hole building @LiveAvaBot.
What the Telegram Spec Actually Requires
The video profile spec (from Telegram's API docs, confirmed by trial and error):
- Codec: H.264 (libx264), yuv420p pixel format
- Resolution: 800x800 px, square
- Duration: 10 seconds max
- File size: 2MB max
- Audio: none, strip it entirely
HEVC violates the first requirement. Telegram's client doesn't decode HEVC for avatar playback, and it doesn't tell you why the upload failed.
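As a rough illustration, the checklist above can be written as a validation function. This is a sketch, not Telegram's own validation: `VideoInfo` and `check_avatar_spec` are hypothetical names, the field values would come from something like ffprobe in practice, and "2MB" is assumed to mean 2 MiB.

```python
from dataclasses import dataclass
from typing import List

MAX_DURATION_S = 10.0
MAX_SIZE_BYTES = 2 * 1024 * 1024  # assumption: "2MB" read as 2 MiB
SIDE_PX = 800


@dataclass
class VideoInfo:
    """Hypothetical container for values you would read from ffprobe."""
    codec: str        # e.g. "h264" or "hevc"
    pix_fmt: str      # e.g. "yuv420p"
    width: int
    height: int
    duration_s: float
    size_bytes: int
    has_audio: bool


def check_avatar_spec(info: VideoInfo) -> List[str]:
    """Return human-readable violations; an empty list means the file passes."""
    problems = []
    if info.codec != "h264":
        problems.append(f"codec is {info.codec}, expected h264")
    if info.pix_fmt != "yuv420p":
        problems.append(f"pixel format is {info.pix_fmt}, expected yuv420p")
    if (info.width, info.height) != (SIDE_PX, SIDE_PX):
        problems.append(f"resolution is {info.width}x{info.height}, expected 800x800")
    if info.duration_s > MAX_DURATION_S:
        problems.append(f"duration is {info.duration_s:.1f}s, max is 10s")
    if info.size_bytes > MAX_SIZE_BYTES:
        problems.append(f"size is {info.size_bytes} bytes, max is {MAX_SIZE_BYTES}")
    if info.has_audio:
        problems.append("audio track present, expected none")
    return problems
```

A typical portrait iPhone clip fails several checks at once, which is why a single re-encode pass that fixes everything is simpler than patching one property at a time.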
How ffmpeg Solves It
The conversion pipeline I landed on:
- Detect crop bounds. Most phone videos have letterbox or pillarbox black bars after rotation. cropdetect finds the actual content rectangle.
- Crop and scale to 800x800. For non-square source video, one axis gets stretched. For an 800x800 looping avatar, nobody notices.
- Re-encode to H.264 yuv420p. libx264 with CRF 28 keeps quality reasonable while hitting the 2MB ceiling. If the output is still over 2MB (high-motion clips), bump CRF to 32 and re-encode.
- Strip audio. -an flag.
- faststart. Moves the moov atom to the front so the file plays before it fully loads on mobile.
Two passes: first run cropdetect, parse the output, then encode with the detected crop filter.
# Pass 1: detect crop
ffmpeg -i input.mov -vf cropdetect=limit=24:round=2:reset=0 \
-f null - 2>&1 | grep "crop=" | tail -1
# Pass 2: encode (fill in crop values from pass 1)
ffmpeg -i input.mov \
-vf "crop=1080:1080:0:60,scale=800:800:force_original_aspect_ratio=disable,format=yuv420p" \
-c:v libx264 -crf 28 -preset fast \
-t 10 \
-an \
-movflags +faststart \
output.mp4
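The "parse the output" step between the two passes could look roughly like the following. It is a sketch under assumptions: `parse_last_crop` and `build_filter` are hypothetical helper names, and the regex relies on cropdetect log lines ending in `crop=w:h:x:y`, as the grep in pass 1 does.

```python
import re
from typing import Optional, Tuple

# Matches the "crop=w:h:x:y" suffix that cropdetect writes to stderr.
CROP_RE = re.compile(r"crop=(\d+):(\d+):(\d+):(\d+)")


def parse_last_crop(ffmpeg_stderr: str) -> Optional[Tuple[int, int, int, int]]:
    """Return (w, h, x, y) from the last crop= line, or None if there is none."""
    matches = CROP_RE.findall(ffmpeg_stderr)
    if not matches:
        return None
    w, h, x, y = (int(v) for v in matches[-1])
    return w, h, x, y


def build_filter(crop: Tuple[int, int, int, int], side: int = 800) -> str:
    """Build the -vf string for pass 2 from the detected crop rectangle."""
    w, h, x, y = crop
    return (
        f"crop={w}:{h}:{x}:{y},"
        f"scale={side}:{side}:force_original_aspect_ratio=disable,"
        "format=yuv420p"
    )
```

Taking the last match mirrors the `tail -1` in the shell version. When no crop line is found, falling back to scaling without a crop filter is one reasonable choice.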
The -t 10 flag keeps the first 10 seconds and drops everything after. A smarter "find the best 10-second window" feature is on the list, but keeping the opening 10 seconds covers 95% of use cases.
The aiogram 3 Handler
The bot accepts video files and GIFs. Here's the stripped-down handler:
    from aiogram import Router, F
    from aiogram.types import Message, FSInputFile
    import asyncio, tempfile, os

    router = Router()

    @router.message(F.video | F.animation | F.document)
    async def handle_video(message: Message):
        status = await message.answer("Converting...")
        file_obj = message.video or message.animation or message.document
        if file_obj is None:
            return
        with tempfile.TemporaryDirectory() as tmpdir:
            src = os.path.join(tmpdir, "input")
            dst = os.path.join(tmpdir, "output.mp4")
            # Download the original upload into the temp directory
            file_info = await message.bot.get_file(file_obj.file_id)
            await message.bot.download_file(file_info.file_path, src)
            # Run the blocking ffmpeg call in a worker thread
            ok, err = await asyncio.get_running_loop().run_in_executor(
                None, convert_to_avatar, src, dst
            )
            if not ok:
                await status.edit_text(f"Failed: {err}")
                return
            # aiogram 3 uploads local files via FSInputFile, not an open file handle.
            # Send while still inside the with block, before the directory is removed.
            await message.answer_document(
                FSInputFile(dst, filename="avatar.mp4"),
                caption="Set as video avatar: Profile > Edit > Set Video"
            )
        await status.delete()
A few things worth noting:
- answer_document not answer_video: sending as a video triggers Telegram's transcoder, which re-encodes and breaks the spec. Document skips that.
- Run blocking ffmpeg in executor: ffmpeg is a subprocess call. Wrapping it in run_in_executor keeps the event loop free for concurrent users.
- Temp directory: cleans up automatically on with block exit, even if an exception fires.
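To make the executor point concrete, here is a small self-contained demonstration. It uses `time.sleep` as a stand-in for the blocking ffmpeg call. `blocking_job`, `ticker`, and `demo` are hypothetical names for this illustration only.

```python
import asyncio
import time


def blocking_job(seconds: float) -> str:
    """Stand-in for a blocking ffmpeg subprocess call."""
    time.sleep(seconds)
    return "done"


async def ticker(counter: list, interval: float = 0.01) -> None:
    """Bump a counter for as long as the event loop keeps scheduling this task."""
    while True:
        counter[0] += 1
        await asyncio.sleep(interval)


async def demo(seconds: float = 0.2):
    counter = [0]
    tick_task = asyncio.create_task(ticker(counter))
    loop = asyncio.get_running_loop()
    # The blocking call runs in the default thread pool; the loop stays responsive.
    result = await loop.run_in_executor(None, blocking_job, seconds)
    tick_task.cancel()
    return result, counter[0]
```

Running `asyncio.run(demo())` returns the job result together with a tick count well above one, showing that other coroutines kept running during the blocking call. Calling `blocking_job` directly inside the coroutine would freeze the ticker for the whole duration.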
The convert_to_avatar function is a plain Python function calling subprocess.run with the ffmpeg args above, returning (bool, str).
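A minimal sketch of what such a function might look like follows. It is not the bot's actual code: the `crop` and `ffmpeg` parameters, the `build_args` helper, the 120-second timeout, and the error strings are all assumptions added for illustration, and the CRF fallback is left out to keep it short.

```python
import subprocess
from typing import List, Optional, Tuple

FFMPEG = "/usr/bin/ffmpeg"  # assumption: where `apt install ffmpeg` puts it


def build_args(src: str, dst: str, crop: Optional[str] = None,
               crf: int = 28, ffmpeg: str = FFMPEG) -> List[str]:
    """Assemble the pass-2 command; `crop` is a 'w:h:x:y' string from cropdetect."""
    filters = []
    if crop:
        filters.append(f"crop={crop}")
    filters += ["scale=800:800:force_original_aspect_ratio=disable", "format=yuv420p"]
    return [
        ffmpeg, "-y", "-i", src,
        "-vf", ",".join(filters),
        "-c:v", "libx264", "-crf", str(crf), "-preset", "fast",
        "-t", "10", "-an", "-movflags", "+faststart",
        dst,
    ]


def convert_to_avatar(src: str, dst: str, crop: Optional[str] = None,
                      ffmpeg: str = FFMPEG) -> Tuple[bool, str]:
    """Run ffmpeg and return (ok, error_message)."""
    try:
        proc = subprocess.run(
            build_args(src, dst, crop, ffmpeg=ffmpeg),
            capture_output=True, text=True, errors="replace", timeout=120,
        )
    except FileNotFoundError:
        return False, f"ffmpeg not found at {ffmpeg}"
    except subprocess.TimeoutExpired:
        return False, "ffmpeg timed out"
    if proc.returncode != 0:
        # Report only the last stderr line to keep the chat message short.
        lines = proc.stderr.strip().splitlines()
        return False, lines[-1] if lines else "ffmpeg failed"
    return True, ""
```

Passing the arguments as a list avoids shell quoting issues with user-supplied file names, and the two-argument call `convert_to_avatar(src, dst)` from the handler still works because the extra parameters have defaults.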
What I Shipped
I wrapped this in a production bot at https://t.me/LiveAvaBot?start=devto_article_20260605. It's been running since early 2026 and has 114 users. The stack:
- aiogram 3 for async handlers
- ffmpeg system binary (not a Python wrapper)
- SQLite for per-user state and rate limiting
- systemd for process management, no Docker, $6/month Hetzner box
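For the rate-limiting part, a sliding-window counter in SQLite is one simple option. The sketch below is an assumption about how it could be done, not the bot's real schema: the `conversions` table, `init_db`, `allow_request`, and the default limit of 5 per hour are all hypothetical.

```python
import sqlite3
import time
from typing import Optional


def init_db(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) table that logs one row per accepted request."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS conversions ("
        "user_id INTEGER NOT NULL, ts REAL NOT NULL)"
    )


def allow_request(conn: sqlite3.Connection, user_id: int, limit: int = 5,
                  window_s: float = 3600.0, now: Optional[float] = None) -> bool:
    """Return True and record the request if the user is under the limit."""
    now = time.time() if now is None else now
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM conversions WHERE user_id = ? AND ts > ?",
        (user_id, now - window_s),
    ).fetchone()
    if count >= limit:
        return False
    conn.execute(
        "INSERT INTO conversions (user_id, ts) VALUES (?, ?)", (user_id, now)
    )
    conn.commit()
    return True
```

The `now` parameter exists only to make the function easy to test. In a handler you would call `allow_request(conn, message.from_user.id)` before starting the download.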
The bot handles iPhone HEVC clips, Android MP4s, GIFs (Telegram sends these as animation), and forwarded videos from other chats. Files over 50MB are rejected before downloading.
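The size pre-check can be done from the message metadata alone, since aiogram's video, animation, and document objects expose an optional `file_size`. A sketch, with `size_rejection` as a hypothetical helper name and 50MB read as 50 MiB:

```python
from typing import Optional

MAX_INPUT_BYTES = 50 * 1024 * 1024  # assumption: "50MB" read as 50 MiB


def size_rejection(file_size: Optional[int],
                   limit: int = MAX_INPUT_BYTES) -> Optional[str]:
    """Return a rejection message if the file is too large, else None."""
    if file_size is None:
        # Telegram does not always report a size; let the download step decide.
        return None
    if file_size > limit:
        return (
            f"File is {file_size / 1048576:.1f} MB; "
            f"the limit is {limit // 1048576} MB."
        )
    return None
```

In the handler this would run on `file_obj.file_size` before `get_file`, so an oversized upload never costs bandwidth or disk.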
Built by me. @liveavabot
Edge Cases and What I Learned
Vertical video from iPhone: after iOS applies rotation metadata, ffmpeg sees a portrait frame. The scale=800:800:force_original_aspect_ratio=disable stretches it to square. Intentional for avatars.
GIFs with transparency: Telegram's animation type is actually an MP4 under the hood. No special handling needed.
The 2MB ceiling is tight: a 10-second clip at 30fps with moderate motion will blow past 2MB at CRF 28. The CRF 32 fallback helps, but I've seen cases needing CRF 36. Quality is noticeably soft at that point, but it's a working avatar.
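The fallback can be written as a small ladder that stops at the first CRF whose output fits. This is a sketch: `encode_under_limit` is a hypothetical name, `encode` is a caller-supplied callback that runs ffmpeg at a given CRF and returns the output path, and the ladder 28, 32, 36 simply follows the values mentioned above.

```python
import os
from typing import Callable, Optional, Sequence

MAX_AVATAR_BYTES = 2 * 1024 * 1024  # assumption: "2MB" read as 2 MiB


def encode_under_limit(encode: Callable[[int], str],
                       crfs: Sequence[int] = (28, 32, 36),
                       limit: int = MAX_AVATAR_BYTES) -> Optional[int]:
    """Try each CRF in order; return the first one whose output fits, else None."""
    for crf in crfs:
        path = encode(crf)  # hypothetical callback: re-encodes and returns the file path
        if os.path.getsize(path) <= limit:
            return crf
    return None
```

Returning `None` lets the caller tell the user that the clip could not be squeezed under the limit instead of handing back a file Telegram will silently reject.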
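Telegram's upload limit is separate from the 2MB avatar limit: you can receive a 50MB video from a user, process it, and send back a 1.8MB output. The limits operate at different layers. One caveat: the hosted Bot API only lets bots download files up to 20MB through getFile, so accepting larger inputs requires running a local Bot API server.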
ffmpeg must be on PATH or specified by absolute path. On Ubuntu, apt install ffmpeg puts it at /usr/bin/ffmpeg. I hardcode the path in production to avoid surprises if someone runs the bot inside a venv with a different PATH.
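One way to combine a hardcoded path with a PATH fallback is to resolve the binary once at startup and fail loudly if it is missing. A sketch, with `resolve_binary` as a hypothetical helper name:

```python
import os
import shutil


def resolve_binary(preferred: str = "/usr/bin/ffmpeg", name: str = "ffmpeg") -> str:
    """Return the preferred absolute path if usable, else whatever PATH offers."""
    if os.path.isfile(preferred) and os.access(preferred, os.X_OK):
        return preferred
    found = shutil.which(name)
    if found is None:
        raise FileNotFoundError(f"{name} not found at {preferred} or on PATH")
    return found
```

Calling this at import time means a misconfigured host fails when the service starts, not when the first user uploads a clip.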
What's next: a preview step that pulls just the first few seconds via partial ffmpeg read, so users can confirm the crop looks right before getting the final file. ffmpeg is doing the heavy lifting here; I just wrote the wrapper and wired it to Telegram.