The Bug That Started This
Last month a friend tried to set a video avatar in Telegram. He recorded a 7-second clip on his iPhone, opened the avatar dialog, picked the video. Telegram showed a spinner, then nothing. No error, no toast. The avatar stayed as a static photo.
The video was HEVC. Telegram's video avatar slot only accepts H.264, but instead of telling you that, the client just silently refuses. I dug into the spec, wrote a small ffmpeg pipeline, and wrapped it in an aiogram bot. That's @liveavabot.
What Telegram Actually Wants
The video-note format (used for round message videos and avatars) has hard constraints that aren't loud in the docs:
- Codec: H.264 (libx264). HEVC, VP9, AV1 all rejected.
- Container: MP4 with faststart (moov atom at front).
- Resolution: 800x800 square. Non-square gets letterboxed or rejected.
- Duration: max 10 seconds for avatars, 60 for video notes.
- Size: max 2 MB for avatars, 8 MB for notes.
- Pixel format: yuv420p. yuv422p and 10-bit will fail on some Android clients.
- Audio: must be absent. Not muted, absent. A silent track still causes rejections.
If any one of these is off, the upload either fails silently or the video plays broken on some clients but not others. Apple's HEVC defaults break three of them at once: wrong codec, wrong resolution (1080x1920 portrait), and an AAC track.
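As a rough pre-flight check, the list above can be encoded against the JSON that ffprobe prints (for example via ffprobe -v error -show_streams -show_format -of json). This is a minimal sketch, not part of the bot: the function name avatar_problems and the two constants are illustrative, and the limits are copied from the list above rather than from any official Telegram spec. It does not check faststart, which needs a look at the atom order instead.

```python
from typing import Any

# Limits taken from the list above (avatar case); assumptions, not an official spec.
MAX_DURATION_S = 10.0
MAX_SIZE_BYTES = 2 * 1024 * 1024


def avatar_problems(probe: dict[str, Any]) -> list[str]:
    """Return human-readable violations for a parsed ffprobe JSON document."""
    streams = probe.get("streams", [])
    fmt = probe.get("format", {})
    video = [s for s in streams if s.get("codec_type") == "video"]
    audio = [s for s in streams if s.get("codec_type") == "audio"]
    if not video:
        return ["no video stream"]

    v = video[0]
    problems: list[str] = []
    if v.get("codec_name") != "h264":
        problems.append(f"codec is {v.get('codec_name')}, expected h264")
    if (v.get("width"), v.get("height")) != (800, 800):
        problems.append(f"resolution is {v.get('width')}x{v.get('height')}, expected 800x800")
    if v.get("pix_fmt") != "yuv420p":
        problems.append(f"pixel format is {v.get('pix_fmt')}, expected yuv420p")
    if audio:
        problems.append("audio track present (must be absent, not just silent)")
    # ffprobe reports duration and size as strings inside "format".
    if float(fmt.get("duration", 0)) > MAX_DURATION_S:
        problems.append("longer than 10 seconds")
    if int(fmt.get("size", 0)) > MAX_SIZE_BYTES:
        problems.append("larger than 2 MB")
    return problems
```

Fed the probe of a typical portrait iPhone clip, this would flag the codec, the resolution, and the audio track in one go, which matches the three-at-once failure described above.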
The ffmpeg Pipeline
I went through a few iterations. The naive ffmpeg -i in.mov -c:v libx264 out.mp4 works for codec but ignores resolution, audio, and duration. Here's what landed in production:
ffmpeg -y -i input.mov \
  -t 10 \
  -vf "cropdetect=24:16:0,scale=800:800:force_original_aspect_ratio=increase,crop=800:800" \
  -c:v libx264 \
  -preset veryfast \
  -profile:v baseline \
  -level 3.1 \
  -pix_fmt yuv420p \
  -movflags +faststart \
  -an \
  -b:v 700k \
  output.mp4
A walk through the flags:
- -t 10 caps duration. Telegram truncates anything longer, sometimes badly.
- The scale + crop chain is the trick. Scale so the smaller side hits 800, then center-crop. Result is always 800x800, never stretched.
- baseline profile with level 3.1 keeps the file decodable on older Android. Main and high profiles give marginal size wins but lock out some clients.
- pix_fmt yuv420p is mandatory. iPhone HEVC ships in yuv420p10le by default, which Telegram cannot decode for video notes.
- +faststart moves the moov atom to the front so the file streams without waiting for the full download.
- -an strips audio. Crucial. A silent AAC track is enough to fail the upload.
- -b:v 700k aims for under 2 MB at 10 seconds. For shorter clips I bump to 1M.
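The arithmetic behind the bitrate choice is simple enough to write down. A small sketch, with hypothetical helper names; it ignores container overhead, and since a single-pass -b:v is an average target rather than a hard cap, real files can land above the estimate.

```python
AVATAR_MAX_BYTES = 2 * 1024 * 1024  # the 2 MB avatar limit from the list above


def estimated_size_bytes(bitrate_kbps: float, duration_s: float) -> float:
    """Video payload size if the encoder hits the target bitrate exactly."""
    return bitrate_kbps * 1000 / 8 * duration_s


def max_bitrate_kbps(duration_s: float, max_bytes: int = AVATAR_MAX_BYTES) -> int:
    """Highest average bitrate (kbit/s) that still fits max_bytes."""
    return int(max_bytes * 8 / 1000 / duration_s)


# 700k over 10 s is about 875 kB, so there is plenty of headroom under 2 MB.
print(estimated_size_bytes(700, 10))
# The theoretical ceiling for a 10 s clip, before any safety margin.
print(max_bitrate_kbps(10))
```

Staying well below the ceiling is the point: the headroom absorbs bitrate overshoot on busy footage.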
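The cropdetect stage adds maybe 200 ms on a 10-second clip. One caveat: cropdetect only logs a suggested crop rectangle, it does not crop anything itself. To keep baked-in black bars out of random vertical iPhone videos, the reported crop=w:h:x:y values have to be applied with a crop filter ahead of the scale step.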
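ffmpeg's cropdetect filter reports its findings on stderr as lines ending in crop=w:h:x:y, so one way to use it is a separate detection run (for example ffmpeg -i input.mov -vf cropdetect=24:16:0 -f null -) followed by the real encode with the detected rectangle. The sketch below only covers the parsing step; the function name is made up for illustration, and picking the most frequent rectangle is an assumption about what works well, not something the bot is known to do.

```python
import re
from collections import Counter
from typing import Optional

# Matches the "crop=w:h:x:y" suffix that cropdetect appends to each log line.
CROP_RE = re.compile(r"crop=(\d+:\d+:\d+:\d+)")


def most_common_crop(ffmpeg_stderr: str) -> Optional[str]:
    """Return the most frequently suggested 'w:h:x:y' rectangle, or None."""
    counts = Counter(CROP_RE.findall(ffmpeg_stderr))
    if not counts:
        return None
    rectangle, _ = counts.most_common(1)[0]
    return rectangle


def build_filter(crop: Optional[str]) -> str:
    """Prepend the detected crop (if any) to the scale + center-crop chain."""
    chain = "scale=800:800:force_original_aspect_ratio=increase,crop=800:800"
    return f"crop={crop},{chain}" if crop else chain
```

The resulting string goes into -vf for the encode. When detection finds nothing, the chain falls back to the plain scale and center-crop.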
The aiogram 3 Handler
The bot side is small. aiogram 3 has a clean async API, and I leaned on it. The handler:
import asyncio
import pathlib
import tempfile

from aiogram import F, Router
from aiogram.types import Message, FSInputFile

router = Router()

# Argument template; {input} and {output} are filled in per request.
FFMPEG_CMD = [
    "ffmpeg", "-y", "-i", "{input}",
    "-t", "10",
    "-vf", "cropdetect=24:16:0,scale=800:800:force_original_aspect_ratio=increase,crop=800:800",
    "-c:v", "libx264", "-preset", "veryfast",
    "-profile:v", "baseline", "-level", "3.1",
    "-pix_fmt", "yuv420p",
    "-movflags", "+faststart",
    "-an", "-b:v", "700k",
    "{output}",
]


@router.message(F.video | F.video_note | F.animation | F.document)
async def convert(msg: Message) -> None:
    file = msg.video or msg.video_note or msg.animation or msg.document
    if not file:
        return
    with tempfile.TemporaryDirectory() as tmp_dir:
        tmp = pathlib.Path(tmp_dir)
        src = tmp / "in.mp4"
        dst = tmp / "out.mp4"
        await msg.bot.download(file, destination=src)
        cmd = [a.format(input=str(src), output=str(dst)) for a in FFMPEG_CMD]
        # Async subprocess so the event loop stays free during the encode.
        proc = await asyncio.create_subprocess_exec(
            *cmd,
            stdout=asyncio.subprocess.DEVNULL,
            stderr=asyncio.subprocess.PIPE,
        )
        _, err = await proc.communicate()
        if proc.returncode != 0:
            # The tail of stderr is where ffmpeg prints the actual error.
            await msg.reply(f"ffmpeg failed: {err.decode(errors='replace')[-300:]}")
            return
        if dst.stat().st_size > 2 * 1024 * 1024:
            await msg.reply("Output over 2 MB, try a shorter clip.")
            return
        # Send while the temporary directory still exists.
        await msg.reply_video_note(FSInputFile(dst))
A few details that bit me:
- F.document is needed because Telegram routes iPhone HEVC clips as documents when the client can't decode them inline. Without it the bot looks deaf on exactly the input it exists to handle.
- The output gets sent as a video note, not a video. The user then forwards that to themselves and uses the menu to set it as an avatar. Telegram does not expose a direct "set avatar" API for bots.
- Subprocess uses asyncio.create_subprocess_exec, not subprocess.run, otherwise the event loop blocks for the full encode duration. On a single-core VPS with five concurrent conversions that gets ugly fast.
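Even with a non-blocking subprocess, several encodes at once will fight over one core. One common way to keep that in check is an asyncio.Semaphore around the subprocess call. This is a sketch under assumptions: run_limited is a hypothetical helper, and the slot count of 2 in the usage line is an arbitrary example, not a measured value from the bot.

```python
import asyncio


async def run_limited(slots: asyncio.Semaphore, cmd: list[str]) -> tuple[int, bytes]:
    """Run cmd once a slot is free; return (returncode, captured stderr)."""
    async with slots:
        proc = await asyncio.create_subprocess_exec(
            *cmd,
            stdout=asyncio.subprocess.DEVNULL,
            stderr=asyncio.subprocess.PIPE,
        )
        _, err = await proc.communicate()
        return proc.returncode, err


async def encode_all(commands: list[list[str]], max_parallel: int = 2) -> list[tuple[int, bytes]]:
    """Run every command, but never more than max_parallel at the same time."""
    slots = asyncio.Semaphore(max_parallel)  # created inside the running loop
    return await asyncio.gather(*(run_limited(slots, c) for c in commands))
```

In a handler, the semaphore would be created once at startup and shared, so extra requests simply wait their turn instead of starving the encodes already running.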
Packaging It as a Bot
The full bot lives at https://t.me/LiveAvaBot?start=devto_article_20260619. It does the above plus:
- Polling instead of webhook. I run polling on a Hetzner CX22 because webhook setup with Cloudflare proxy and Telegram IP allowlists was more pain than the latency win.
- A small SQLite DB tracks conversions per user for rate limiting.
- Telegram Stars for paid extras (longer clips, keeping a separate audio file alongside the silent video).
- Systemd service with Restart=on-failure and a MemoryHigh=512M cap because ffmpeg occasionally spikes.
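The per-user rate limiting needs very little: one table and a count over a time window. A minimal sketch with the standard sqlite3 module; the table name, window length, and quota are all invented for illustration and are not the bot's real schema or limits.

```python
import sqlite3
import time
from typing import Optional

WINDOW_S = 3600      # hypothetical: look back one hour
MAX_PER_WINDOW = 5   # hypothetical: conversions allowed per window


def init_db(conn: sqlite3.Connection) -> None:
    """Create the (illustrative) conversions table if it does not exist."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS conversions ("
        "user_id INTEGER NOT NULL, ts REAL NOT NULL)"
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_conversions_user_ts "
        "ON conversions (user_id, ts)"
    )
    conn.commit()


def try_consume(conn: sqlite3.Connection, user_id: int, now: Optional[float] = None) -> bool:
    """Record a conversion and return True, or return False if over quota."""
    now = time.time() if now is None else now
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM conversions WHERE user_id = ? AND ts > ?",
        (user_id, now - WINDOW_S),
    ).fetchone()
    if count >= MAX_PER_WINDOW:
        return False
    conn.execute(
        "INSERT INTO conversions (user_id, ts) VALUES (?, ?)", (user_id, now)
    )
    conn.commit()
    return True
```

The handler would call try_consume(conn, msg.from_user.id) before downloading anything, so rejected requests cost no bandwidth or CPU.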
The whole thing is around 600 lines of Python. The hard part wasn't the code, it was figuring out the silent-failure modes.
Edge Cases I Hit
ProRes clips from cameras need an extra -dn (or an explicit -map 0:v:0) to drop data and timecode streams, or libx264 throws a fit. Live Photos come in as a .mov with a HEIC sibling, so the bot only converts the .mov. Slow-mo iPhone clips use variable frame rate, which confuses cropdetect; adding -vsync cfr -r 30 before the filter chain fixes it (newer ffmpeg builds deprecate -vsync in favor of -fps_mode cfr). Very short clips (under 1 second) make the bitrate calculation misfire, so I clamp output size after the fact and warn the user.
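Deciding when to add the constant-frame-rate flags can be automated with a heuristic on ffprobe's per-stream fields: for many variable-frame-rate files, r_frame_rate and avg_frame_rate disagree noticeably. The sketch below is only that heuristic; the function name and the 1% tolerance are assumptions, and some VFR files will slip past it.

```python
from fractions import Fraction


def looks_variable_frame_rate(stream: dict, tolerance: float = 0.01) -> bool:
    """Guess VFR from an ffprobe video stream entry (values like '30000/1001')."""
    try:
        base = Fraction(stream["r_frame_rate"])
        avg = Fraction(stream["avg_frame_rate"])
    except (KeyError, ValueError, ZeroDivisionError):
        # Missing fields or "0/0" placeholders: nothing to compare.
        return False
    if base == 0 or avg == 0:
        return False
    # Relative gap between the base rate and the measured average rate.
    return abs(base - avg) / base > tolerance
```

When it returns True, the handler could append the frame-rate flags mentioned above to the ffmpeg command and leave ordinary clips untouched.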
What's next: a faster path for short clips that skips cropdetect, and supporting circle masking client-side because some Android clients render the square edges differently from iOS.
Built by me, @LiveAvaBot.