The Problem
I tried setting a video avatar on Telegram using a clip from my iPhone. Telegram accepted the upload, the progress bar finished, then nothing happened. No error, no toast, no avatar. The video sat in chat as a regular file.
After digging into Telegram's behavior I found the silent rejection rule: video avatars must be exactly 800x800, H.264 video, no audio track, under 10 seconds, under 2MB. iPhone videos are HEVC (H.265) by default since iOS 11, usually portrait 1080x1920, with audio, and often longer than 10s. Every single criterion fails.
The frustrating part is the silence. Telegram doesn't say "wrong codec" or "too long". The upload just doesn't apply. You think you did something wrong.
So I built a bot.
What Telegram Actually Wants
The spec (you find it in the docs for setProfilePhoto with the animation field, plus reverse-engineered details from people complaining on forums) is strict:
- Container: MP4
- Video codec: H.264 (libx264)
- Pixel format: yuv420p (not yuv420p10le, not yuv422)
- Resolution: exactly 800x800, square
- Duration: at most 10 seconds
- File size: at most 2MB
- Audio: must be absent, not muted, fully removed
- Faststart: moov atom at the start of the file
The faststart bit caught me out. ffmpeg by default puts the moov atom at the end of the file. Telegram's parser appears to give up rather than seek backwards. Adding -movflags +faststart was the single change that fixed half of my early test cases.
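A quick way to confirm the flag took effect is to walk the top-level MP4 atoms and check that moov comes before mdat. The sketch below is illustrative only: the function names are made up for this example, and it assumes a plain (non-fragmented) MP4 where both atoms sit at the top level.

```python
import struct
from typing import BinaryIO, List


def top_level_atoms(f: BinaryIO) -> List[str]:
    """Return the four-character types of the top-level atoms, in file order."""
    atoms = []
    while True:
        header = f.read(8)
        if len(header) < 8:
            break  # end of file
        size, kind = struct.unpack(">I4s", header)
        header_len = 8
        if size == 1:
            # A size of 1 means a 64-bit size follows the type field.
            ext = f.read(8)
            if len(ext) < 8:
                break
            size = struct.unpack(">Q", ext)[0]
            header_len = 16
        atoms.append(kind.decode("latin-1"))
        if size == 0:
            break  # a size of 0 means the atom runs to the end of the file
        if size < header_len:
            break  # malformed atom, stop rather than loop forever
        f.seek(size - header_len, 1)  # skip the payload
    return atoms


def is_faststart(f: BinaryIO) -> bool:
    """True when moov appears before mdat among the top-level atoms."""
    atoms = top_level_atoms(f)
    if "moov" not in atoms or "mdat" not in atoms:
        return False
    return atoms.index("moov") < atoms.index("mdat")
```

Usage is `is_faststart(open("output.mp4", "rb"))`. It only reads atom headers, so it stays cheap even on large files.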
The FFmpeg Pipeline
I let ffmpeg do all the heavy lifting. I just wrote the wrapper. The pipeline is cropdetect, then re-encode with the exact flags Telegram needs.
# Optional first pass: detect the largest centered crop
ffmpeg -i input.mov -vf "cropdetect=24:16:0" -f null - 2>&1 \
| grep -oP 'crop=\K[0-9:]+' | tail -1
# Main pass: crop to a centered square, scale to 800x800, transcode, strip audio
ffmpeg -i input.mov \
  -vf "crop='min(iw,ih)':'min(iw,ih)',scale=800:800,format=yuv420p" \
  -c:v libx264 -preset veryfast -crf 28 \
  -an \
  -t 10 \
  -movflags +faststart \
  -y output.mp4
A few notes on the flags:
- crop='min(iw,ih)':'min(iw,ih)' crops to a square whose side is the shorter input dimension. ffmpeg centers the crop by default, so portrait and landscape clips both get a centered square without manually computing offsets. Using the input height for both dimensions only works for landscape input; on a portrait clip the crop is wider than the frame and ffmpeg rejects it.
- scale=800:800 resizes to the exact target. No aspect-ratio preservation is needed because the frame is already square.
- format=yuv420p forces 8-bit 4:2:0. iPhone HEVC is often 10-bit (yuv420p10le), which Telegram does not accept.
- -an drops audio entirely.
- -t 10 caps duration at 10 seconds.
- -crf 28 is aggressive but keeps the file under 2MB for typical 10s clips. For 5s clips I drop to crf 23.
- +faststart is the magic flag mentioned above.
For files that still exceed 2MB after the first encode I retry at increasing CRF values (28, 30, 32, 34) until the size fits. Strictly speaking that is a linear walk rather than a bisection. Crude, but it works for the long tail of weird inputs.
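The retry loop can be sketched roughly as below. This is a minimal illustration, not the bot's actual code: `encode_under_limit` and `make_ffmpeg_encoder` are hypothetical names, and the encoder is passed in as a callable so the loop can be exercised without ffmpeg installed.

```python
import subprocess
from pathlib import Path
from typing import Callable, Optional, Sequence

MAX_BYTES = 2 * 1024 * 1024  # the 2MB ceiling from the spec above


def encode_under_limit(
    encode: Callable[[int], int],
    crfs: Sequence[int] = (28, 30, 32, 34),
    limit: int = MAX_BYTES,
) -> Optional[int]:
    """Try each CRF in order; return the first one whose output fits, else None.

    `encode(crf)` must run one encode and return the output size in bytes.
    """
    for crf in crfs:
        if encode(crf) < limit:
            return crf
    return None


def make_ffmpeg_encoder(src: str, dst: str, vf: str) -> Callable[[int], int]:
    """Build an encode(crf) callable around the main-pass ffmpeg command."""

    def encode(crf: int) -> int:
        subprocess.run(
            ["ffmpeg", "-i", src, "-vf", vf,
             "-c:v", "libx264", "-preset", "veryfast", "-crf", str(crf),
             "-an", "-t", "10", "-movflags", "+faststart", "-y", dst],
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return Path(dst).stat().st_size

    return encode
```

A call like `encode_under_limit(make_ffmpeg_encoder("in.mov", "out.mp4", vf))` returns the CRF that fit, or None when even the last step is too big. Higher CRF means lower quality, so walking upward stops at the best-looking encode that fits.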
The Aiogram 3 Handler
The bot side is small. Aiogram 3 routes any video, video note, animation, or document to one converter:
import asyncio
import tempfile
from pathlib import Path

from aiogram import F, Router
from aiogram.types import FSInputFile, Message

router = Router()


@router.message(F.video | F.video_note | F.animation | F.document)
async def convert(msg: Message):
    file = msg.video or msg.video_note or msg.animation or msg.document
    if not file:
        return
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "in.mp4"
        dst = Path(tmp) / "out.mp4"
        # Resolve the file path on Telegram's servers, then download it locally
        f = await msg.bot.get_file(file.file_id)
        await msg.bot.download_file(f.file_path, src)
        proc = await asyncio.create_subprocess_exec(
            "ffmpeg", "-i", str(src),
            # Centered square crop that works for portrait and landscape input
            "-vf", "crop='min(iw,ih)':'min(iw,ih)',scale=800:800,format=yuv420p",
            "-c:v", "libx264", "-preset", "veryfast", "-crf", "28",
            "-an", "-t", "10", "-movflags", "+faststart",
            "-y", str(dst),
            stdout=asyncio.subprocess.DEVNULL,
            stderr=asyncio.subprocess.DEVNULL,
        )
        try:
            await asyncio.wait_for(proc.wait(), timeout=30)
        except asyncio.TimeoutError:
            # wait_for only cancels the wait, so kill the hung ffmpeg ourselves
            proc.kill()
            await proc.wait()
            await msg.answer("Conversion timed out, try a shorter clip.")
            return
        if dst.exists() and dst.stat().st_size < 2 * 1024 * 1024:
            await msg.answer_video(FSInputFile(dst))
        else:
            await msg.answer("File too big after encode, try a shorter clip.")
asyncio.wait_for with a 30-second timeout is important. ffmpeg occasionally hangs on malformed input (especially screen recordings with odd codecs), and you don't want a stuck process per user piling up. Note that wait_for only cancels the wait, not the child process, so the handler kills ffmpeg explicitly when the timeout trips.
Shipping It as @liveavabot
I wrapped this into @LiveAvaBot. The user sends a video, the bot replies with the converted MP4, the user forwards that to "Edit profile photo" in Telegram settings. That's the entire flow. No login, no signup, no settings menu.
Practical bits I added on top:
- Per-user queue, one ffmpeg process per chat at a time, so spammy uploads don't fork-bomb the server.
- 30-second ffmpeg timeout (above), with a friendly error if it trips.
- Telegram Stars payments for batch conversions, mostly experimental, not a serious revenue path yet.
- Logs of input format so I can see what people actually throw at it (mostly iPhone HEVC, occasional Android H.264, a surprising number of screen recordings).
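The per-chat queue can be approximated with one asyncio.Lock per chat ID. This is a simplified sketch, not the bot's real implementation: `PerChatQueue` is a hypothetical name, and the lock table is never pruned, which a long-running bot would need to handle.

```python
import asyncio
from collections import defaultdict
from typing import Awaitable, Callable, Dict, TypeVar

T = TypeVar("T")


class PerChatQueue:
    """Serialize jobs within a chat while letting different chats run in parallel."""

    def __init__(self) -> None:
        # One lock per chat ID, created lazily on first use.
        self._locks: Dict[int, asyncio.Lock] = defaultdict(asyncio.Lock)

    async def run(self, chat_id: int, job: Callable[[], Awaitable[T]]) -> T:
        # Jobs for the same chat wait here until the lock is released.
        async with self._locks[chat_id]:
            return await job()
```

Inside a handler this would look like `await queue.run(msg.chat.id, lambda: do_conversion(msg))`, where `do_conversion` stands in for whatever coroutine wraps the ffmpeg call.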
Current state: 84 users, slow organic growth from word of mouth.
Edge Cases and Lessons
- HDR (HLG, Dolby Vision) from newer iPhones produces washed-out output when transcoded naively. Adding colorspace=bt709:iall=bt709:fast=1 to the filter chain fixes it for most clips.
- Vertical videos under 10 seconds sometimes get over-cropped if the subject sits near the top of the frame. Running cropdetect first costs an extra ffmpeg pass but produces noticeably better framing on portraits.
- GIFs route through the same pipeline. ffmpeg reads them fine. The quirk is that they often have no duration set in the container, but -t 10 still caps the output correctly.
- 2MB limit is the real ceiling. Codec and resolution are easy. Staying under 2MB on a 10-second 800x800 H.264 clip means crf 28 or higher, which looks acceptable for an avatar but would be ugly at full screen.
- Screen recordings are the worst case. Variable framerate, weird color spaces, sometimes 10-bit. The bisect-on-CRF fallback exists mainly for these.
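The cropdetect and HDR cases both come down to assembling a slightly different -vf string. Here is one way that could look. The helper names are hypothetical, the square crop uses min(iw,ih) as an assumption so it handles both orientations, and placing colorspace before scale is my guess at a sensible order rather than something verified against every input.

```python
import re
from typing import Optional

# cropdetect logs lines ending in "crop=w:h:x:y"; keep the last one it reports.
CROP_RE = re.compile(r"crop=(\d+:\d+:\d+:\d+)")


def last_cropdetect(stderr_text: str) -> Optional[str]:
    """Return the final w:h:x:y reported by a cropdetect pass, or None."""
    matches = CROP_RE.findall(stderr_text)
    return matches[-1] if matches else None


def build_filter_chain(detected_crop: Optional[str] = None, hdr: bool = False) -> str:
    """Assemble the -vf value for the main encode pass."""
    steps = []
    if detected_crop:
        # Apply the crop suggested by the optional first pass.
        steps.append(f"crop={detected_crop}")
    # Centered square crop on whatever frame is left.
    steps.append("crop='min(iw,ih)':'min(iw,ih)'")
    if hdr:
        # The colorspace step from the HDR bullet above.
        steps.append("colorspace=bt709:iall=bt709:fast=1")
    steps += ["scale=800:800", "format=yuv420p"]
    return ",".join(steps)
```

Keeping the chain as a list of steps makes it easy to log exactly which variant a given upload received.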
Next on the list is detecting "this video is already valid" and skipping the re-encode entirely. About 5% of uploads are already 800x800 H.264 muted (likely from other converters), and re-encoding them is wasted CPU and quality loss.
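A possible shape for that check, assuming ffprobe is on the PATH: read the stream and container metadata as JSON and compare it to the spec listed earlier. `probe` and `spec_violations` are hypothetical names, and the sketch does not cover moov placement, which ffprobe's JSON output does not report.

```python
import json
import subprocess
from typing import List


def probe(path: str) -> dict:
    """Run ffprobe and return its stream and format metadata as a dict."""
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-print_format", "json",
         "-show_format", "-show_streams", path],
        capture_output=True, text=True, check=True,
    ).stdout
    return json.loads(out)


def spec_violations(info: dict, size_limit: int = 2 * 1024 * 1024) -> List[str]:
    """List the ways a probed file misses the avatar spec; empty means it passes."""
    problems = []
    streams = info.get("streams", [])
    video = [s for s in streams if s.get("codec_type") == "video"]
    audio = [s for s in streams if s.get("codec_type") == "audio"]
    if len(video) != 1:
        problems.append("expected exactly one video stream")
    else:
        v = video[0]
        if v.get("codec_name") != "h264":
            problems.append("video codec is not H.264")
        if v.get("pix_fmt") != "yuv420p":
            problems.append("pixel format is not yuv420p")
        if (v.get("width"), v.get("height")) != (800, 800):
            problems.append("resolution is not 800x800")
    if audio:
        problems.append("audio track present")
    fmt = info.get("format", {})
    # ffprobe reports duration and size as strings; a missing value counts as a failure.
    if float(fmt.get("duration", "inf")) > 10:
        problems.append("longer than 10 seconds")
    if int(fmt.get("size", size_limit)) >= size_limit:
        problems.append("file is 2MB or larger")
    return problems
```

If `spec_violations(probe(path))` comes back empty, the upload could be returned as-is. Anything else falls through to the normal re-encode.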
If you've got an iPhone clip that won't stick as a TG avatar, throw it at the bot and see what comes back: https://t.me/LiveAvaBot?start=devto_article_20260526
Built by me, @liveavabot.