Last week someone sent me an iPhone clip and asked why Telegram wouldn't take it as a video avatar. The upload went through, the spinner finished, and then nothing. No error, no avatar, old photo still there. That silent failure is the whole reason this article exists.
The Silent Failure
iPhones have recorded in HEVC (H.265) by default since iOS 11. It saves storage, and it breaks in a lot of places that still expect H.264. Telegram's video avatar feature is one of them. Most clients accept the upload, show progress, and then quietly keep your old avatar. There's no toast and no log entry telling you the codec was the problem.
I lost an hour on this before I pulled the file into ffprobe and saw hev1 in the stream info. Once you know what Telegram actually wants, the fix is one ffmpeg command. Getting to that command took some digging, so here's all of it.
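The ffprobe check can be scripted so you don't have to read the stream dump by eye. This is a minimal sketch, not the bot's actual code: `probe_streams` and `video_codec` are hypothetical helper names, and the sample dictionary is illustrative data shaped like ffprobe's JSON output rather than a real capture.

```python
import json
import subprocess
from typing import Optional


def probe_streams(path: str) -> dict:
    """Run ffprobe and return its JSON description of the file's streams."""
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-show_entries",
         "stream=codec_type,codec_name,codec_tag_string,pix_fmt",
         "-of", "json", path],
        capture_output=True, text=True, check=True,
    ).stdout
    return json.loads(out)


def video_codec(probe: dict) -> Optional[str]:
    """Return the codec name of the first video stream, or None if there is none."""
    for stream in probe.get("streams", []):
        if stream.get("codec_type") == "video":
            return stream.get("codec_name")
    return None


# Illustrative sample shaped like ffprobe's JSON for an HEVC clip with sound.
sample = {"streams": [
    {"codec_type": "video", "codec_name": "hevc",
     "codec_tag_string": "hev1", "pix_fmt": "yuv420p10le"},
    {"codec_type": "audio", "codec_name": "aac", "codec_tag_string": "mp4a"},
]}
print(video_codec(sample))  # hevc
```

On a real file you would call `video_codec(probe_streams("input.mov"))`; anything other than `h264` means a re-encode is needed.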
What Telegram Actually Requires
The spec is scattered across docs and client source code, so here it is in one place. A video avatar must be:
- H.264 video in yuv420p pixel format (HEVC and 10-bit formats get dropped)
- square, up to 800x800
- 10 seconds or shorter
- no audio track at all (not muted, the stream has to be removed)
- roughly 2MB or less (bigger files start failing on some clients)
Miss any one of these and you get the silent rejection. A fresh iPhone clip fails the first check. A 16:9 clip fails the second. A clip with sound fails the fourth even if everything else is right. Most real-world files fail three of the five at once.
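The five checks are easy to express as a pre-flight function. The sketch below is an assumption-heavy illustration: the `ClipInfo` fields and the `violations` helper are made up for this example, and the size limit is set to 2 MiB even though the real cutoff is only "roughly 2MB".

```python
from dataclasses import dataclass
from typing import List

MAX_BYTES = 2 * 1024 * 1024  # assumption: treat "roughly 2MB" as 2 MiB


@dataclass
class ClipInfo:
    codec: str        # e.g. "h264", "hevc"
    pix_fmt: str      # e.g. "yuv420p"
    width: int
    height: int
    duration: float   # seconds
    has_audio: bool
    size_bytes: int


def violations(clip: ClipInfo) -> List[str]:
    """Return the requirements this clip fails, in the order of the list above."""
    failed = []
    if clip.codec != "h264" or clip.pix_fmt != "yuv420p":
        failed.append("not H.264/yuv420p")
    if clip.width != clip.height or clip.width > 800:
        failed.append("not square or larger than 800x800")
    if clip.duration > 10:
        failed.append("longer than 10 seconds")
    if clip.has_audio:
        failed.append("has an audio track")
    if clip.size_bytes > MAX_BYTES:
        failed.append("larger than about 2MB")
    return failed


# A short HEVC clip in 16:9 with sound: fails three of the five checks.
clip = ClipInfo("hevc", "yuv420p10le", 1920, 1080, 4.0, True, 1_800_000)
print(violations(clip))
# ['not H.264/yuv420p', 'not square or larger than 800x800', 'has an audio track']
```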
Cropping and Encoding with FFmpeg
Two problems to solve: get a square frame without squashing the subject, and re-encode to spec. I run cropdetect first because GIFs and re-uploaded clips often carry letterbox bars, then take the largest centered square.
# pass 1: detect letterbox bars (common on GIFs and re-encoded clips)
ffmpeg -i input.mov -vf cropdetect=24:16:0 -t 3 -f null - 2>&1 \
| grep -o 'crop=[0-9:]*' | tail -1
# pass 2: center-square crop, scale, encode to spec
ffmpeg -i input.mov -t 9.5 \
-vf "crop='min(iw,ih)':'min(iw,ih)',scale=800:800,format=yuv420p" \
-c:v libx264 -profile:v baseline -preset veryfast -b:v 1200k \
-movflags +faststart -an output.mp4
What each piece is doing:
- crop='min(iw,ih)':'min(iw,ih)' takes the largest centered square from any aspect ratio. If pass 1 found bars, apply that crop first, then the square crop.
- scale=800:800,format=yuv420p hits the resolution and pixel format requirements in one filter chain.
- -an removes the audio stream entirely. Muting is not enough, the track has to be gone.
- -movflags +faststart moves the moov atom to the front of the file so clients can play it while downloading.
- -t 9.5 trims under the 10 second cap with a little margin. Exactly 10.0 is risky because container duration rounding can push it over.
- Bitrate math: 2MB over 10 seconds is about 1600 kbps total budget. 1200k video leaves headroom for container overhead.
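Chaining the pass 1 result into pass 2 can be automated. Here is one way it might look, assuming you have captured ffmpeg's stderr from the cropdetect run as text. `last_crop` and `build_vf` are hypothetical helpers, and the log line is an illustrative example of cropdetect output, not a real capture.

```python
import re
from typing import Optional

SQUARE = "crop='min(iw,ih)':'min(iw,ih)'"


def last_crop(stderr_text: str) -> Optional[str]:
    """Return the final crop=w:h:x:y suggestion from cropdetect output, if any."""
    matches = re.findall(r"crop=\d+:\d+:\d+:\d+", stderr_text)
    return matches[-1] if matches else None


def build_vf(detected: Optional[str], size: int = 800) -> str:
    """Build the -vf chain: optional bar removal, square crop, scale, pixel format."""
    parts = [detected] if detected else []
    parts += [SQUARE, f"scale={size}:{size}", "format=yuv420p"]
    return ",".join(parts)


# Illustrative cropdetect log line for a letterboxed 1280x720 clip.
log = "[Parsed_cropdetect_0 @ 0x0] x1:0 x2:1279 y1:144 y2:575 w:1280 h:432 x:0 y:144 crop=1280:432:0:144"
print(build_vf(last_crop(log)))
# crop=1280:432:0:144,crop='min(iw,ih)':'min(iw,ih)',scale=800:800,format=yuv420p
```

Because each filter sees the output of the one before it, `iw` and `ih` in the square crop refer to the frame with the bars already removed. With no bars detected, `build_vf(None)` returns the same chain used in pass 2.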
The Bot Handler in aiogram 3
Running this by hand gets old fast, so I wrapped it in a bot. Minimal aiogram 3 handler:
import asyncio
import os
import tempfile

from aiogram import Router, F
from aiogram.types import Message, FSInputFile

router = Router()

# The cloud Bot API only lets bots download files of up to 20 MB,
# so anything larger would fail at the download step anyway.
MAX_INPUT_BYTES = 20 * 1024 * 1024


async def convert(src: str, dst: str) -> bool:
    """Crop, scale, and re-encode src into an avatar-spec mp4 at dst."""
    vf = "crop='min(iw,ih)':'min(iw,ih)',scale=800:800,format=yuv420p"
    proc = await asyncio.create_subprocess_exec(
        "ffmpeg", "-y", "-i", src, "-t", "9.5",
        "-vf", vf, "-c:v", "libx264", "-profile:v", "baseline",
        "-b:v", "1200k", "-movflags", "+faststart", "-an", dst,
        stdout=asyncio.subprocess.DEVNULL,
        stderr=asyncio.subprocess.DEVNULL,
    )
    await proc.wait()
    # Success means a clean exit and a non-empty output file.
    return proc.returncode == 0 and os.path.getsize(dst) > 0


@router.message(F.video | F.animation)
async def handle_media(msg: Message):
    media = msg.video or msg.animation
    if media.file_size and media.file_size > MAX_INPUT_BYTES:
        await msg.answer("20MB max, sorry.")
        return
    # The temp directory and both files are removed when the block exits.
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "in")
        dst = os.path.join(tmp, "out.mp4")
        await msg.bot.download(media, destination=src)
        if not await convert(src, dst):
            await msg.answer("Conversion failed. Is this a valid video?")
            return
        await msg.answer_video(FSInputFile(dst))
Two details that matter. create_subprocess_exec keeps ffmpeg off the event loop, so the bot stays responsive while a conversion runs. And F.animation catches GIFs, which Telegram wraps as mp4 animations internally, so the same pipeline handles both.
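The first point is easy to verify without ffmpeg or Telegram. In this small demonstration a sleeping Python child process stands in for ffmpeg (an assumption made so the example runs anywhere), and a heartbeat coroutine counts how often the event loop gets control while the child is running.

```python
import asyncio
import sys


async def slow_child(seconds: float) -> int:
    """Stand-in for ffmpeg: a child process that just sleeps, then exits."""
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", f"import time; time.sleep({seconds})",
    )
    return await proc.wait()


async def heartbeat(stop: asyncio.Event) -> int:
    """Count event-loop wakeups until asked to stop."""
    ticks = 0
    while not stop.is_set():
        await asyncio.sleep(0.05)
        ticks += 1
    return ticks


async def demo() -> tuple:
    stop = asyncio.Event()
    beat = asyncio.create_task(heartbeat(stop))
    code = await slow_child(0.5)  # the loop keeps running other tasks meanwhile
    stop.set()
    return code, await beat


code, ticks = asyncio.run(demo())
print(f"child exit code {code}, heartbeat ticks while waiting: {ticks}")
```

If the child were started with a blocking call such as `subprocess.run` inside the coroutine, the heartbeat would not tick at all until the child had exited.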
Shipping It as @LiveAvaBot
I packaged this as @LiveAvaBot. Send it any video or GIF, it returns the compliant mp4, and you set that file as your profile video from Telegram settings. 186 people have used it so far, and the most common input is exactly the case that started this: HEVC straight off an iPhone camera roll.
Edge Cases I Hit Along the Way
- iPhone rotation. Portrait clips carry a display matrix instead of rotated pixels. ffmpeg autorotates on decode by default in any reasonably recent build; very old builds need an explicit transpose filter or your output comes out sideways.
- GIFs with odd frame rates. Adding fps=30 to the filter chain keeps durations from drifting after encode.
- The 2MB budget. Long busy clips blow past it at 1200k, so I retry at 900k, then 700k, before giving up.
- HDR clips come out washed out after the yuv420p conversion. I don't tone map yet, it's on the list.
- Animated stickers and WEBP are a different pipeline entirely. Not handled, probably never will be.
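The retry ladder for the 2MB budget could be structured like the sketch below. It is not the bot's actual retry code: `encode_within_budget` is a hypothetical helper that takes the encoder as a callable, and the demo uses a fake encoder that writes padding bytes so the example runs without ffmpeg. The 2 MiB limit is again an assumption standing in for "roughly 2MB".

```python
import os
import tempfile
from typing import Callable, Iterable, Optional

MAX_BYTES = 2 * 1024 * 1024  # assumption: treat "roughly 2MB" as 2 MiB


def budget_kbps(max_bytes: int, seconds: float) -> float:
    """Total bitrate (video plus container overhead) that fits the size cap."""
    return max_bytes * 8 / seconds / 1000


def encode_within_budget(
    encode: Callable[[str], bool],
    dst: str,
    bitrates: Iterable[str] = ("1200k", "900k", "700k"),
    max_bytes: int = MAX_BYTES,
) -> Optional[str]:
    """Try each bitrate in order; return the first whose output fits, else None."""
    for rate in bitrates:
        if encode(rate) and os.path.getsize(dst) <= max_bytes:
            return rate
    return None


def make_fake_encoder(dst: str, seconds: float = 9.5, overshoot: float = 1.6):
    """Fake encoder: writes a file sized like a busy clip that overshoots its target."""
    def fake_encode(rate: str) -> bool:
        kbps = int(rate.rstrip("k"))
        with open(dst, "wb") as fh:
            fh.write(b"\0" * int(kbps * 125 * seconds * overshoot))
        return True
    return fake_encode


print(budget_kbps(2_000_000, 10))  # 1600.0

with tempfile.TemporaryDirectory() as tmp:
    dst = os.path.join(tmp, "out.mp4")
    chosen = encode_within_budget(make_fake_encoder(dst), dst)
print(chosen)  # 900k
```

To use this with real encodes, the `encode` callable would run ffmpeg with the given value for `-b:v`. The `convert` function shown earlier hardcodes 1200k, so it would need a bitrate parameter first.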
ffmpeg is doing the heavy lifting here, I just wrote the wrapper and the retry logic. Disclosure: built by me, @liveavabot.
If you've fought Telegram's other undocumented media specs (sticker WEBM requirements, anyone?), I'd like to hear how you figured out what the client actually accepts.