I recorded a 5-second clip on my iPhone, opened Telegram, tried to set it as my video avatar, and nothing happened. No error, no warning. Telegram just quietly rejected the file and moved on.
Turns out the problem is HEVC. iPhones have recorded in HEVC (H.265) by default since iOS 11 on the iPhone 7 and later. Telegram video avatars require H.264. When you send an HEVC video as a profile avatar, Telegram's client silently drops it. No feedback, no conversion, just nothing.
I spent an evening figuring out the exact spec, wrote an ffmpeg pipeline to handle any input video, and then wrapped it in a Telegram bot with aiogram 3. Here's everything I learned.
What Telegram video avatars actually require
Telegram doesn't document this well, so I had to reverse-engineer it from failed uploads and the Telegram Desktop source.
- Codec: H.264 (libx264). HEVC, VP9, AV1 all get rejected silently.
- Resolution: Square. 800x800 is the documented max. Some clients accept larger, but 640x640 works on every device including old Android phones.
- Duration: 10 seconds max. The client loops the video, so shorter clips (3-7s) tend to look better.
- File size: Under 2 MB. This is the hard limit that catches most people.
- Audio: Must be stripped entirely. Even a silent audio track adds overhead and can cause issues on some clients.
- Pixel format: yuv420p. Some encoders default to yuv444p or yuv422p, and Telegram won't accept those.
- Container: MP4 with faststart (moov atom at the beginning so the client can preview before full download).
If any of these are wrong, Telegram does nothing. No error popup, no "unsupported format" message. The upload silently fails.
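Since Telegram won't tell you which rule you broke, it helps to check a file yourself before uploading. Here's a minimal sketch of that checklist as code. It assumes you've already probed the file (for example with ffprobe's JSON output) and collected the values into a plain dict; the dict keys, the function name, and the exact byte count I use for "2 MB" are my own choices for illustration, not anything Telegram defines.

```python
from typing import Any

# Limits copied from the list above. The constant names and the exact
# byte count are assumptions made for this sketch.
MAX_SIDE = 800
MAX_DURATION_S = 10.0
MAX_BYTES = 2_000_000  # conservative reading of "under 2 MB"


def avatar_spec_violations(info: dict[str, Any]) -> list[str]:
    """Return the reasons a file would likely be rejected (empty list = looks fine)."""
    problems: list[str] = []
    if info["codec"] != "h264":
        problems.append(f"codec is {info['codec']}, expected h264")
    if info["width"] != info["height"]:
        problems.append(f"not square: {info['width']}x{info['height']}")
    elif info["width"] > MAX_SIDE:
        problems.append(f"larger than {MAX_SIDE}x{MAX_SIDE}")
    if info["duration"] > MAX_DURATION_S:
        problems.append(f"longer than {MAX_DURATION_S:g}s")
    if info["size"] >= MAX_BYTES:
        problems.append(f"file is {info['size']} bytes, limit is {MAX_BYTES}")
    if info["has_audio"]:
        problems.append("audio track present")
    if info["pix_fmt"] != "yuv420p":
        problems.append(f"pixel format is {info['pix_fmt']}, expected yuv420p")
    return problems
```

A straight-off-the-phone clip typically fails several of these at once (HEVC, not square, audio present), which is why fixing one thing at a time gets you nowhere.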
The ffmpeg pipeline
Here's the command I landed on after testing dozens of combinations:
ffmpeg -y -v error \
  -i input.mov \
  -t 9 \
  -vf "crop='min(iw,ih)':'min(iw,ih)',scale=640:640:flags=lanczos,fps=30,format=yuv420p" \
  -an \
  -c:v libx264 -profile:v high -level 4.0 \
  -preset medium -crf 23 \
  -maxrate 1400k -bufsize 2800k \
  -pix_fmt yuv420p \
  -movflags +faststart \
  output.mp4
Let me break down what each part does.
crop='min(iw,ih)':'min(iw,ih)' takes the center square from any aspect ratio. A 1920x1080 landscape video becomes 1080x1080 from the center. A 1080x1920 portrait video also becomes 1080x1080. This handles every phone orientation without needing to detect it first.
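If you want to see the arithmetic behind that crop, here's a small sketch. When you give ffmpeg's crop filter only a width and height, it centers the rectangle by default, so the offsets work out to half the leftover space on each axis. The function name is hypothetical; it just mirrors that calculation.

```python
def center_square(width: int, height: int) -> tuple[int, int, int]:
    """Return (side, x, y) of the centered square for a frame of the given size."""
    side = min(width, height)
    x = (width - side) // 2   # horizontal offset of the square
    y = (height - side) // 2  # vertical offset of the square
    return side, x, y


print(center_square(1920, 1080))  # (1080, 420, 0)
print(center_square(1080, 1920))  # (1080, 0, 420)
```

Landscape input loses 420 pixels from each side, portrait input loses 420 from the top and bottom, and an already-square input passes through untouched.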
scale=640:640:flags=lanczos downscales to the target resolution. Lanczos resampling keeps text and sharp edges looking clean at small sizes. Bilinear would be faster but visibly blurry on a 640px square.
fps=30,format=yuv420p normalizes frame rate and pixel format. iPhones shoot at varying rates (24, 30, 60fps), and some screen recordings use odd color formats. Forcing 30fps and yuv420p ensures compatibility everywhere.
-an strips all audio tracks. Simple but easy to forget.
-c:v libx264 -profile:v high -level 4.0 encodes H.264 at High profile, Level 4.0. This combination works on every Telegram client I've tested, from Desktop to the oldest Android version I could find.
-crf 23 -maxrate 1400k -bufsize 2800k controls quality and file size. CRF 23 is the sweet spot for 640x640. At 9 seconds, this typically produces files between 400 KB and 800 KB, well under the 2 MB limit. The maxrate cap keeps bitrate spikes on complex scenes from blowing the budget.
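A quick back-of-the-envelope check shows why those numbers fit the budget. This is a rough estimate, not a guarantee: I'm assuming the encoder can spend at most maxrate for the whole clip plus one full buffer, and I'm ignoring MP4 container overhead. The function name is made up for this sketch.

```python
def rough_max_bytes(maxrate_kbit: int, bufsize_kbit: int, seconds: float) -> int:
    """Rough upper estimate of the video stream size in bytes.

    Assumes the rate cap holds for the whole clip and that one full
    buffer's worth of extra bits can be spent on top of it.
    ffmpeg's "k" suffix means 1000, so kbit -> bit is a factor of 1000.
    """
    total_kbit = maxrate_kbit * seconds + bufsize_kbit
    return int(total_kbit * 1000 / 8)


print(rough_max_bytes(1400, 2800, 9))  # 1925000
```

So even a pathological clip that pins the rate cap for all 9 seconds should land a little under 2 MB, while ordinary footage stays far below it because CRF stops spending bits once the target quality is reached.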
-movflags +faststart relocates the MP4 moov atom to the beginning of the file. Without this, Telegram can't preview the video until the full file downloads.
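To confirm faststart actually took effect, you can look at the order of the top-level boxes in the output file. An MP4 is a sequence of boxes, each starting with a 4-byte big-endian size and a 4-byte type, and a faststart file has moov before mdat. Below is a minimal sketch of that check; the function names are my own, and it reads the whole file into memory, which is fine for files this small.

```python
import struct


def top_level_boxes(data: bytes) -> list[str]:
    """List the types of the top-level MP4 boxes in file order."""
    boxes: list[str] = []
    pos = 0
    while pos + 8 <= len(data):
        size, kind = struct.unpack(">I4s", data[pos:pos + 8])
        header = 8
        if size == 1:  # a 64-bit size follows the type field
            if pos + 16 > len(data):
                break
            size = struct.unpack(">Q", data[pos + 8:pos + 16])[0]
            header = 16
        elif size == 0:  # box runs to the end of the file
            size = len(data) - pos
        if size < header:
            break  # malformed box, stop instead of looping forever
        boxes.append(kind.decode("latin-1"))
        pos += size
    return boxes


def is_faststart(data: bytes) -> bool:
    """True if the moov box comes before the mdat box."""
    boxes = top_level_boxes(data)
    if "moov" not in boxes or "mdat" not in boxes:
        return False
    return boxes.index("moov") < boxes.index("mdat")
```

Usage would be something like is_faststart(Path("output.mp4").read_bytes()) after importing Path from pathlib.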
Wiring it up with aiogram 3
The bot side is straightforward. Accept a video or document, run ffmpeg, send back the converted file:
from aiogram import Router, F
from aiogram.types import Message, FSInputFile
import asyncio
from pathlib import Path
import uuid

router = Router()


@router.message(F.video | F.document | F.animation)
async def handle_video(msg: Message) -> None:
    file_id = (msg.video or msg.document or msg.animation).file_id
    src = Path(f"/tmp/{uuid.uuid4().hex}.mov")
    dst = Path(f"/tmp/{uuid.uuid4().hex}.mp4")
    try:
        await msg.bot.download(file_id, destination=src)
        await msg.answer("Converting, one sec...")
        # Center-crop to a square, downscale, normalize fps and pixel format.
        vf = (
            "crop='min(iw,ih)':'min(iw,ih)',"
            "scale=640:640:flags=lanczos,"
            "fps=30,format=yuv420p"
        )
        proc = await asyncio.create_subprocess_exec(
            "ffmpeg", "-y", "-v", "error",
            "-i", str(src), "-t", "9",
            "-vf", vf, "-an",
            "-c:v", "libx264", "-profile:v", "high", "-level", "4.0",
            "-preset", "medium", "-crf", "23",
            "-maxrate", "1400k", "-bufsize", "2800k",
            "-pix_fmt", "yuv420p", "-movflags", "+faststart",
            str(dst),
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE,
        )
        # communicate() drains the pipes; wait() alone can block if they fill up.
        await proc.communicate()
        if proc.returncode == 0 and dst.exists():
            await msg.answer_video_note(FSInputFile(dst))
            await msg.answer(
                "Done. Save this as your video avatar "
                "in Settings > Profile."
            )
        else:
            await msg.answer("Conversion failed. Try a different video?")
    finally:
        # Remove temp files even if the download or the reply raised.
        src.unlink(missing_ok=True)
        dst.unlink(missing_ok=True)
A few things worth noting.
F.video | F.document | F.animation catches all three ways users send media in Telegram. Videos arrive as video, screen recordings sometimes arrive as document, and GIFs come as animation. Missing any of these means confused users asking why it doesn't work with their GIF.
answer_video_note sends the result as a circular video note. This gives users a preview of what the avatar will actually look like. You could also use answer_document to send the raw file for manual upload.
Cleanup matters. ffmpeg temp files pile up fast if you forget to unlink. On a VPS with limited disk, this becomes a problem within hours.
Packaging it as @LiveAvaBot
I took this basic pipeline and turned it into @LiveAvaBot. The bot adds a few things on top of the bare conversion.
Concurrency control. ffmpeg is CPU-heavy. Running 10 simultaneous conversions on a 2-core VPS makes everything unusable. I use an asyncio.Semaphore to cap concurrent ffmpeg processes.
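Here's a minimal sketch of the semaphore idea, not the bot's actual code. The class name, the demo function, and the limit of 2 are all assumptions for illustration. The job passed in stands for whatever coroutine runs ffmpeg; the demo swaps in a short sleep so the behavior can be observed without encoding anything.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class FfmpegLimiter:
    """Let at most max_jobs conversions run at the same time."""

    def __init__(self, max_jobs: int) -> None:
        self._slots = asyncio.Semaphore(max_jobs)
        self.running = 0  # jobs currently inside the semaphore
        self.peak = 0     # highest value running ever reached

    async def run(self, job: Callable[[], Awaitable[T]]) -> T:
        async with self._slots:  # waits here while all slots are taken
            self.running += 1
            self.peak = max(self.peak, self.running)
            try:
                return await job()
            finally:
                self.running -= 1


async def demo(max_jobs: int = 2, tasks: int = 10) -> int:
    """Start many fake conversions and report the peak concurrency."""
    limiter = FfmpegLimiter(max_jobs)

    async def fake_conversion() -> None:
        await asyncio.sleep(0.01)  # stand-in for the ffmpeg subprocess

    await asyncio.gather(*(limiter.run(fake_conversion) for _ in range(tasks)))
    return limiter.peak
```

Ten requests come in at once, but only two ever run side by side; the rest queue up on the semaphore. A reasonable starting point for the limit is the number of CPU cores.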
Freemium model. First two conversions are free. After that, users pay with Telegram Stars, Telegram's built-in payment system. No Stripe integration, no payment form. Users tap "Pay" in the chat, confirm with Google Pay or Apple Pay, done.
Error recovery. The production version retries with lower quality settings if the first attempt exceeds 2 MB. Raise CRF from 23 to 26 (a higher CRF means stronger compression), reduce maxrate to 1000k, try again. This handles high-motion content like sports clips that compress poorly.
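The retry logic boils down to walking a short list of settings until one fits. Here's a sketch of that loop under stated assumptions: encode is a hypothetical callback that runs ffmpeg with the given CRF and maxrate and returns the output size in bytes, and the byte limit is my conservative reading of 2 MB. The two rungs come from the settings mentioned in this post; anything beyond them would be guesswork.

```python
from typing import Callable, Optional

# (crf, maxrate) pairs, best quality first.
QUALITY_LADDER = [(23, "1400k"), (26, "1000k")]
MAX_BYTES = 2_000_000  # assumed byte count for "2 MB"


def encode_under_limit(
    encode: Callable[[int, str], int],
    ladder: list[tuple[int, str]] = QUALITY_LADDER,
    max_bytes: int = MAX_BYTES,
) -> Optional[tuple[int, str]]:
    """Try each quality setting in order; return the first that fits, else None."""
    for crf, maxrate in ladder:
        size = encode(crf, maxrate)
        if 0 < size < max_bytes:  # non-empty and under the limit
            return crf, maxrate
    return None
```

Returning None lets the caller tell the user the clip is too complex instead of sending back a file Telegram would silently reject.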
Edge cases that bit me
Vertical videos with black bars. Some screen recordings have black letterboxing baked into the video stream. The center crop takes those bars with it, giving you a square of mostly black. I added cropdetect as a pre-pass to find the actual content bounds before cropping.
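A rough sketch of how such a pre-pass can be consumed. The cropdetect filter logs a suggested crop=w:h:x:y for the frames it analyzes, so one approach is to run something like ffmpeg -i input.mov -vf cropdetect -f null - (without -v error, since those log lines are informational), capture stderr, and take the most frequent suggestion. The function names and the choice of "most frequent" are my assumptions for illustration.

```python
import re
from collections import Counter
from typing import Optional

CROP_RE = re.compile(r"crop=(\d+):(\d+):(\d+):(\d+)")

SQUARE_CHAIN = (
    "crop='min(iw,ih)':'min(iw,ih)',"
    "scale=640:640:flags=lanczos,"
    "fps=30,format=yuv420p"
)


def pick_crop(ffmpeg_log: str) -> Optional[tuple[int, int, int, int]]:
    """Return the most frequently suggested (w, h, x, y), or None if none found."""
    matches = CROP_RE.findall(ffmpeg_log)
    if not matches:
        return None
    most_common, _count = Counter(matches).most_common(1)[0]
    w, h, x, y = (int(value) for value in most_common)
    return w, h, x, y


def build_vf(crop: Optional[tuple[int, int, int, int]]) -> str:
    """Prepend the detected content crop (if any) to the usual filter chain."""
    if crop is None:
        return SQUARE_CHAIN
    w, h, x, y = crop
    return f"crop={w}:{h}:{x}:{y}," + SQUARE_CHAIN
```

With the bars removed first, the center-square crop operates on actual content rather than on baked-in black borders.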
GIFs with weird frame rates. Telegram GIF animations are actually silent MP4s internally, but the metadata sometimes reports 50fps or higher. The fps=30 filter handles this, but without it ffmpeg would generate way more frames than needed and blow past the 2MB limit.
0-byte files on error. When ffmpeg fails (corrupt input, unsupported codec), it sometimes creates a 0-byte output file. The dst.exists() check alone isn't enough. You need dst.stat().st_size > 0 in production.
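A small sketch of that stricter check. The helper name is hypothetical, and I've folded in the size limit as well (using an assumed byte count for 2 MB), since an oversized file is just as useless as an empty one.

```python
from pathlib import Path

MAX_BYTES = 2_000_000  # assumed byte count for "2 MB"


def output_is_usable(dst: Path, max_bytes: int = MAX_BYTES) -> bool:
    """True only if the output exists, is non-empty, and is under the size limit."""
    if not dst.is_file():
        return False
    size = dst.stat().st_size
    return 0 < size < max_bytes
```

In the handler, this would take the place of the bare dst.exists() test next to the return code check.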
yuv420p set twice on purpose. I set yuv420p in both the video filter chain (format=yuv420p) and as a top-level flag (-pix_fmt yuv420p). Belt and suspenders. Some ffmpeg builds ignore one or the other depending on the input format.
Built by me. @LiveAvaBot: send it any video, get back a Telegram-ready avatar. First two conversions are free.