I shipped a Telegram bot that converts any video into a video avatar. The ffmpeg command took an afternoon. Getting every output under 2 MB took a week.
The spec
Telegram video avatars have rigid constraints: 800×800 pixels (square), H.264 codec, yuv420p pixel format, 10 seconds max, no audio, 2 MB file size ceiling, MP4 container with faststart.
Miss any of these and Telegram silently rejects the file. No error message, no warning. The upload just fails.
The resolution, codec, and duration rules are straightforward ffmpeg flags. The 2 MB ceiling is where things get interesting.
The bitrate math
2 MB (2 × 1024 × 1024 bytes) equals roughly 16.8 million bits. Over 10 seconds, that gives you about 1,678 kbps for everything: video stream, container overhead, metadata.
I target 900 kbps for the video stream with a max rate of 1200 kbps. That produces roughly 1.1 MB for a 10-second clip. Plenty of headroom.
Why not push closer to the limit? Because rate control isn't truly constant. Complex frames spike above the target. The maxrate flag caps the peaks, but I'd rather have a comfortable margin than chase the ceiling and occasionally overshoot.
Here's the math for different durations:
- 3 seconds at 900 kbps: ~330 KB
- 5 seconds at 900 kbps: ~550 KB
- 10 seconds at 900 kbps: ~1.1 MB
- 10 seconds at 1400 kbps: ~1.7 MB (tight)
The 900k target works for almost everything. The only time I've seen it produce bad output is extremely dark or noisy footage, where H.264 needs more bits to avoid banding. For avatars (usually faces, pets, memes), 900k looks good.
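The arithmetic above is easy to reproduce. Here is a minimal sketch, assuming the 2 MB ceiling means 2 × 1024 × 1024 bytes and that 1 kbps is 1,000 bits per second; the function names are made up for illustration and are not part of the bot. It ignores container overhead, so treat the results as estimates.

```python
MAX_BYTES = 2 * 1024 * 1024  # the 2 MB ceiling, taken here as 2 MiB (assumption)


def estimated_size_bytes(bitrate_kbps: float, seconds: float) -> float:
    """Bytes produced by a stream at a given average bitrate (1 kbps = 1000 bit/s)."""
    return bitrate_kbps * 1000 * seconds / 8


def max_total_kbps(seconds: float, budget_bytes: int = MAX_BYTES) -> float:
    """Highest average bitrate, overhead included, that still fits the budget."""
    return budget_bytes * 8 / seconds / 1000


# Reproduce the duration table: size in KiB and share of the 2 MB budget.
for secs, kbps in [(3, 900), (5, 900), (10, 900), (10, 1400)]:
    size = estimated_size_bytes(kbps, secs)
    print(f"{secs:>2} s at {kbps} kbps: {size / 1024:.0f} KiB "
          f"({size / MAX_BYTES:.0%} of the budget)")

print(f"ceiling for a 10 s clip: {max_total_kbps(10):.0f} kbps")
```

The estimate uses the average bitrate only. Short-term peaks up to the maxrate do not change it much as long as the encoder holds the average over the whole clip.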
The ffmpeg command
This is the full pipeline I run for every conversion:
ffmpeg -y -i input.mp4 -t 10 \
-vf "crop=min(iw\,ih):min(iw\,ih),scale=800:800,fps=30,format=yuv420p" \
-c:v libx264 -preset medium \
-b:v 900k -maxrate 1200k -bufsize 2M \
-an -movflags +faststart \
output.mp4
crop=min(iw\,ih):min(iw\,ih) squares the video by cropping to the shorter dimension. Centers the crop automatically.
scale=800:800 resizes to Telegram's exact requirement.
fps=30 normalizes the frame rate. This matters more than you'd expect. iPhone slow-mo videos come in at 120 or 240 fps. Without this filter, you get a 10-second clip with 4x to 8x the frames, blowing the file size.
format=yuv420p forces the pixel format. Some inputs (screen recordings, ProRes) use yuv444p or yuv422p. Telegram needs 420p.
-b:v 900k -maxrate 1200k -bufsize 2M is the rate control trio. Target 900 kbps, peaks capped at 1200 kbps, VBV buffer 2 Mbit (ffmpeg reads -bufsize in bits, not bytes). The buffer size matters: too small and quality oscillates visibly between keyframes.
-an strips audio. Video avatars can't have it, and leaving it in wastes bytes.
-movflags +faststart moves the moov atom to the front of the file. Without this, Telegram sometimes can't seek or preview the video.
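One way to confirm that +faststart took effect is to look at the order of the top-level MP4 boxes: moov should appear before mdat. The sketch below is a minimal checker, not something the bot runs. The names top_level_boxes, is_faststart, and _box are hypothetical, and it only walks box headers rather than validating the file.

```python
import io
import struct
from typing import BinaryIO, List


def top_level_boxes(f: BinaryIO) -> List[str]:
    """Return the four-character types of the top-level MP4 boxes, in file order."""
    boxes = []
    while True:
        header = f.read(8)
        if len(header) < 8:
            break  # end of file (or a truncated header)
        size, box_type = struct.unpack(">I4s", header)
        boxes.append(box_type.decode("latin-1"))
        if size == 0:
            break  # size 0 means "this box runs to the end of the file"
        if size == 1:
            # size 1 means a 64-bit size follows the 8-byte header
            ext = f.read(8)
            if len(ext) < 8:
                break
            payload = struct.unpack(">Q", ext)[0] - 16
        else:
            payload = size - 8
        if payload < 0:
            break  # malformed size field; stop rather than loop forever
        f.seek(payload, io.SEEK_CUR)
    return boxes


def is_faststart(f: BinaryIO) -> bool:
    """True if a moov box appears before the first mdat box."""
    boxes = top_level_boxes(f)
    return ("moov" in boxes and "mdat" in boxes
            and boxes.index("moov") < boxes.index("mdat"))


def _box(box_type: bytes, payload: bytes = b"") -> bytes:
    """Build a tiny synthetic box for the demo below (not a playable file)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


# Synthetic layouts: moov-first (faststart) versus mdat-first.
front = io.BytesIO(_box(b"ftyp", b"isom") + _box(b"moov", b"m" * 4)
                   + _box(b"mdat", b"d" * 16))
back = io.BytesIO(_box(b"ftyp", b"isom") + _box(b"mdat", b"d" * 16)
                  + _box(b"moov", b"m" * 4))
print(is_faststart(front), is_faststart(back))
```

On a real output you would open the file in binary mode and pass the handle to is_faststart.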
Running ffmpeg inside an async bot
The bot uses aiogram 3 with asyncio. Running synchronous ffmpeg subprocesses inside an async event loop needs some care.
import asyncio
import tempfile
from pathlib import Path

from aiogram import Router, F
from aiogram.types import Message, FSInputFile

router = Router()
_SEM = asyncio.Semaphore(4)  # at most four ffmpeg processes at once


@router.message(F.video | F.animation | F.document)
async def on_video(message: Message):
    file = message.video or message.animation or message.document
    if not file:
        return

    async with _SEM:
        with tempfile.TemporaryDirectory() as td:
            inp = Path(td) / "in.bin"
            out = Path(td) / "out.mp4"
            await message.bot.download(file, destination=inp)

            cmd = [
                "ffmpeg", "-y", "-i", str(inp), "-t", "10",
                "-vf", "crop=min(iw\\,ih):min(iw\\,ih),"
                       "scale=800:800,fps=30,format=yuv420p",
                "-c:v", "libx264", "-preset", "medium",
                "-b:v", "900k", "-maxrate", "1200k",
                "-bufsize", "2M", "-an",
                "-movflags", "+faststart", str(out),
            ]
            proc = await asyncio.create_subprocess_exec(
                *cmd,
                stdout=asyncio.subprocess.DEVNULL,
                stderr=asyncio.subprocess.PIPE,
            )
            try:
                _, stderr = await asyncio.wait_for(
                    proc.communicate(), timeout=60
                )
            except asyncio.TimeoutError:
                # wait_for only cancels the await; kill and reap ffmpeg here
                proc.kill()
                await proc.wait()
                await message.reply("Conversion failed, sorry.")
                return

            if proc.returncode != 0:
                await message.reply("Conversion failed, sorry.")
                return

            # Safety net: never send a file above the 2 MB ceiling
            if out.stat().st_size > 2 * 1024 * 1024:
                await message.reply("Output too large. Try a shorter clip.")
                return

            await message.answer_video(FSInputFile(out))
Three things worth calling out here.
asyncio.Semaphore(4) limits concurrent ffmpeg processes to four. Without it, a burst of messages spawns dozens of ffmpeg instances and your VPS runs out of memory. I learned this on a 2 GB droplet when someone forwarded a batch of 20 videos at once.
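asyncio.wait_for with a 60-second timeout bounds how long the bot waits on ffmpeg. Some corrupted inputs cause ffmpeg to hang indefinitely. One caveat: wait_for only cancels the pending communicate() call and raises TimeoutError; it does not stop the child process. To actually get rid of a hung ffmpeg, catch the timeout, call proc.kill(), and await proc.wait() so the process is reaped instead of lingering.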
The explicit file size check after encoding is a safety net. If rate control fails on an unusual input (rare, but I've seen it with certain screen recordings), the bot catches it before sending an oversized file that Telegram would silently reject.
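The timeout behavior is easy to test in isolation, without ffmpeg. The following is a minimal sketch of the pattern, assuming a hypothetical helper named run_with_timeout; a sleeping Python child stands in for a hung encoder.

```python
import asyncio
import sys
from typing import Optional, Sequence, Tuple


async def run_with_timeout(
    cmd: Sequence[str], timeout: float
) -> Optional[Tuple[int, bytes]]:
    """Run cmd; return (returncode, stderr), or None if it exceeded the timeout."""
    proc = await asyncio.create_subprocess_exec(
        *cmd,
        stdout=asyncio.subprocess.DEVNULL,
        stderr=asyncio.subprocess.PIPE,
    )
    try:
        _, stderr = await asyncio.wait_for(proc.communicate(), timeout=timeout)
    except asyncio.TimeoutError:
        # wait_for cancelled communicate(), but the child is still running.
        if proc.returncode is None:
            proc.kill()
        await proc.wait()  # reap the child so it does not linger
        return None
    return proc.returncode, stderr


if __name__ == "__main__":
    # A deliberately slow child stands in for a hung ffmpeg.
    slow = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(asyncio.run(run_with_timeout(slow, timeout=1)))
```

Returning None on timeout keeps the caller's branching simple: None means the process was killed, and a tuple means it finished and the return code can be inspected.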
Edge cases that burned me
iPhone HEVC. About 68% of my users send videos from iPhones, which record in HEVC (H.265) by default. Telegram video avatars require H.264. The ffmpeg command handles this transparently because -c:v libx264 forces a full re-encode regardless of input codec. But if you're checking whether to re-encode, don't trust the file extension. Use ffprobe to read the actual codec.
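Reading the actual codec with ffprobe might look like the sketch below. It is an illustration, not the bot's code: the helper names are hypothetical, and it assumes ffprobe is on PATH. ffprobe reports HEVC as "hevc" and H.264 as "h264" in the codec_name field.

```python
import json
import subprocess
from typing import Optional


def parse_codec(ffprobe_json: str) -> Optional[str]:
    """Pull codec_name of the first listed stream out of ffprobe's JSON output."""
    streams = json.loads(ffprobe_json).get("streams", [])
    return streams[0].get("codec_name") if streams else None


def probe_video_codec(path: str) -> Optional[str]:
    """Ask ffprobe for the codec of the first video stream in the file."""
    result = subprocess.run(
        [
            "ffprobe", "-v", "error",
            "-select_streams", "v:0",            # first video stream only
            "-show_entries", "stream=codec_name",
            "-of", "json",
            path,
        ],
        capture_output=True, text=True, check=True,
    )
    return parse_codec(result.stdout)
```

Splitting the parsing from the subprocess call keeps the logic testable without a media file on hand.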
Variable frame rate. Game capture software (OBS, ShadowPlay) often produces VFR files. The fps=30 filter normalizes this. Without it, ffmpeg's duration estimation goes wrong and you get truncated or weirdly stretched output.
Video stickers. About 11% of users send Telegram video stickers (WebM/VP9). These decode fine, but some stickers are a single frame. The output becomes a 10-second still image. Technically valid, not what anyone wants. I added a frame count check and return a warning for single-frame inputs.
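A single-frame check along these lines can be sketched with ffprobe's -count_frames option, which decodes the stream and reports nb_read_frames. This is one possible approach, not necessarily the check the bot uses; the helper names are hypothetical and ffprobe is assumed to be on PATH.

```python
import json
import subprocess
from typing import Optional


def parse_frame_count(ffprobe_json: str) -> Optional[int]:
    """Return nb_read_frames of the first listed stream, or None if unavailable."""
    streams = json.loads(ffprobe_json).get("streams", [])
    if not streams:
        return None
    raw = str(streams[0].get("nb_read_frames", ""))
    return int(raw) if raw.isdigit() else None  # ffprobe may report "N/A"


def count_frames(path: str) -> Optional[int]:
    """Count decoded frames in the first video stream (decodes the whole stream)."""
    result = subprocess.run(
        [
            "ffprobe", "-v", "error", "-count_frames",
            "-select_streams", "v:0",
            "-show_entries", "stream=nb_read_frames",
            "-of", "json",
            path,
        ],
        capture_output=True, text=True, check=True,
    )
    return parse_frame_count(result.stdout)


def is_single_frame(path: str) -> bool:
    """True only when the count is known and the input has at most one frame."""
    frames = count_frames(path)
    return frames is not None and frames <= 1
```

Counting frames requires a full decode, which is cheap for stickers but worth keeping in mind for long inputs.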
GIFs with absurd frame counts. Some GIFs pack 500+ frames into 3 seconds. The fps=30 filter saves you here too, dropping excess frames before encoding. Without it, the encoder sees way too many frames and the output blows past 2 MB.
Try it
I packaged all of this as @liveavabot. Send any video, GIF, or sticker and it returns a properly formatted video avatar. 55 users so far, still iterating on edge cases.
Built by me, @liveavabot: https://t.me/LiveAvaBot?start=devto_article_20260516