The Pain
You record a quick clip on iPhone, drag it into Telegram as a profile video, and Telegram just... drops it. No error popup. No "wrong format" toast. The upload spinner finishes, your avatar stays the old one. I hit this exact thing trying to set a custom video avatar last summer, and a friend hit it the same week with a clip from her iPhone 13.
The issue is HEVC. Since iOS 11, the iPhone camera records H.265/HEVC by default. Telegram's video-avatar endpoint accepts only H.264 in an MP4 container. The mobile client silently refuses anything else for that specific endpoint, even though normal video messages accept HEVC fine. The result: a UX dead end the user has zero way to diagnose.
So I built a bot around fixing exactly this.
What Telegram Actually Requires
The video-avatar spec, derived from setProfilePhoto and the observed behavior of MTProto clients, looks like this:
- Container: MP4 with the moov atom at the start (faststart).
- Video codec: H.264, yuv420p pixel format, 8-bit.
- Resolution: 800x800, square, dead exact.
- Duration: at most 10 seconds.
- File size: under 2 MB.
- Audio: none. Strip the audio track entirely.
Miss any one of those and the client silently fails. The 800x800 requirement is the most surprising part, because Telegram's docs phrase it as "recommended" while the client treats it as mandatory.
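Since the client gives no feedback, it helps to check a file against the list above before uploading. The following is a minimal sketch, not the bot's actual code: it assumes ffprobe is on PATH, the helper names `probe` and `spec_violations` are hypothetical, and "2 MB" is read as 2 MiB. It does not check where the moov atom sits, because ffprobe's JSON output does not report that.

```python
import json
import subprocess

MAX_BYTES = 2 * 1024 * 1024  # assumption: "2 MB" interpreted as 2 MiB
MAX_SECONDS = 10.0


def probe(path: str) -> dict:
    """Return ffprobe's JSON description of the container and streams."""
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-show_format", "-show_streams",
         "-of", "json", path],
        capture_output=True, text=True, check=True,
    ).stdout
    return json.loads(out)


def spec_violations(info: dict) -> list[str]:
    """List every way the probed file misses the video-avatar spec."""
    problems = []
    streams = info.get("streams", [])
    video = [s for s in streams if s.get("codec_type") == "video"]
    audio = [s for s in streams if s.get("codec_type") == "audio"]
    fmt = info.get("format", {})

    if len(video) != 1:
        problems.append("expected exactly one video stream")
    else:
        v = video[0]
        if v.get("codec_name") != "h264":
            problems.append(f"codec is {v.get('codec_name')}, not h264")
        if v.get("pix_fmt") != "yuv420p":
            problems.append(f"pixel format is {v.get('pix_fmt')}, not yuv420p")
        if (v.get("width"), v.get("height")) != (800, 800):
            problems.append(f"resolution is {v.get('width')}x{v.get('height')}")
    if audio:
        problems.append("audio track present")
    # ffprobe reports duration and size as strings.
    if float(fmt.get("duration", 0)) > MAX_SECONDS:
        problems.append("longer than 10 seconds")
    if int(fmt.get("size", 0)) >= MAX_BYTES:
        problems.append("file is 2 MB or larger")
    return problems
```

Calling `spec_violations(probe("output.mp4"))` returns an empty list when every checked requirement holds, and one human-readable line per miss otherwise.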
Solving It With ffmpeg
ffmpeg does the heavy lifting here; I just wrote the wrapper. It takes two passes: first a cropdetect run to figure out where the content actually sits, then an encode pass that crops, scales, drops audio, and writes a faststart MP4.
# pass 1: figure out the crop box
ffmpeg -i input.mov -vf cropdetect=24:16:0 -t 3 -f null - 2>&1 \
  | grep -oE 'crop=[0-9:]+' | tail -1
# => crop=1080:1080:0:420

# pass 2: encode to the TG video-avatar spec
ffmpeg -i input.mov \
  -t 9.8 \
  -vf "crop=1080:1080:0:420,scale=800:800,format=yuv420p" \
  -c:v libx264 -profile:v high -level 4.0 \
  -preset veryfast -crf 26 \
  -movflags +faststart \
  -an \
  -y output.mp4
Notes that bit me:
- -t 9.8 instead of -t 10. Telegram rounds duration and rejects clips that present as 10.001s.
- format=yuv420p. iPhone HEVC is often yuv420p10le (10-bit). H.264 decoders on older Android clients choke on 10-bit. Always force 8-bit.
- -an strips audio. The container is still MP4, just with no audio track. Telegram does not want an empty audio track, it wants zero audio tracks.
- -movflags +faststart. Without this the moov atom lands at the end of the file and the Telegram uploader rejects the whole thing as corrupt.
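The faststart point is easy to verify without ffmpeg, because an MP4 file is a sequence of boxes, each starting with a 4-byte big-endian size and a 4-byte type. The sketch below walks the top-level boxes and checks that moov comes before mdat. The function names are hypothetical, and it reads the whole file into memory, which is fine for outputs under 2 MB.

```python
import struct


def top_level_boxes(data: bytes) -> list[str]:
    """Return the four-character types of the top-level MP4 boxes, in order."""
    types = []
    pos = 0
    while pos + 8 <= len(data):
        size, box_type = struct.unpack(">I4s", data[pos:pos + 8])
        header = 8
        if size == 1:
            # A size of 1 means a 64-bit size follows the type field.
            if pos + 16 > len(data):
                break
            size = struct.unpack(">Q", data[pos + 8:pos + 16])[0]
            header = 16
        elif size == 0:
            # A size of 0 means the box runs to the end of the file.
            size = len(data) - pos
        if size < header:
            break  # malformed box; stop rather than loop forever
        types.append(box_type.decode("latin-1"))
        pos += size
    return types


def is_faststart(data: bytes) -> bool:
    """True when the moov box precedes the mdat box."""
    boxes = top_level_boxes(data)
    return (
        "moov" in boxes
        and "mdat" in boxes
        and boxes.index("moov") < boxes.index("mdat")
    )
```

For a real file, `is_faststart(pathlib.Path("output.mp4").read_bytes())` should return True after an encode with -movflags +faststart.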
For the size cap, -crf 26 lands most clips under 2 MB after the 800x800 downscale. If it overshoots, I re-encode at -crf 30 and try again.
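The retry can be sketched as a small loop over CRF values. This is an illustration under assumptions rather than the bot's real code: `encode_args` and `encode_under_cap` are hypothetical names, "2 MB" is read as 2 MiB, and the `run` parameter exists only so the loop can be exercised without ffmpeg installed.

```python
import os
import subprocess

MAX_BYTES = 2 * 1024 * 1024  # assumption: "2 MB" interpreted as 2 MiB


def encode_args(src: str, dst: str, crop: str, crf: int) -> list[str]:
    """Build the pass-2 ffmpeg command for a given crop box and CRF."""
    return [
        "ffmpeg", "-i", src, "-t", "9.8",
        "-vf", f"{crop},scale=800:800,format=yuv420p",
        "-c:v", "libx264", "-profile:v", "high", "-level", "4.0",
        "-preset", "veryfast", "-crf", str(crf),
        "-movflags", "+faststart", "-an", "-y", dst,
    ]


def encode_under_cap(src, dst, crop, crfs=(26, 30), run=subprocess.run):
    """Try each CRF in order; return the first whose output fits, else None."""
    for crf in crfs:
        result = run(encode_args(src, dst, crop, crf), capture_output=True)
        if result.returncode == 0 and os.path.getsize(dst) < MAX_BYTES:
            return crf
    return None
```

A None result means even the highest CRF in the list overshot, at which point the caller can report the failure instead of sending a file Telegram will drop.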
The aiogram 3 Handler
The bot is built on aiogram 3 with a tiny FSM. The handler receives a video, GIF, video_note, or document, dispatches to ffmpeg in a background thread, then sends the result back as a document so Telegram does not re-compress it on the way through.
from aiogram import Router, F
from aiogram.types import Message, FSInputFile
from aiogram.enums import ContentType
import asyncio, tempfile, pathlib

router = Router()


@router.message(F.content_type.in_({
    ContentType.VIDEO,
    ContentType.ANIMATION,
    ContentType.VIDEO_NOTE,
    ContentType.DOCUMENT,
}))
async def handle_video(message: Message):
    # Pick whichever attachment the message actually carries.
    file = (
        message.video
        or message.animation
        or message.video_note
        or message.document
    )
    if not file:
        return

    # The reply is sent inside the with-block: the temp directory,
    # and out.mp4 with it, is deleted as soon as the block exits.
    with tempfile.TemporaryDirectory() as td:
        td = pathlib.Path(td)
        src = td / "in.bin"
        dst = td / "out.mp4"
        await message.bot.download(file, destination=src)
        status = await message.answer("converting...")
        # run_ffmpeg blocks, so run it off the event loop.
        rc = await asyncio.to_thread(run_ffmpeg, str(src), str(dst))
        if rc != 0 or not dst.exists():
            await status.edit_text("ffmpeg failed, source may be corrupt.")
            return
        await message.answer_document(
            FSInputFile(dst),
            caption="set this as your profile video in telegram settings."
        )
        await status.delete()
run_ffmpeg is a blocking subprocess call that runs the two passes from the previous section. Keeping it inside asyncio.to_thread means a slow clip does not block the polling loop, which matters when twenty users hit the bot at the same time.
One detail: I send the result as answer_document, not answer_video. If you send it as a video, mobile clients will sometimes re-encode the file during upload-by-reference, which can break the exact 800x800 spec. Documents are passed through byte-for-byte.
Packaging It As @liveavabot
I wrapped this whole pipeline into a Telegram bot, LiveAvaBot. Send any video, GIF, video message, or screen recording, and you get back an MP4 you can set in Telegram Settings (Edit Profile, Set Public Photo, Video). No login or API key needed, and no Telegram Premium required.
Stack:
- Python 3.11 with aiogram 3 for the bot framework.
- ffmpeg 6 inside a thin Docker layer.
- SQLite for usage counters, no user content stored, just a counter per chat_id.
- systemd timer for daily ops reports, because I'm too lazy to set up Grafana for a 200-user side project.
Current stats while I'm writing this: 169 users, a handful of conversions per day, running on a single Hetzner CX11. Free to use with no ads or paywall. If it grows enough that I need to add Stars payments I will, but a bot that does one thing well should stay simple.
Edge Cases And What I Want To Fix Next
What still bites me:
- 4K HDR clips from iPhone 15 Pro. The HDR tone-mapping to SDR needs zscale, which is not in the default ffmpeg static build. Workaround for now: I detect the color space and reject with a friendly message.
- Clips longer than 10 seconds. I trim from the start. Should probably let users pick the start point through an inline button after upload.
- Vertical videos shorter than 800px. Currently I upscale, which looks soft. Considering padding to square with a blurred background instead.
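For the HDR case, one plausible way to do the detection is to look at the transfer function ffprobe reports for the video stream. This is a guess at the approach, not the bot's actual check: the function names are hypothetical, and it treats only PQ (smpte2084) and HLG (arib-std-b67) as HDR.

```python
import json
import subprocess

# Transfer characteristics as ffprobe names them: PQ and HLG.
HDR_TRANSFERS = {"smpte2084", "arib-std-b67"}


def first_video_stream(path: str) -> dict:
    """Return ffprobe's color metadata for the first video stream."""
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "v:0",
         "-show_entries", "stream=color_transfer,color_primaries,pix_fmt",
         "-of", "json", path],
        capture_output=True, text=True, check=True,
    ).stdout
    streams = json.loads(out).get("streams", [])
    return streams[0] if streams else {}


def is_hdr(video_stream: dict) -> bool:
    """True when the stream carries an HDR transfer function."""
    return video_stream.get("color_transfer") in HDR_TRANSFERS
```

A handler could call `is_hdr(first_video_stream(src))` before encoding and reply with the friendly rejection when it returns True.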
If you have a weird video the bot refuses, send it over and ping me. Sample files are the best bug reports.
I built @liveavabot, and the code above is roughly the real handler with the auth and logging stripped out. Happy to answer questions in the comments.