liveavabot

Convert iPhone HEVC Videos to Telegram Video Avatars with ffmpeg

Why Your iPhone Video Won't Work as a Telegram Avatar

Telegram video avatars have strict requirements. The file must be H.264, 800×800, at most 10 seconds long, and at most 2MB, with no audio track. Most people don't find this out until they upload a video and get a vague error or, worse, no response at all.

iPhone records in HEVC (H.265) by default since iOS 11. Telegram accepts HEVC for regular video messages but silently rejects it for video avatars. You upload, nothing changes, no useful error message.

I ran into this while setting up a profile avatar and spent 20 minutes debugging before I realized the codec was the problem.

What Telegram Requires

The video avatar spec, confirmed through testing:

  • Container: MP4
  • Codec: H.264 (libx264), yuv420p pixel format
  • Resolution: 800×800 (square, not 16:9)
  • Duration: 10 seconds maximum
  • File size: 2MB maximum
  • Audio: must be absent

The yuv420p requirement is the sneaky one. Encode with yuv444p or leave the pixel format at the camera default, and Telegram will reject the file without telling you why.
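The checklist above can be turned into a quick pre-upload check. The following is a minimal sketch, not part of the bot: the function names `check_avatar_spec` and `probe_file` are hypothetical, and it assumes `ffprobe` is on the PATH. It does not check the container, only the stream and size properties.

```python
import json
import subprocess
from pathlib import Path

MAX_BYTES = 2 * 1024 * 1024  # 2MB ceiling from the spec list
MAX_SECONDS = 10.0           # duration ceiling from the spec list


def check_avatar_spec(probe: dict, size_bytes: int) -> list:
    """Return a list of spec violations; an empty list means no problems found.

    `probe` is the parsed JSON from `ffprobe -show_streams -show_format -of json`.
    """
    streams = probe.get("streams", [])
    video = [s for s in streams if s.get("codec_type") == "video"]
    audio = [s for s in streams if s.get("codec_type") == "audio"]
    if not video:
        return ["no video stream"]

    problems = []
    v = video[0]
    if v.get("codec_name") != "h264":
        problems.append(f"codec is {v.get('codec_name')}, expected h264")
    if v.get("pix_fmt") != "yuv420p":
        problems.append(f"pixel format is {v.get('pix_fmt')}, expected yuv420p")
    if (v.get("width"), v.get("height")) != (800, 800):
        problems.append(f"size is {v.get('width')}x{v.get('height')}, expected 800x800")
    if audio:
        problems.append("audio track present")
    # ffprobe reports the duration as a string, e.g. "9.966667"
    duration = float(probe.get("format", {}).get("duration", 0.0))
    if duration > MAX_SECONDS:
        problems.append(f"duration is {duration:.2f}s, limit is {MAX_SECONDS:.0f}s")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_BYTES}")
    return problems


def probe_file(path: Path) -> dict:
    """Run ffprobe on a file and return its JSON report."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-show_streams", "-show_format",
         "-of", "json", str(path)],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)
```

Usage would look like `check_avatar_spec(probe_file(p), p.stat().st_size)` for some `Path` `p`; an untouched iPhone clip should come back with several entries, starting with the codec.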

The ffmpeg Pipeline

The tricky part is the square crop. Most videos are 16:9 or 9:16. You need to find the largest centered square, crop to it, scale to 800×800, then re-encode.

The command I use:

# Inspect source (optional)
ffprobe -v error -select_streams v:0 \
  -show_entries stream=width,height,codec_name \
  -of default=noprint_wrappers=1 input.mp4

# Convert to Telegram video avatar format
ffmpeg -i input.mp4 \
  -vf "crop=ih:ih,scale=800:800,fps=30,format=yuv420p" \
  -c:v libx264 \
  -preset slow \
  -crf 28 \
  -movflags +faststart \
  -an \
  -t 10 \
  output.mp4

Filter chain breakdown:

crop=ih:ih takes a centered square using the video height as both dimensions. Works for landscape source. For portrait or mixed input, use crop=min(iw\,ih):min(iw\,ih).

scale=800:800 resizes to exactly 800×800.

fps=30 normalizes frame rate. Telegram is picky about variable frame rates, and iPhones record VFR in some modes.

format=yuv420p forces the required pixel format.

-an removes audio entirely.

-t 10 hard-caps at 10 seconds.

-movflags +faststart moves the moov atom to the start of the file, so playback can begin before the whole file has downloaded.
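To make the crop step concrete, here is the arithmetic behind a centered square crop, written out as a small sketch. The helper names are hypothetical; the offsets mirror the crop filter's default behavior of centering the output when no x and y are given.

```python
def centered_square_crop(width: int, height: int) -> tuple:
    """Return (side, x, y) for the largest centered square in a width x height frame."""
    side = min(width, height)   # the shorter dimension limits the square
    x = (width - side) // 2     # horizontal offset of the square's left edge
    y = (height - side) // 2    # vertical offset of the square's top edge
    return side, x, y


def explicit_crop_filter(width: int, height: int) -> str:
    """Spell out the same crop as an explicit crop=w:h:x:y filter string."""
    side, x, y = centered_square_crop(width, height)
    return f"crop={side}:{side}:{x}:{y}"


print(explicit_crop_filter(1920, 1080))  # landscape: crop=1080:1080:420:0
print(explicit_crop_filter(1080, 1920))  # portrait:  crop=1080:1080:0:420
```

A 1080×1920 portrait clip shows why crop=ih:ih breaks there: it would ask for a 1920-wide square from a frame that is only 1080 wide.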

CRF 28 with slow preset gives a reasonable size-to-quality tradeoff for the 2MB limit. High-motion clips may need CRF 30-32 to stay under that ceiling.

ffmpeg handles HEVC decoding automatically. No extra flags needed for iPhone input.
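The CRF 30-32 suggestion can be automated by re-encoding at increasing CRF values until the output fits. This is a rough sketch rather than what the bot does: `encode_once` and `encode_under_limit` are hypothetical names, the CRF ladder is an assumption, and the filter uses the min(iw,ih) crop variant so portrait input also works.

```python
import subprocess
from pathlib import Path
from typing import Callable, Iterable, Optional

MAX_BYTES = 2 * 1024 * 1024  # 2MB avatar limit


def encode_once(src: Path, dst: Path, crf: int) -> bool:
    """Run the avatar encode at one CRF value; True if ffmpeg exited cleanly."""
    cmd = [
        "ffmpeg", "-y", "-i", str(src),
        "-vf", r"crop=min(iw\,ih):min(iw\,ih),scale=800:800,fps=30,format=yuv420p",
        "-c:v", "libx264", "-preset", "slow", "-crf", str(crf),
        "-movflags", "+faststart", "-an", "-t", "10",
        str(dst),
    ]
    done = subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return done.returncode == 0


def encode_under_limit(
    src: Path,
    dst: Path,
    crfs: Iterable[int] = (28, 30, 32),
    encode: Callable[[Path, Path, int], bool] = encode_once,
    max_bytes: int = MAX_BYTES,
) -> Optional[int]:
    """Try each CRF in order and return the first one whose output fits, else None."""
    for crf in crfs:
        if not encode(src, dst, crf):
            return None  # a failed encode will not be fixed by a higher CRF
        if dst.stat().st_size <= max_bytes:
            return crf
    return None
```

Each extra attempt costs a full encode at the slow preset, so a short ladder like this trades CPU time for fewer "try a shorter clip" replies.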

The aiogram 3 Handler

A simplified version of the bot handler:

import asyncio
import tempfile
from pathlib import Path

from aiogram import Bot, Dispatcher, F
from aiogram.types import Message, BufferedInputFile

bot = Bot(token="YOUR_TOKEN")
dp = Dispatcher()

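async def convert_to_avatar(input_path: Path, output_path: Path) -> bool:
    # min(iw, ih) picks the shorter side, so the centered square crop works for
    # both portrait and landscape sources. The commas are escaped because a bare
    # comma separates filters in an ffmpeg filter chain.
    vf = r"crop=min(iw\,ih):min(iw\,ih),scale=800:800,fps=30,format=yuv420p"
    cmd = [
        "ffmpeg", "-y", "-i", str(input_path),
        "-vf", vf,
        "-c:v", "libx264", "-preset", "slow", "-crf", "28",
        "-movflags", "+faststart", "-an", "-t", "10",
        str(output_path),
    ]
    # Run ffmpeg without blocking the event loop; its output is discarded.
    proc = await asyncio.create_subprocess_exec(
        *cmd,
        stdout=asyncio.subprocess.DEVNULL,
        stderr=asyncio.subprocess.DEVNULL,
    )
    await proc.wait()
    # Success means ffmpeg exited cleanly and the result fits the 2MB limit.
    return proc.returncode == 0 and output_path.stat().st_size <= 2 * 1024 * 1024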

@dp.message(F.video | F.animation | F.document)
async def handle_video(message: Message):
    await message.answer("Converting, hang on...")

    file_obj = message.video or message.animation or message.document
    file = await bot.get_file(file_obj.file_id)

    with tempfile.TemporaryDirectory() as tmp:
        input_path = Path(tmp) / "input.mp4"
        output_path = Path(tmp) / "output.mp4"
        await bot.download_file(file.file_path, destination=str(input_path))
        ok = await convert_to_avatar(input_path, output_path)
        if not ok:
            await message.answer("Conversion failed or output exceeded 2MB. Try a shorter clip.")
            return
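        data = output_path.read_bytes()
        await message.answer_video(
            BufferedInputFile(data, filename="avatar.mp4"),
            caption="Done. Set this as your Telegram video avatar in profile settings.",
        )

if __name__ == "__main__":
    # Start long polling so the handler above receives updates.
    asyncio.run(dp.start_polling(bot))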

F.document catches videos sent as files rather than Telegram video messages. iPhone users sometimes share via "Send as File" to preserve quality, which changes the message type.
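Because F.document matches any file, including PDFs and archives, it can help to screen documents before downloading them. A minimal sketch of such a check follows; `looks_like_video` is a hypothetical helper, and the extension list is an assumption rather than anything Telegram defines.

```python
from typing import Optional

# Extensions treated as video when no MIME type is available (assumed list)
VIDEO_EXTENSIONS = (".mp4", ".mov", ".m4v")


def looks_like_video(mime_type: Optional[str], file_name: Optional[str] = None) -> bool:
    """Guess whether a document is a video from its MIME type or file name."""
    if mime_type:
        return mime_type.lower().startswith("video/")
    if file_name:
        return file_name.lower().endswith(VIDEO_EXTENSIONS)
    return False
```

In the handler this could be called with `message.document.mime_type` and `message.document.file_name` before `bot.get_file`, replying with a short hint when it returns False. iPhone files sent this way typically arrive as video/quicktime, which passes the check.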

The size check after encoding matters. Even after trimming to 10 seconds and scaling to 800×800, a high-motion clip at CRF 28 can still exceed 2MB. The handler returns a useful error instead of uploading an oversized file that Telegram would silently reject.

Packaging It as @liveavabot

I wrapped this pipeline into a production bot with a few additions: deduplication by content hash to avoid processing the same file twice, per-user rate limiting, and retry logic for ffmpeg edge cases.
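Deduplication by content hash can be sketched in a few lines. This is an illustration only: the class `ConversionCache` is hypothetical, an in-memory dict stands in for the PostgreSQL table, and storing the Telegram file_id of the converted result is an assumption about what is worth caching.

```python
import hashlib
from typing import Dict, Optional


class ConversionCache:
    """Map the SHA-256 of an uploaded file to the file_id of its converted avatar."""

    def __init__(self) -> None:
        self._by_hash: Dict[str, str] = {}

    @staticmethod
    def content_key(data: bytes) -> str:
        # Hash the raw bytes so renamed or re-sent copies map to the same key
        return hashlib.sha256(data).hexdigest()

    def get(self, data: bytes) -> Optional[str]:
        """Return the cached result for these bytes, or None if unseen."""
        return self._by_hash.get(self.content_key(data))

    def put(self, data: bytes, result_file_id: str) -> None:
        """Remember the converted result for these bytes."""
        self._by_hash[self.content_key(data)] = result_file_id


cache = ConversionCache()
upload = b"example video bytes"
if cache.get(upload) is None:
    # ... run the ffmpeg conversion here, then remember the result ...
    cache.put(upload, "hypothetical-file-id")
```

Keying on content rather than on Telegram's file_id means the same clip forwarded by two different users is only encoded once.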

The bot also handles Telegram's animation type, which is how Telegram stores GIFs internally as MP4 files. The same ffmpeg pipeline works without changes.

Currently at 99 users. Running on a small VPS with system ffmpeg.

Try it: https://t.me/LiveAvaBot?start=devto_article_20260601

Built by me. Architecture: aiogram 3, asyncio subprocesses for ffmpeg, PostgreSQL for deduplication state.

Edge Cases Worth Knowing

Variable frame rate: iPhone records VFR in some modes. The fps=30 filter normalizes this. Without it, some encoders produce stuttering on certain devices.
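For diagnostics, one common heuristic is to compare the `r_frame_rate` and `avg_frame_rate` fields that ffprobe reports for a stream: a noticeable gap between them hints at VFR. The sketch below is only that heuristic, not a definitive test; the function names and the 1% tolerance are assumptions.

```python
from fractions import Fraction
from typing import Optional


def parse_rate(rate: str) -> Optional[Fraction]:
    """Parse an ffprobe rate such as '30000/1001'; None if missing or invalid."""
    num, _, den = rate.partition("/")
    try:
        n, d = int(num), int(den or 1)
    except ValueError:
        return None
    if d == 0:
        return None  # ffprobe prints '0/0' when a rate is unknown
    return Fraction(n, d)


def probably_vfr(r_frame_rate: str, avg_frame_rate: str,
                 tolerance: Fraction = Fraction(1, 100)) -> bool:
    """Flag a stream whose average rate differs from its base rate by more than tolerance."""
    r = parse_rate(r_frame_rate)
    avg = parse_rate(avg_frame_rate)
    if r is None or avg is None or r == 0:
        return False  # not enough information to say anything
    return abs(r - avg) / r > tolerance
```

Since the pipeline applies fps=30 unconditionally, a check like this is only useful for logging which inputs needed normalizing.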

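Square GIFs with letterboxing: Some animated GIFs are already square but have letterboxing baked into the frame, and portrait clips are narrower than they are tall. crop=ih:ih assumes a landscape frame and fails when the width is smaller than the height, so I switched to min(iw,ih) as the crop dimension to handle both landscape and portrait sources correctly. Baked-in bars are part of the picture, though, so a centered crop alone does not remove them.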

The 2MB ceiling: At 10 seconds and 800×800, you have roughly 200KB/s budget. Fast-motion clips hit this reliably. I haven't added two-pass encoding yet, so the current bot just asks users to try a shorter clip.
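The budget arithmetic is worth spelling out, since it is what a two-pass encode would target: 2MB over 10 seconds is about 200KB/s, or roughly 1.6 Mbit/s. Below is a sketch of how that could feed a two-pass libx264 encode. It is not the bot's code (two-pass is not implemented there yet); the function names, the 5% safety margin for container overhead, and the pass log name are assumptions.

```python
import os
from pathlib import Path


def target_video_kbps(max_bytes: int = 2 * 1024 * 1024,
                      seconds: float = 10.0,
                      margin: float = 0.95) -> int:
    """Video bitrate in kbit/s that fits max_bytes over the given duration."""
    usable_bits = max_bytes * 8 * margin  # leave headroom for the MP4 container
    return int(usable_bits / seconds / 1000)


def two_pass_commands(src: Path, dst: Path, kbps: int,
                      passlog: str = "ffmpeg2pass") -> tuple:
    """Build the two ffmpeg invocations for a bitrate-targeted encode."""
    common = [
        "ffmpeg", "-y", "-i", str(src), "-t", "10",
        "-vf", r"crop=min(iw\,ih):min(iw\,ih),scale=800:800,fps=30,format=yuv420p",
        "-c:v", "libx264", "-preset", "slow", "-b:v", f"{kbps}k",
        "-passlogfile", passlog, "-an",
    ]
    # Pass 1 only gathers statistics, so its video output is discarded
    first = common + ["-pass", "1", "-f", "null", os.devnull]
    # Pass 2 uses those statistics to hit the target bitrate
    second = common + ["-pass", "2", "-movflags", "+faststart", str(dst)]
    return first, second


print(target_video_kbps())  # 1593
```

Unlike CRF, a bitrate target makes the output size predictable, at the cost of lower quality on clips that would have fit comfortably anyway.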

Corrupted audio streams: Some MP4 files labeled as GIFs carry corrupted audio tracks. Adding -ignore_unknown to the ffmpeg flags handles these without aborting the whole encode.

"Sent as File" videos: When a user sends a video as a document, Telegram skips its own transcoding. The file arrives as-is, usually raw HEVC from the camera. The F.document handler catches this case.

What's Next

Two-pass encoding for the oversized case is on the list. After that, possibly a web frontend that doesn't require Telegram at all.

ffmpeg handles the hard parts. I wrote the wrapper and the bot scaffolding.
