DEV Community

Asen Mitrev

Self-hosted video creation is coming

Day 10 of migrating video creation to an entirely self-hosted model. The goal is to open-source video compilation with AI.

The video below was made using only two API dependencies: ElevenLabs for voice and Vertex AI for multimodal embeddings.

The rest runs locally. The LLM is Qwen 3.6 27B, which has been impeccable for agentic tasks like scriptwriting and RAG.

I'm hoping to switch to Qwen VL embeddings next, so embedding costs go to the local power plant instead, at a significant discount.
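One way to keep that kind of switch cheap is to have the retrieval step take the embedding function as a parameter, so moving from a hosted API to a local model changes one argument. This is a minimal sketch of the idea, not the actual pipeline: `rank_clips`, `cosine`, and `toy_embed` are hypothetical names, and `toy_embed` is a toy stand-in for a real embedding call (hosted or local).

```python
import math
from typing import Callable, List, Sequence

# An embedder maps a piece of content (here, a text description) to a vector.
# In a real pipeline this would wrap a hosted API or a local model (assumption).
Embedder = Callable[[str], Sequence[float]]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_clips(query: str, clips: List[str], embed: Embedder) -> List[str]:
    """Order clip descriptions by similarity to the query, best match first."""
    q = embed(query)
    scored = [(cosine(q, embed(clip)), clip) for clip in clips]
    return [clip for _, clip in sorted(scored, key=lambda p: p[0], reverse=True)]


def toy_embed(text: str) -> List[float]:
    """Hypothetical stand-in embedder: a 26-dim letter-frequency vector."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return counts


if __name__ == "__main__":
    # Swapping providers means passing a different `embed` callable here.
    print(rank_clips("cat", ["dog", "cat video", "zzz"], toy_embed))
```

Because the ranking code only sees a callable, the rest of the pipeline would not need to know whether the vectors came from an API or from a model running on local hardware.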

There's still no capable open-source model for text-to-speech, although if you know of one, drop it in a comment. It's the last missing piece.
