DEV Community

Glenn Sonna for Xybrid

Posted on • Originally published at xybrid.ai

Run AI Models On-Device — Zero Config, Five Minutes

You already know why on-device AI matters. Privacy, latency, cost. You've read the guides.

Now you want to actually do it. Here's what that looks like with Xybrid — no tensor shapes, no preprocessing scripts, no ML expertise.


Install

# macOS / Linux
curl -sSL https://raw.githubusercontent.com/xybrid-ai/xybrid/master/install.sh | sh
# Windows (PowerShell)
irm https://raw.githubusercontent.com/xybrid-ai/xybrid/master/install.ps1 | iex
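
Piping a remote script straight into a shell is quick, but you may prefer to read it first. Here is a minimal sketch of the macOS / Linux install done in two steps, followed by a PATH check. It assumes the installer leaves a binary named `xybrid` on your PATH, which the commands later in this post suggest. The helper names `fetch_installer` and `require_cmd` and the local filename are illustrative, not part of Xybrid.

```shell
INSTALL_URL="https://raw.githubusercontent.com/xybrid-ai/xybrid/master/install.sh"

# Download the installer to a file instead of piping it into sh.
# -f makes curl fail on HTTP errors rather than saving an error page.
fetch_installer() {
  curl -fsSL -o "${1:-xybrid-install.sh}" "$INSTALL_URL"
}

# Report whether a command is available on PATH (hypothetical helper).
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: not found on PATH" >&2
    return 1
  fi
}

# Usage:
#   fetch_installer            # saves ./xybrid-install.sh
#   less xybrid-install.sh     # read what it does before running it
#   sh xybrid-install.sh
#   require_cmd xybrid         # assumes the installer puts `xybrid` on PATH
```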

Text-to-Speech

xybrid run --model kokoro-82m --input "Hello from the edge" --output hello.wav

That's it. Xybrid resolved the model from the registry, downloaded it, ran inference, and saved a WAV file. You configured nothing.

Kokoro is an 82M-parameter TTS model with 24 voices. The first run downloads about 80MB and caches it locally, so subsequent runs skip the download.
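
Because the command takes its text on `--input` and its destination on `--output`, it is easy to script. Here is a rough sketch that synthesizes one WAV file per non-empty line of a text file, using only the flags shown above. The function name `speak_lines` and the `line-001.wav` naming scheme are made up for illustration.

```shell
# Synthesize one WAV file per non-empty line of a text file.
# Usage: speak_lines script.txt out_dir
speak_lines() {
  input_file=$1
  out_dir=${2:-.}
  mkdir -p "$out_dir" || return 1
  n=0
  # `|| [ -n "$line" ]` keeps a final line that lacks a trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    [ -n "$line" ] || continue          # skip blank lines
    n=$((n + 1))
    out=$(printf '%s/line-%03d.wav' "$out_dir" "$n")
    # </dev/null stops the command from consuming the rest of the input file.
    xybrid run --model kokoro-82m --input "$line" --output "$out" </dev/null || return 1
  done <"$input_file"
  echo "wrote $n file(s) to $out_dir"
}
```

Calling `speak_lines script.txt audio` would then produce `audio/line-001.wav`, `audio/line-002.wav`, and so on, stopping at the first failed run.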

Speech Recognition

xybrid run --model whisper-tiny --input recording.wav

Whisper Tiny transcribes audio in real time on any modern laptop and outputs plain text.
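
Plain-text output makes batch transcription a short loop. The sketch below assumes the transcript is printed on standard output (the post says only that the output is plain text) and writes each one next to its audio file. `transcribe_dir` is a hypothetical helper name.

```shell
# Transcribe every .wav in a directory, saving each transcript beside its audio.
# Assumes `xybrid run` prints the transcript on standard output.
transcribe_dir() {
  dir=${1:-.}
  count=0
  for wav in "$dir"/*.wav; do
    [ -e "$wav" ] || continue           # glob matched nothing
    txt=${wav%.wav}.txt
    if xybrid run --model whisper-tiny --input "$wav" >"$txt"; then
      count=$((count + 1))
    else
      echo "failed: $wav" >&2
      rm -f "$txt"                      # don't leave a partial transcript
    fi
  done
  echo "transcribed $count file(s)"
}
```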

Text Generation

xybrid run --model qwen3.5-0.8b --input "Explain quantum computing in one sentence"

Qwen 3.5 0.8B runs locally via llama.cpp. It supports 201 languages and fits in 500MB when quantized.
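
Since every model is driven by the same `run` command, two of them can be chained. The sketch below asks the LLM a question and hands the answer to Kokoro. It assumes a text model prints only the generated text on standard output, which is worth checking on your install. `ask_and_speak` is an illustrative name.

```shell
# Ask the local LLM a question, then read the answer aloud into a WAV file.
# Assumes `xybrid run` for a text model prints only the generated text on stdout.
ask_and_speak() {
  prompt=$1
  out=${2:-answer.wav}
  answer=$(xybrid run --model qwen3.5-0.8b --input "$prompt") || return 1
  [ -n "$answer" ] || { echo "empty answer" >&2; return 1; }
  printf '%s\n' "$answer"
  xybrid run --model kokoro-82m --input "$answer" --output "$out"
}

# Usage:
#   ask_and_speak "Explain quantum computing in one sentence" answer.wav
```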

Browse the Registry

xybrid models list

The registry holds 25+ models, all hosted on Hugging Face, downloaded on demand, and cached locally:

| Model | Task | Size (parameters) | Notes |
| --- | --- | --- | --- |
| kokoro-82m | Text-to-Speech | 82M | 24 voices, high quality |
| kitten-tts-nano-0.8 | Text-to-Speech | 15M | Ultra-lightweight |
| qwen3-tts-0.6b | Text-to-Speech | 600M | Multilingual |
| whisper-tiny | Speech Recognition | 39M | Real-time, multilingual |
| wav2vec2-base-960h | Speech Recognition | 95M | CTC-based |
| lfm2.5-350m | Text Generation | 354M | 9 languages, edge-optimized |
| smollm2-360m | Text Generation | 360M | Best tiny LLM |
| qwen3.5-0.8b | Text Generation | 800M | 201 languages |
| gemma-4-e2b | Text Generation | 5.1B | Multimodal |
| mistral-7b | Text Generation | 7B | Function calling |
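
To narrow the list to one task, you can filter it with standard tools. This sketch assumes `xybrid models list` prints line-oriented text with the task name on each line. The real output format may differ, and `find_models` is a hypothetical helper.

```shell
# Print registry entries whose line mentions a keyword (case-insensitive).
# Assumes `xybrid models list` prints one model per line.
find_models() {
  xybrid models list | grep -i -- "$1"
}

# Usage:
#   find_models speech
```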

Beyond the CLI

The CLI is the fastest way to evaluate. When you're ready to integrate into an app, Xybrid has SDKs for Flutter, Swift, Kotlin, Unity, and Rust — same models, same behavior, every platform.


Xybrid is in beta (v0.1.0-beta9), open-source under Apache 2.0.

GitHub: github.com/xybrid-ai/xybrid


Questions? Drop them in the comments — happy to help you get running.
