Have you ever wondered why the same prompt costs more in one language than another? Or why a model feels "smarter" in English but struggles with Arabic or Chinese?
When working with LLMs, we often treat the response as a black box. We see the output, but we don't see the mechanics: how the text is tokenized, how different model families handle the same input side by side, or how different writing systems affect performance.
I built LLMxRay to pull back the curtain.
What is LLMxRay?
LLMxRay is an open-source observability tool designed to help developers inspect how different LLMs handle the exact same prompt in real time. Whether you are using local models via Ollama/LM Studio or cloud-based APIs, LLMxRay gives you a side-by-side "X-ray" view of your prompt's journey.
Why use it?
Multi-Model Comparison: Run one prompt against multiple models simultaneously. See how Llama 3 compares to Mistral or GPT-4o in one view.
Multilingual Deep-Dive: This was a big focus for me. The tool supports 4 languages:
- English 🇺🇸
- French 🇫🇷
- Arabic 🇸🇦 (RTL support)
- Chinese 🇨🇳
Tokenization Transparency: See exactly how your text is being chopped up into tokens. This is crucial for debugging cost, context window limits, and model "reasoning" quality across different writing systems.
Why 4 Languages?
Tokenization isn't equal across languages. A single concept might be 1 token in English but 3 tokens in another language. By supporting Latin, RTL (Arabic), and character-based (Chinese) scripts, LLMxRay lets you see the economic and technical differences of running multilingual apps.
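You can get a feel for this imbalance even without a real tokenizer. Here is a stdlib-only sketch that uses UTF-8 byte length as a crude proxy: byte-level BPE tokenizers start from UTF-8 bytes, so scripts that need more bytes per character tend to cost more tokens. (The sample words are my own illustration, not from LLMxRay; for exact counts you'd run an actual tokenizer such as tiktoken.)

```python
# Crude proxy for tokenizer cost: UTF-8 byte length per string.
# Byte-level BPE tokenizers operate on UTF-8 bytes, so denser
# encodings (Arabic: 2 bytes/char, Chinese: 3 bytes/char) often
# translate into more tokens for the same concept.
samples = {
    "English": "hello",    # 1 byte per character
    "French":  "bonjour",  # 1 byte per character
    "Arabic":  "مرحبا",    # 2 bytes per character
    "Chinese": "你好",      # 3 bytes per character
}

for language, word in samples.items():
    chars = len(word)
    utf8_bytes = len(word.encode("utf-8"))
    print(f"{language:8} {word!r}: {chars} chars, {utf8_bytes} UTF-8 bytes")
```

The Arabic greeting is 5 characters but 10 bytes, and the Chinese one is only 2 characters but 6 bytes, which is the kind of asymmetry LLMxRay surfaces with real token counts per model.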
Try it out
The project is early-stage and open for feedback! You can connect it to your local environment or use your API keys to start comparing models immediately.
Check out the repo here:
https://github.com/LogneBudo/llmxray
or the website and docs here:
https://lognebudo.github.io/llmxray/
I'd love to hear from the DEV community:
Which model families do you want to see compared next?
Are there specific visualizations that would help your LLM workflow?
Drop a comment below or open an issue on GitHub!