Over the last few weeks I’ve been working on a project called CodexConvert.
It started as a simple idea:
What if we could convert entire codebases using multiple AI models — and automatically benchmark which one performs best?
So I built a tool that does exactly that.
🔁 Multi-Model Code Conversion
CodexConvert lets you run the same conversion task across multiple AI models at once.
For example:
Python → Rust
JavaScript → Go
Java → TypeScript
You can compare outputs side-by-side and immediately see how different models perform.
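Conceptually, fanning one task out to several models is just a concurrent map over the model list. Here is a minimal sketch of that idea; the function and parameter names (`convertWithAll`, `Converter`) are illustrative assumptions, not CodexConvert's actual internals:

```typescript
// One provider call per model; the real provider call is abstracted
// away behind the `convert` parameter so this sketch stays generic.
type Converter = (
  model: string,
  sourceLang: string,
  targetLang: string,
  code: string,
) => Promise<string>;

// Run the same conversion task across every model concurrently and
// collect the outputs keyed by model name.
async function convertWithAll(
  models: string[],
  sourceLang: string,
  targetLang: string,
  code: string,
  convert: Converter,
): Promise<Record<string, string>> {
  const entries = await Promise.all(
    models.map(
      async (m) => [m, await convert(m, sourceLang, targetLang, code)] as const,
    ),
  );
  return Object.fromEntries(entries);
}
```

Because the calls run in parallel via `Promise.all`, total latency is roughly that of the slowest model rather than the sum of all of them.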
📊 Automatic Benchmarking
Each model output is evaluated automatically using three metrics:
✔ Syntax Validity
✔ Structural Fidelity
✔ Token Efficiency
Scores are normalized to a 0–10 scale, making it easy to compare models.
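One way to combine per-metric values into a single 0–10 score is to keep each raw metric in a 0–1 range, average them, and scale by ten. This is a hedged sketch of that normalization, not the exact formula CodexConvert uses; the interface and field names are assumptions:

```typescript
// Each raw metric is assumed to be a fraction in 0..1.
interface MetricScores {
  syntaxValidity: number;     // e.g. fraction of output files that parse
  structuralFidelity: number; // e.g. similarity of output structure to source
  tokenEfficiency: number;    // e.g. source tokens / output tokens, capped at 1
}

// Clamp each metric to 0..1, average, and scale to the 0–10 range
// with one decimal place.
function overallScore(m: MetricScores): number {
  const clamp = (x: number) => Math.min(1, Math.max(0, x));
  const metrics = [m.syntaxValidity, m.structuralFidelity, m.tokenEfficiency];
  const avg =
    metrics.map(clamp).reduce((sum, x) => sum + x, 0) / metrics.length;
  return Math.round(avg * 10 * 10) / 10;
}
```

An equal-weight average is the simplest choice; a weighted version would let users prioritize, say, syntax validity over token efficiency.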
🏆 Built-in Leaderboard
CodexConvert keeps a local benchmark dataset and generates rankings like:
| Rank | Model | Avg Score |
|------|----------|-----------|
| 🥇 | GPT-4o | 9.1 |
| 🥈 | DeepSeek | 8.8 |
| 🥉 | Mistral | 8.4 |
You can also see which models perform best for specific language migrations.
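Building a ranking like this from stored benchmark runs amounts to grouping results by model, averaging, and sorting. A minimal sketch, assuming a simple `{ model, score }` result shape (the real dataset format is not shown in this post):

```typescript
// One entry per completed conversion run, scored on the 0–10 scale.
interface BenchmarkResult {
  model: string;
  score: number;
}

// Group results by model, average each model's scores, and sort
// descending to produce leaderboard rows.
function leaderboard(
  results: BenchmarkResult[],
): { model: string; avg: number }[] {
  const totals = new Map<string, { sum: number; n: number }>();
  for (const r of results) {
    const t = totals.get(r.model) ?? { sum: 0, n: 0 };
    totals.set(r.model, { sum: t.sum + r.score, n: t.n + 1 });
  }
  return [...totals.entries()]
    .map(([model, t]) => ({ model, avg: Math.round((t.sum / t.n) * 10) / 10 }))
    .sort((a, b) => b.avg - a.avg);
}
```

Filtering `results` by language pair before calling `leaderboard` would give the per-migration rankings mentioned above.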
🧠 Modern Workspace UI
The interface works like a developer dashboard:
Inputs | Model Outputs | Benchmark Insights
You can upload an entire codebase, run conversions, and analyze results in one place.
🔒 Privacy-First Architecture
One important design decision:
CodexConvert has no backend server.
Everything happens in your browser:
• API keys stay in session storage
• Code is sent directly to the AI provider
• Nothing is stored remotely
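In practice, a serverless flow like this means building the provider request entirely in the browser. The sketch below shows the general shape against an OpenAI-compatible chat-completions endpoint; the storage key name and helper function are illustrative assumptions, not CodexConvert's actual code:

```typescript
// Build a request for an OpenAI-compatible /chat/completions endpoint.
// The API key is passed in by the caller and only appears in the
// Authorization header of the direct browser-to-provider request.
function buildConversionRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  prompt: string,
) {
  return {
    url: `${baseUrl}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({
        model,
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}

// In the browser, the key would come from session storage, e.g.:
//   const apiKey = sessionStorage.getItem("codexconvert-api-key") ?? "";
//   const { url, init } = buildConversionRequest(
//     "https://api.openai.com/v1", apiKey, "gpt-4o", prompt);
//   const res = await fetch(url, init);
// (The "codexconvert-api-key" key name is a hypothetical example.)
```

Because `sessionStorage` is scoped to the tab and cleared when it closes, the key never touches disk or any intermediate server.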
🧩 Tech Stack
React + TypeScript
Vite
Tailwind CSS
JSZip
OpenAI-compatible API providers
💡 Why I Built This
Developers constantly ask questions like:
Which AI model is best for Python → Rust?
Which model produces cleaner TypeScript?
Which one is most token-efficient?
CodexConvert helps answer those questions.
🔗 GitHub
If you’d like to try it out or contribute:
👉 https://github.com/aryanjsx/Openclaude
Feedback is very welcome.
I’m especially interested in ideas for:
• better benchmarking metrics
• additional model providers
• new leaderboard visualizations
Thanks for reading 🙌