Over the summer, while on parental leave, I surveyed the LLM prompt management platforms to see which might have worked for a previous project at work, where we simply stored prompts in a file in the repo. Not great.
I came away with two findings:
- The platforms are far too complicated.
- Or they lock you into their own LLMs rather than the latest models from OpenAI, Google, etc.
Neither is a compromise I am willing to make when choosing a prompt and analytics platform: I need something that doesn't force me into a single model and is quick to get started with.
So I spent my somewhat limited free time on parental leave building a prompt management and testing platform focused on just two principles:
- How can I compare prompts to find the most effective one for my goal?
- How can I do that in the simplest way possible, shielding users from statistics overload?
## The Answer
Testune is a simple LLM platform to manage and compare prompts. Three things make it practical:
- Minimal API. Just three endpoints: create-thread, send-message, rate-message. Nothing else to learn (see the sketch after this list).
- Bring your own keys. You control spend and can switch providers at will. Tokens never leave your account.
- Analytics done for you. Feedback is aggregated with internal LLM metrics to show which prompt performs better. Straight to the point, no analytical burden.
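To make the flow concrete, here is a minimal sketch of how the three calls might fit together. It is illustrative only: the base URL, request paths, payload fields, and auth scheme are my assumptions; only the endpoint names come from the list above.

```typescript
// Hypothetical sketch of the three-endpoint flow. Only the endpoint names
// (create-thread, send-message, rate-message) come from the article; the
// base URL, paths, payload shapes, and auth header are assumptions.
const BASE = "https://api.testune.example"; // placeholder, not the real URL
const API_KEY = process.env.TESTUNE_API_KEY; // your own key ("bring your own keys")

async function post(path: string, body: unknown): Promise<any> {
  const res = await fetch(`${BASE}/${path}`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${API_KEY}`,
    },
    body: JSON.stringify(body),
  });
  return res.json();
}

async function main() {
  // 1. Open a thread against the prompt variant under test.
  const thread = await post("create-thread", { promptId: "welcome-email-v2" });

  // 2. Send a message; the platform forwards it to your chosen provider.
  const reply = await post("send-message", {
    threadId: thread.id,
    content: "Draft a welcome email for a new customer.",
  });

  // 3. Rate the reply; ratings feed the aggregated prompt analytics.
  await post("rate-message", { messageId: reply.id, rating: 1 });
}

main().catch(console.error);
```

Rating at the point of use is what drives the comparison: run the same message against two prompt variants, rate both replies, and the analytics surface which variant performs better.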
Check out my project here. Free for dev.to readers, no credit card required. I’d love your feedback: use the internal feedback tool or drop a comment below.