AI tooling has exploded. There are chat interfaces, API playgrounds, model routers, and local runners, most of which do one thing well and nothing else.
If you want to compare models or providers, test the same prompt across five providers, switch between image generation and text in the same session, or give a model access to real-time web search, you end up juggling three or four different apps to do it.
I created Apiction to fix that. It is a browser-based, model- and provider-agnostic AI workspace.
You bring your own API keys, and it handles the rest. Conversations, tools, images, summaries, debugging for developers, and more. Everything is stored in your browser and you can run it locally.
This article walks through what you can do with it, both as a general user and as a developer.
For Everyone
1. General Text Conversations
The most obvious use is chatting with AI models. You type a message, get a response, continue the conversation. Nothing groundbreaking there.
What is a bit different is that you can run conversations in separate tabs, each with its own settings and memory handling. More on that in a bit.
Apiction renders Markdown responses properly: math equations via KaTeX, code blocks with syntax highlighting, so you get nicely formatted responses without needing to open them in another app. Just set the system prompt to ask the model to respond in Markdown, and it does the rest.
Streaming is supported, so responses appear word by word as they are generated. Rendering during the stream is kept lightweight, and the full Markdown render happens only once the complete response is received.
2. Image Generation
Apiction supports image generation from any OpenAI-compatible platform. Some other platforms, like Pollinations, are supported via custom adapters (though Pollinations' recent updates now expose image generation through OpenAI-compatible endpoints as well).
You switch the model type in the tab settings to text-to-image, configure your endpoint and API key, and start generating.
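To make the flow concrete, here is a minimal sketch of what an OpenAI-compatible text-to-image request body looks like. The endpoint path, model name, and parameters are illustrative assumptions, not Apiction's actual internals; substitute whatever your provider documents.

```python
# Sketch of an OpenAI-compatible text-to-image request payload.
# Model name and parameter set are illustrative; a real request
# would be POSTed to the provider's /v1/images/generations path.
import json

def build_image_request(prompt: str, model: str = "example-image-model",
                        size: str = "1024x1024") -> dict:
    """Assemble the JSON body for an image generation request."""
    return {
        "model": model,
        "prompt": prompt,
        "n": 1,           # number of images to generate
        "size": size,     # width x height in pixels
    }

payload = build_image_request("a lighthouse at dusk, watercolor")
print(json.dumps(payload, indent=2))
```

A custom adapter for a non-compatible provider would translate this same shape into whatever format that provider expects.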
3. Web Search and Tool Calling via MCP Servers
Apiction integrates with the Model Context Protocol (MCP), a standard for connecting AI models to external tools and data sources via JSON-RPC.
In practice, this means you can point a tab at a locally running MCP server (a web search tool, a database query tool, a file reader, whatever you have set up) and the AI model will use those tools autonomously when it decides they are necessary.
It discovers the available tools at the start of a conversation, injects them into the request, and then executes the tool calls the model makes, up to five rounds of back-and-forth tool execution per message, before returning the final response to you.
As mentioned earlier, each tab has its own tool configuration, so one tab can have web search enabled while another has none. The model decides when to invoke a tool and when to just respond directly.
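The tool-execution loop described above can be sketched as follows. This is a hedged illustration with a stubbed model and a stubbed tool registry standing in for real MCP and provider calls; the names `call_model`, `TOOLS`, and `run_turn` are my own, not Apiction's.

```python
# Sketch of a bounded tool-calling loop: ask the model, execute any
# tool calls it requests, feed results back, and stop after five rounds.
MAX_TOOL_ROUNDS = 5

# Stub tool registry; a real one would come from MCP tool discovery.
TOOLS = {"web_search": lambda query: f"results for {query!r}"}

def call_model(messages):
    # Stub model: requests one tool call, then answers using its result.
    if not any(m["role"] == "tool" for m in messages):
        return {"tool_calls": [{"name": "web_search",
                                "arguments": {"query": "apiction"}}]}
    return {"content": "Here is what I found."}

def run_turn(messages):
    for _ in range(MAX_TOOL_ROUNDS):
        reply = call_model(messages)
        calls = reply.get("tool_calls")
        if not calls:                      # model answered directly
            return reply["content"]
        for call in calls:                 # execute each requested tool
            result = TOOLS[call["name"]](**call["arguments"])
            messages.append({"role": "tool", "content": result})
    return "Tool round limit reached."

answer = run_turn([{"role": "user", "content": "search for apiction"}])
print(answer)
```

The round limit is what keeps a confused model from looping on tool calls forever.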
4. Converse With Any AI Model From Any Provider
This is arguably the core premise of Apiction. It natively supports OpenAI, OpenRouter, HuggingFace, Together.ai, Groq, Pollinations, Fireworks.ai, and anything that speaks the OpenAI-compatible API format, which covers a large chunk of the ecosystem.
For providers that do not follow that format, custom adapters fill the gap, and future releases will add more adapters to cover other edge platforms.
Apiction's internal architecture is built around this separation. The core of the application knows nothing about OpenAI or Anthropic or HuggingFace; all provider-specific logic lives in isolated adapters.
When you configure an endpoint in a tab, the adapter is selected automatically based on the URL. This means adding a new provider is a matter of writing one adapter file without touching anything else in the codebase.
For end users, this simply means: if a new model drops somewhere, you can use it immediately, as long as you have the API key and endpoint.
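URL-based adapter selection can be sketched like this. The adapter names, match rules, and fallback are illustrative assumptions about how such a registry might look, not Apiction's actual code.

```python
# Sketch of picking a provider adapter from the configured endpoint URL,
# falling back to a generic OpenAI-compatible adapter.
ADAPTERS = [
    ("openrouter.ai", "openrouter"),
    ("pollinations.ai", "pollinations"),
]
DEFAULT_ADAPTER = "openai-compatible"

def select_adapter(endpoint_url: str) -> str:
    """Match the endpoint host against known adapters; default to the
    generic OpenAI-compatible one."""
    for host_fragment, adapter in ADAPTERS:
        if host_fragment in endpoint_url:
            return adapter
    return DEFAULT_ADAPTER

print(select_adapter("https://openrouter.ai/api/v1/chat/completions"))
print(select_adapter("https://api.example.com/v1/chat/completions"))
```

Under this design, supporting a new non-compatible provider means adding one entry and one adapter file; the generic fallback covers everything else.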
5. Rolling Summary for Long Conversations
Token limits are a real constraint. A long conversation will eventually exceed the context window of whatever model you are using.
Apiction handles this with a rolling summarization system. After every set number of messages in a tab, the older portion of the conversation is automatically summarized using the summarization model configured for that tab.
When building context for the next request, that summary is injected as a system message, so the model retains the context of what happened earlier without needing the raw message history.
The summaries are cumulative: each new batch is merged into the existing summary rather than replacing it, so only one combined summary exists and context is preserved across multiple summarization passes. The summarization model is configurable per tab, so you can choose a smaller, cheaper model for summarization and a larger one for conversation if you want.
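The mechanics of cumulative rolling summarization can be sketched as follows. Here `summarize()` is a stand-in that just joins text; in practice it would be a call to the tab's summarization model. The threshold value and function names are illustrative.

```python
# Sketch of rolling summarization: when history grows past a threshold,
# fold the older half into one cumulative summary and keep recent turns.
SUMMARIZE_EVERY = 4  # illustrative threshold, not Apiction's real value

def summarize(existing_summary: str, old_messages: list[str]) -> str:
    """Merge older messages into the running summary (stubbed)."""
    merged = " ".join(old_messages)
    return f"{existing_summary} {merged}".strip()

def compact(history: list[str], summary: str) -> tuple[list[str], str]:
    """Trim old messages into the single cumulative summary."""
    if len(history) <= SUMMARIZE_EVERY:
        return history, summary
    cutoff = len(history) - SUMMARIZE_EVERY
    summary = summarize(summary, history[:cutoff])
    return history[cutoff:], summary

history, summary = ["m1", "m2", "m3", "m4", "m5", "m6"], ""
history, summary = compact(history, summary)
print(history)   # recent messages kept verbatim
print(summary)   # older messages merged into one summary
```

When the next request is built, the single summary would be injected as a system message ahead of the retained recent messages.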
For Developers
6. Raw Request and Response Debug Console
Every tab has a debug mode that, when enabled, shows you the raw request payload sent to the provider and the raw response received back. This is valuable when you are:
- Debugging a custom adapter or a new provider integration
- Verifying that your system prompt and message history are being formatted correctly
- Checking if tool call data is making it into the request payload
It is per-tab and off by default, so it does not clutter the interface during normal use.
7. Performance Stats: TTFT, Total Time, Token Counts
Below each response, when debug mode is enabled, you get:
- Time to First Token (TTFT) — how long from sending the request until the first token starts streaming back
- Total response time — wall clock time from request to last token
- Input and output token counts — pulled from the usage field of the response
TTFT is a meaningful metric for streaming models. Two providers serving the same model can have wildly different TTFT values depending on their infrastructure, load, and whether they are doing any preprocessing on your request. Total time is useful, but TTFT tells you more about latency.
8. Cross-Provider Model Comparison
This is the most useful developer feature. If you are building an application on top of a specific model and relying on a certain level of reasoning quality, the provider you choose matters.
In Apiction, because each tab is independently configured, you can open multiple tabs, point them at different providers, set the same model on each, and send them the same prompt. You can then compare:
- Response quality
- Response time and TTFT
- How the model behaves differently (or identically) across providers
For example, you can test and "gauge" whether a provider is running a quantized version of a model rather than the one advertised. Quantized models can be significantly cheaper and faster, but they often have reduced reasoning capabilities and make more mistakes, especially on complex tasks.
You can surface this with some targeted prompts. Ask the model to perform a precise multi-step arithmetic calculation or a logically dense reasoning problem. A full-precision model and a heavily quantized version of the same model will often diverge on these tasks, whether in the final answer or in the confidence and detail of their reasoning.
If two providers claim to be serving the same model but the responses differ significantly on a factual or mathematical problem, that can be a signal.
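One way to run such a probe is with a fixed arithmetic question whose ground truth you compute locally, then grade each provider's reply exactly. This is my own illustrative sketch of the idea, not a feature of Apiction; the grading here is deliberately strict.

```python
# Sketch of a deterministic arithmetic probe for comparing providers:
# ask each the same exact-arithmetic question and grade the replies
# against a locally computed ground truth.
def probe_question() -> tuple[str, int]:
    a, b, c = 7919, 104729, 1299709          # fixed values, no randomness
    expected = a * b + c
    prompt = (f"Compute {a} * {b} + {c}. "
              "Reply with only the final integer.")
    return prompt, expected

def grade(provider_answer: str, expected: int) -> bool:
    """True only if the provider returned exactly the right integer."""
    try:
        return int(provider_answer.strip()) == expected
    except ValueError:
        return False

prompt, expected = probe_question()
print(prompt)
print(grade(str(expected), expected))     # a correct answer passes
print(grade("12345", expected))           # a wrong answer fails
```

Send the same prompt to each provider tab and compare the grades; a provider that consistently fails exact arithmetic the others pass is worth a closer look.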
A Note on Privacy and Setup
All of your conversations, settings, and summaries are stored in your browser's IndexedDB.
The PHP backend it ships with exists only because many AI provider APIs block direct browser requests due to CORS. The backend is a thin proxy: it forwards your request, gets the response, and returns it. If a provider supports CORS directly, like OpenRouter, you can enable direct requests in settings and skip the proxy entirely.
You can also run it locally with just `php -S localhost:8000 -t public_html` if you have PHP available, or use it via apiction.site if you want to skip the setup.
Conclusion
Apiction is for people who use AI models via different API providers, test new models, build on top of APIs, or just want more visibility into what is actually happening under the hood when they send a message. My intention was to build something like Postman, but for AI APIs, with general-purpose usage features as well.
If you're spending a meaningful amount of time working with AI APIs, it is worth having a workspace that keeps up.