OpenAI’s GPT-5.5 API pricing ($5 per million input tokens, $30 per million output tokens) can block side projects, hackathon apps, and free public tools before they ship. Puter.js offers a browser-first workaround: it exposes OpenAI models such as GPT-5.5, GPT-5.5 Pro, GPT-5.x variants, GPT-Image-2, DALL-E, and OpenAI TTS without requiring your OpenAI API key. Instead of billing you, usage is charged to the signed-in Puter end user.
TL;DR
- Use Puter.js when you want OpenAI access in the browser without managing an OpenAI account, API key, backend, or billing.
- Text models include gpt-5.5, gpt-5.5-pro, gpt-5.4, gpt-5, gpt-5-mini, o1, o3, gpt-4.1, gpt-4o, and chat/codex variants.
- Image models include gpt-image-2, gpt-image-1.5, dall-e-3.
- TTS models include gpt-4o-mini-tts, tts-1, tts-1-hd.
- Add one <script> tag, call puter.ai.chat(), and you can run GPT-5.5 from a browser page.
- Streaming, function calling, vision input, image generation, and text-to-speech are available from the browser.
- The end user covers usage through their Puter account; your app does not receive OpenAI invoices.
- Use Apidog to compare Puter-based prototypes with the official OpenAI API before migration.
How Puter’s “free unlimited” model works
Puter.js changes who pays for LLM usage.
In a standard OpenAI integration:
- You create an OpenAI account.
- You store an API key.
- Your app sends requests to OpenAI.
- You pay for all user usage.
With Puter:
- Your app loads Puter.js in the browser.
- The user signs in to Puter.
- Your app calls OpenAI-compatible models through Puter.
- Usage is charged to the user’s Puter balance.
For developers, this means:
- No OpenAI key in your repo
- No token bill attached to your account
- No server required for browser apps
- No per-developer usage cap
The trade-off: Puter is browser-first. If you need cron jobs, webhook handlers, background workers, or backend-only automation, use the official OpenAI API.
Step 1: Install Puter.js
For a plain HTML page, add the CDN script:
<script src="https://js.puter.com/v2/"></script>
That is enough for static sites, prototypes, browser extensions, and hackathon demos.
For a bundled JavaScript app, install the package:
npm install @heyputer/puter.js
Then import it:
import { puter } from '@heyputer/puter.js';
Use the CDN when you want the fastest possible setup. Use the npm package when you want bundler support and TypeScript types.
Step 2: Choose a model
Puter exposes GPT-5.x models and older OpenAI models. Pick the smallest model that meets your quality requirements.
| Model ID | Use case |
|---|---|
| gpt-5.5-pro | Hard reasoning, coding agents, complex analysis |
| gpt-5.5 | Default model for general chat and reasoning |
| gpt-5.4-nano | Fast, low-cost classification or extraction |
| gpt-5.4-mini | Chat UIs and mid-complexity tasks |
| gpt-5.3-codex | Code-focused workflows |
| o3 | Complex reasoning chains |
| o1-pro | Agentic multi-step planning |
| gpt-4.1, gpt-4o, gpt-4o-mini | Stable baseline models |
For image generation:

- gpt-image-2: latest image model
- gpt-image-1.5, gpt-image-1, dall-e-3, dall-e-2: older stable options

For text-to-speech:

- gpt-4o-mini-tts: newer TTS model
- tts-1, tts-1-hd: classic TTS options
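One way to keep these choices in one place is a small lookup helper. This is an illustrative sketch based on the table above; the task names and the pickModel helper are my own, not part of Puter.js:

```javascript
// Illustrative mapping from task type to a default model ID,
// following the table above. Not a Puter.js API.
const MODEL_BY_TASK = {
  chat: "gpt-5.5",
  "hard-reasoning": "gpt-5.5-pro",
  classification: "gpt-5.4-nano",
  code: "gpt-5.3-codex",
  image: "gpt-image-2",
  tts: "gpt-4o-mini-tts",
};

function pickModel(task) {
  // Fall back to the general-purpose default when the task is unknown.
  return MODEL_BY_TASK[task] ?? "gpt-5.5";
}
```

Centralizing the choice makes it easy to swap models later without touching every call site.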
Step 3: Call GPT-5.5 from the browser
Create an index.html file:
```html
<!DOCTYPE html>
<html>
  <body>
    <script src="https://js.puter.com/v2/"></script>
    <script>
      puter.ai.chat(
        "Explain WebSockets in three sentences",
        { model: "gpt-5.5" }
      ).then(response => {
        puter.print(response);
      });
    </script>
  </body>
</html>
```
Open the file in a browser.
Puter handles authentication and the model request. On first use, the user signs in or creates a Puter account. You do not need an OpenAI key, .env file, proxy server, or backend route.
Step 4: Stream responses for chat UIs
For long answers, stream tokens instead of waiting for the full response:
```javascript
const response = await puter.ai.chat(
  "Explain the theory of relativity in detail",
  {
    model: "gpt-5.5",
    stream: true
  }
);

for await (const part of response) {
  puter.print(part?.text ?? "");
}
```
In a real UI, append each chunk to the current assistant message:
```javascript
const output = document.querySelector("#output");

for await (const part of response) {
  output.textContent += part?.text ?? "";
}
```
Use streaming for:
- Chatbots
- Documentation assistants
- Long-form explanations
- Code generation
- Any UX where users should see progress immediately
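The streaming loop can be factored into a reusable helper that both updates the UI and returns the full text at the end. A sketch; collectStream is my own name, and it assumes each streamed part may carry an optional text field, as in the loops above:

```javascript
// Sketch: accumulate a streamed response into one string while
// forwarding each chunk to a UI callback. Works with any async
// iterable whose items may have an optional `text` field.
async function collectStream(stream, onChunk = () => {}) {
  let full = "";
  for await (const part of stream) {
    const text = part?.text ?? "";
    onChunk(text);
    full += text;
  }
  return full;
}
```

You would pass it the response from puter.ai.chat(..., { stream: true }) and, say, a callback that appends to the output element, then store the returned full text in your chat history.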
Step 5: Send image input to a vision model
Pass the prompt, image URL, and model options:
```javascript
puter.ai.chat(
  "What do you see in this image? Describe colors, objects, and mood.",
  "https://assets.puter.site/doge.jpeg",
  { model: "gpt-5.5" }
).then(response => {
  puter.print(response);
});
```
Use vision input for:
- Alt-text generation
- Screenshot analysis
- Visual QA
- OCR-like workflows
- Accessibility tooling
- Product image inspection
This works with GPT-5.x models and GPT-4o variants.
Step 6: Generate images
Use puter.ai.txt2img():
```javascript
puter.ai.txt2img(
  "A futuristic cityscape at night, cinematic, neon, rain",
  { model: "gpt-image-2" }
).then(imageElement => {
  document.body.appendChild(imageElement);
});
```
txt2img() returns an <img> element that you can insert directly into the DOM.
Example with a basic UI:
```html
<input id="prompt" placeholder="Describe an image..." />
<button id="generate">Generate</button>
<div id="result"></div>

<script src="https://js.puter.com/v2/"></script>
<script>
  document.querySelector("#generate").addEventListener("click", async () => {
    const prompt = document.querySelector("#prompt").value;
    const result = document.querySelector("#result");
    result.textContent = "Generating...";

    const image = await puter.ai.txt2img(prompt, {
      model: "gpt-image-2"
    });

    result.innerHTML = "";
    result.appendChild(image);
  });
</script>
```
The user pays the image generation cost from their Puter account.
Step 7: Convert text to speech
Use puter.ai.txt2speech():
```javascript
puter.ai.txt2speech(
  "Welcome back. Your account balance is $1,247.50.",
  {
    provider: "openai",
    model: "gpt-4o-mini-tts"
  }
).then(audio => {
  audio.setAttribute("controls", "");
  document.body.appendChild(audio);
});
```
The function returns an <audio> element.
Use it for:
- Voice prompts
- Accessibility narration
- Product walkthroughs
- App voiceovers
- Podcast intros
Step 8: Add function calling
Puter supports the standard OpenAI-style tool definition shape.
Define your tools:
```javascript
const tools = [
  {
    type: "function",
    function: {
      name: "get_weather",
      description: "Get the current weather for a city.",
      parameters: {
        type: "object",
        properties: {
          city: {
            type: "string"
          }
        },
        required: ["city"]
      }
    }
  }
];
```
Send the prompt with tools:
```javascript
const response = await puter.ai.chat(
  "What's the weather in Tokyo right now?",
  {
    model: "gpt-5.5",
    tools
  }
);
```
Read the tool call:
```javascript
const toolCalls = response.message.tool_calls;

if (toolCalls?.length) {
  const call = toolCalls[0];
  console.log("Function:", call.function.name);
  console.log("Arguments:", call.function.arguments);
  // Execute your function here.
}
```
The model emits the tool call. Your app is responsible for executing the function and sending the result back into the conversation.
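Since the arguments arrive as a JSON string, one way to structure that execution step is a small dispatcher that parses them and routes to local handlers. A sketch; runToolCalls and the handlers registry are my own names, and it assumes the OpenAI-style tool_calls shape shown above:

```javascript
// Sketch: execute OpenAI-style tool calls against a registry of
// local handler functions. `arguments` is a JSON string and must
// be parsed before dispatch.
async function runToolCalls(message, handlers) {
  const results = [];
  for (const call of message.tool_calls ?? []) {
    const { name, arguments: rawArgs } = call.function;
    const handler = handlers[name];
    if (!handler) throw new Error(`No handler for tool: ${name}`);
    const args = JSON.parse(rawArgs);
    results.push({ name, result: await handler(args) });
  }
  return results;
}
```

Each result would then be sent back into the conversation as a tool message so the model can produce its final answer.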
For testing tool-driven flows in production-grade settings, see MCP server testing in Apidog.
Step 9: Tune temperature and max_tokens
Pass OpenAI-style parameters in the options object:
```javascript
const response = await puter.ai.chat(
  "Tell me about Mars",
  {
    model: "gpt-5.5",
    temperature: 0.2,
    max_tokens: 200
  }
);
```
Recommended defaults:
```javascript
const defaults = {
  model: "gpt-5.5",
  temperature: 0.2,
  max_tokens: 500
};
```
Use lower temperature for predictable output:
```javascript
temperature: 0.0 // deterministic / factual
temperature: 0.2 // documentation, summaries, QA
temperature: 0.7 // creative writing
temperature: 1.0 // highly varied output
```
Use max_tokens to keep responses bounded and avoid unnecessary user-side cost.
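One way to apply such defaults consistently is to merge them with per-call overrides before passing the options to puter.ai.chat(). A sketch; withDefaults is my own helper name, not part of Puter.js:

```javascript
// Sketch: shared defaults merged with per-call overrides, so every
// call stays bounded unless explicitly overridden.
const defaults = {
  model: "gpt-5.5",
  temperature: 0.2,
  max_tokens: 500,
};

function withDefaults(overrides = {}) {
  // Spread order matters: later keys win, so overrides replace
  // defaults property by property.
  return { ...defaults, ...overrides };
}

// Usage:
// puter.ai.chat("Tell me about Mars", withDefaults({ temperature: 0.7 }));
```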
What Puter gives you
Puter’s browser-first OpenAI access is useful when you want to ship quickly without handling billing.
You get:
- GPT-5.x models, including GPT-5.5 and GPT-5.5 Pro
- Older OpenAI models such as GPT-4.1, GPT-4o, o1, and o3
- GPT-Image-2 and DALL-E image generation
- OpenAI TTS models, including gpt-4o-mini-tts
- Streaming
- Vision input
- Function calling
- Temperature control
- max_tokens control
What Puter does not replace
Puter is not a full replacement for every official OpenAI API workflow.
You may not get:
- Responses API support
- Prompt caching cost controls
- Files API support
- Backend-only usage without a browser session
- Direct OpenAI rate-limit headers
- OpenAI structured output mode and JSON schema enforcement
Use the official OpenAI API when you need backend execution, compliance controls, structured outputs, prompt caching, or direct OpenAI account management.
When to use Puter vs the official OpenAI API
Use Puter when:
- You are building a browser-based app.
- You want to avoid OpenAI billing exposure.
- You are prototyping and do not want to set up an OpenAI account.
- You are building a static site, browser extension, or hackathon demo.
- Your users are willing to sign in to Puter.
Use the official OpenAI API when:
- You need server-side calls.
- You need cron jobs, webhook handlers, queues, or batch processing.
- You need prompt caching.
- You need the Responses API, Files API, or structured outputs.
- You need compliance terms such as BAAs, SOC 2, or residency guarantees.
- Your users will not accept a Puter sign-in step.
Many projects can start with Puter, validate the product, then migrate to the official API when backend or compliance requirements appear.
For a paid production setup, see How to use the GPT-5.5 API.
Testing the integration in Apidog
Puter calls run in the browser, so you cannot test them like a normal backend API request. A practical setup is:
- Create a static HTML page that loads Puter.js.
- Accept the prompt from a query parameter.
- Use the page as your puter-prototype test target.
- Create a separate openai-prod environment for the official OpenAI API.
- Keep both environments in the same Apidog collection for migration planning.
Example local Puter test page:
```html
<!DOCTYPE html>
<html>
  <body>
    <pre id="output">Loading...</pre>

    <script src="https://js.puter.com/v2/"></script>
    <script>
      const params = new URLSearchParams(window.location.search);
      const prompt = params.get("prompt") || "Say hello";
      const output = document.querySelector("#output");

      puter.ai.chat(prompt, {
        model: "gpt-5.5"
      }).then(response => {
        output.textContent = response;
      }).catch(error => {
        output.textContent = error.message;
      });
    </script>
  </body>
</html>
```
Run it locally:
npx serve .
Then call it in the browser:
http://localhost:3000?prompt=Explain%20JWT%20in%20one%20paragraph
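Prompts containing spaces or punctuation must be URL-encoded before they go into the query string. A small helper (my own, assuming the query-parameter page above) keeps that consistent across test cases:

```javascript
// Sketch: build an encoded test URL for the query-parameter page.
// encodeURIComponent makes spaces and punctuation URL-safe.
function buildTestUrl(base, prompt) {
  return `${base}?prompt=${encodeURIComponent(prompt)}`;
}

// buildTestUrl("http://localhost:3000", "Explain JWT in one paragraph")
// → "http://localhost:3000?prompt=Explain%20JWT%20in%20one%20paragraph"
```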
Use Apidog to model the official OpenAI request you may migrate to later.
Download Apidog and create two environments:
- puter-prototype: your localhost page that runs Puter.js
- openai-prod: https://api.openai.com/v1
For broader API testing patterns, see API testing tool for QA engineers.
FAQ
Is Puter unlimited for developers?
Yes. The developer does not pay for usage through their own OpenAI account. Usage is charged to the signed-in user’s Puter balance.
Do I need an OpenAI account?
No. Puter handles the OpenAI relationship. Your app does not need an OpenAI API key.
Can I use this in production?
Yes, for browser-based apps. The key product question is whether your users are willing to sign in to Puter.
Does GPT-5.5 through Puter behave the same as the official API?
The model output should come from the same OpenAI model because Puter calls OpenAI on the user’s behalf. Latency may differ because there is an extra layer between your app and OpenAI.
Does Puter support prompt caching?
Puter does not expose OpenAI prompt caching pricing controls today. If prompt caching is important for your workload, use the official OpenAI API.
Can I use Puter from a backend service?
Not cleanly. Puter is browser-first and assumes a user session. Backend services should use the official OpenAI API.
For free server-side options, see How to use the GPT-5.5 API for free.
What model should I start with?
Use:
- gpt-5.5 for general chat and reasoning
- gpt-5.4-nano for high-volume classification
- gpt-5.5-pro for harder reasoning
- o3 for long reasoning chains
Will users be charged a lot?
Most chat-style usage costs cents per session at OpenAI-style rates. Image generation is usually more expensive than text. Use max_tokens, avoid unnecessary regeneration, and make cost-producing actions explicit in the UI.
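To build intuition, here is a rough per-request estimate at the GPT-5.5 rates quoted at the top of this article ($5 per million input tokens, $30 per million output tokens). Actual Puter billing may differ, so treat this as a ballpark only:

```javascript
// Sketch: rough per-request cost at the GPT-5.5 rates quoted above.
// Rates are USD per one million tokens; real Puter billing may differ.
function estimateCostUSD(inputTokens, outputTokens) {
  const INPUT_RATE = 5 / 1_000_000;   // $5 per 1M input tokens
  const OUTPUT_RATE = 30 / 1_000_000; // $30 per 1M output tokens
  return inputTokens * INPUT_RATE + outputTokens * OUTPUT_RATE;
}

// A typical chat turn of 1,000 input tokens and 500 output tokens
// comes out to roughly $0.02 — cents per session, as noted above.
```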
Can I generate images with Puter?
Yes. Use puter.ai.txt2img() with gpt-image-2 or DALL-E models. The user pays from their Puter balance.
For the official paid API guide, see How to use the GPT-Image-2 API.
Wrapping up
Puter.js is a practical way to add GPT-5.5, image generation, vision, function calling, streaming, and TTS to browser-based apps without managing an OpenAI key or paying for user traffic yourself.
Use Puter for prototypes, hackathon builds, static sites, browser extensions, and free public apps. Use the official OpenAI API for backend workloads, compliance requirements, prompt caching, the Responses API, Files API, or strict structured outputs.
Build and compare your requests in Apidog, test the migration path, and choose the integration model that fits your app.