OpenAI’s GPT-5.5 API pricing ($5 per million input tokens, $30 per million output tokens) can block side projects, hackathon apps, and free public tools before they ship. Puter.js offers a browser-first workaround: it exposes OpenAI models such as GPT-5.5, GPT-5.5 Pro, GPT-5.x variants, GPT-Image-2, DALL-E, and OpenAI TTS without requiring your OpenAI API key. Instead of billing you, usage is charged to the signed-in Puter end user.
TL;DR
- Use Puter.js when you want OpenAI access in the browser without managing an OpenAI account, API key, backend, or billing.
- Text models include gpt-5.5, gpt-5.5-pro, gpt-5.4, gpt-5, gpt-5-mini, o1, o3, gpt-4.1, gpt-4o, and chat/codex variants.
- Image models include gpt-image-2, gpt-image-1.5, dall-e-3.
- TTS models include gpt-4o-mini-tts, tts-1, tts-1-hd.
- Add one <script> tag, call puter.ai.chat(), and you can run GPT-5.5 from a browser page.
- Streaming, function calling, vision input, image generation, and text-to-speech are available from the browser.
- The end user covers usage through their Puter account; your app does not receive OpenAI invoices.
- Use Apidog to compare Puter-based prototypes with the official OpenAI API before migration.
How Puter’s “free unlimited” model works
Puter.js changes who pays for LLM usage.
In a standard OpenAI integration:
- You create an OpenAI account.
- You store an API key.
- Your app sends requests to OpenAI.
- You pay for all user usage.
With Puter:
- Your app loads Puter.js in the browser.
- The user signs in to Puter.
- Your app calls OpenAI-compatible models through Puter.
- Usage is charged to the user’s Puter balance.
For developers, this means:
- No OpenAI key in your repo
- No token bill attached to your account
- No server required for browser apps
- No per-developer usage cap
The trade-off: Puter is browser-first. If you need cron jobs, webhook handlers, background workers, or backend-only automation, use the official OpenAI API.
Step 1: Install Puter.js
For a plain HTML page, add the CDN script:
<script src="https://js.puter.com/v2/"></script>
That is enough for static sites, prototypes, browser extensions, and hackathon demos.
For a bundled JavaScript app, install the package:
npm install @heyputer/puter.js
Then import it:
import { puter } from '@heyputer/puter.js';
Use the CDN when you want the fastest possible setup. Use the npm package when you want bundler support and TypeScript types.
Step 2: Choose a model
Puter exposes GPT-5.x models and older OpenAI models. Pick the smallest model that meets your quality requirements.
| Model ID | Use case |
|---|---|
| gpt-5.5-pro | Hard reasoning, coding agents, complex analysis |
| gpt-5.5 | Default model for general chat and reasoning |
| gpt-5.4-nano | Fast, low-cost classification or extraction |
| gpt-5.4-mini | Chat UIs and mid-complexity tasks |
| gpt-5.3-codex | Code-focused workflows |
| o3 | Complex reasoning chains |
| o1-pro | Agentic multi-step planning |
| gpt-4.1, gpt-4o, gpt-4o-mini | Stable baseline models |
For image generation:

- gpt-image-2: latest image model
- gpt-image-1.5, gpt-image-1, dall-e-3, dall-e-2: older stable options

For text-to-speech:

- gpt-4o-mini-tts: newer TTS model
- tts-1, tts-1-hd: classic TTS options
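One way to keep these choices in one place is a small lookup helper. This is an illustrative sketch based on the table above; the task names and the pickModel helper are my own, not part of Puter.js:

```javascript
// Illustrative mapping from task type to a default model ID,
// following the table above. Not a Puter.js API.
const MODEL_BY_TASK = {
  chat: "gpt-5.5",
  "hard-reasoning": "gpt-5.5-pro",
  classification: "gpt-5.4-nano",
  code: "gpt-5.3-codex",
  image: "gpt-image-2",
  tts: "gpt-4o-mini-tts",
};

function pickModel(task) {
  // Fall back to the general-purpose default when the task is unknown.
  return MODEL_BY_TASK[task] ?? "gpt-5.5";
}
```

Centralizing the choice makes it easy to swap models later without touching every call site.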
Step 3: Call GPT-5.5 from the browser
Create an index.html file:
```html
<!DOCTYPE html>
<html>
  <body>
    <script src="https://js.puter.com/v2/"></script>
    <script>
      puter.ai.chat(
        "Explain WebSockets in three sentences",
        { model: "gpt-5.5" }
      ).then(response => {
        puter.print(response);
      });
    </script>
  </body>
</html>
```
Open the file in a browser.
Puter handles authentication and the model request. On first use, the user signs in or creates a Puter account. You do not need an OpenAI key, .env file, proxy server, or backend route.
Step 4: Stream responses for chat UIs
For long answers, stream tokens instead of waiting for the full response:
```javascript
const response = await puter.ai.chat(
  "Explain the theory of relativity in detail",
  {
    model: "gpt-5.5",
    stream: true
  }
);

for await (const part of response) {
  puter.print(part?.text ?? "");
}
```
In a real UI, append each chunk to the current assistant message:
```javascript
const output = document.querySelector("#output");

for await (const part of response) {
  output.textContent += part?.text ?? "";
}
```
Use streaming for:
- Chatbots
- Documentation assistants
- Long-form explanations
- Code generation
- Any UX where users should see progress immediately
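The streaming loop can be factored into a reusable helper that both updates the UI and returns the full text at the end. A sketch; collectStream is my own name, and it assumes each streamed part may carry an optional text field, as in the loops above:

```javascript
// Sketch: accumulate a streamed response into one string while
// forwarding each chunk to a UI callback. Works with any async
// iterable whose items may have an optional `text` field.
async function collectStream(stream, onChunk = () => {}) {
  let full = "";
  for await (const part of stream) {
    const text = part?.text ?? "";
    onChunk(text);
    full += text;
  }
  return full;
}
```

You would pass it the response from puter.ai.chat(..., { stream: true }) and, say, a callback that appends to the output element, then store the returned full text in your chat history.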
Step 5: Send image input to a vision model
Pass the prompt, image URL, and model options:
```javascript
puter.ai.chat(
  "What do you see in this image? Describe colors, objects, and mood.",
  "https://assets.puter.site/doge.jpeg",
  { model: "gpt-5.5" }
).then(response => {
  puter.print(response);
});
```
Use vision input for:
- Alt-text generation
- Screenshot analysis
- Visual QA
- OCR-like workflows
- Accessibility tooling
- Product image inspection
This works with GPT-5.x models and GPT-4o variants.
Step 6: Generate images
Use puter.ai.txt2img():
```javascript
puter.ai.txt2img(
  "A futuristic cityscape at night, cinematic, neon, rain",
  { model: "gpt-image-2" }
).then(imageElement => {
  document.body.appendChild(imageElement);
});
```
txt2img() returns an <img> element that you can insert directly into the DOM.
Example with a basic UI:
```html
<input id="prompt" placeholder="Describe an image..." />
<button id="generate">Generate</button>
<div id="result"></div>

<script src="https://js.puter.com/v2/"></script>
<script>
  document.querySelector("#generate").addEventListener("click", async () => {
    const prompt = document.querySelector("#prompt").value;
    const result = document.querySelector("#result");
    result.textContent = "Generating...";

    const image = await puter.ai.txt2img(prompt, {
      model: "gpt-image-2"
    });

    result.innerHTML = "";
    result.appendChild(image);
  });
</script>
```
The user pays the image generation cost from their Puter account.
Step 7: Convert text to speech
Use puter.ai.txt2speech():
```javascript
puter.ai.txt2speech(
  "Welcome back. Your account balance is $1,247.50.",
  {
    provider: "openai",
    model: "gpt-4o-mini-tts"
  }
).then(audio => {
  audio.setAttribute("controls", "");
  document.body.appendChild(audio);
});
```
The function returns an <audio> element.
Use it for:
- Voice prompts
- Accessibility narration
- Product walkthroughs
- App voiceovers
- Podcast intros
Step 8: Add function calling
Puter supports the standard OpenAI-style tool definition shape.
Define your tools:
```javascript
const tools = [
  {
    type: "function",
    function: {
      name: "get_weather",
      description: "Get the current weather for a city.",
      parameters: {
        type: "object",
        properties: {
          city: {
            type: "string"
          }
        },
        required: ["city"]
      }
    }
  }
];
```
Send the prompt with tools:
```javascript
const response = await puter.ai.chat(
  "What's the weather in Tokyo right now?",
  {
    model: "gpt-5.5",
    tools
  }
);
```
Read the tool call:
```javascript
const toolCalls = response.message.tool_calls;

if (toolCalls?.length) {
  const call = toolCalls[0];
  console.log("Function:", call.function.name);
  console.log("Arguments:", call.function.arguments);
  // Execute your function here.
}
```
The model emits the tool call. Your app is responsible for executing the function and sending the result back into the conversation.
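Since the arguments arrive as a JSON string, one way to structure that execution step is a small dispatcher that parses them and routes to local handlers. A sketch; runToolCalls and the handlers registry are my own names, and it assumes the OpenAI-style tool_calls shape shown above:

```javascript
// Sketch: execute OpenAI-style tool calls against a registry of
// local handler functions. `arguments` is a JSON string and must
// be parsed before dispatch.
async function runToolCalls(message, handlers) {
  const results = [];
  for (const call of message.tool_calls ?? []) {
    const { name, arguments: rawArgs } = call.function;
    const handler = handlers[name];
    if (!handler) throw new Error(`No handler for tool: ${name}`);
    const args = JSON.parse(rawArgs);
    results.push({ name, result: await handler(args) });
  }
  return results;
}
```

Each result would then be sent back into the conversation as a tool message so the model can produce its final answer.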
For testing tool-driven flows in production-grade settings, see MCP server testing in Apidog.
Step 9: Tune temperature and max_tokens
Pass OpenAI-style parameters in the options object:
```javascript
const response = await puter.ai.chat(
  "Tell me about Mars",
  {
    model: "gpt-5.5",
    temperature: 0.2,
    max_tokens: 200
  }
);
```
Recommended defaults:
```javascript
const defaults = {
  model: "gpt-5.5",
  temperature: 0.2,
  max_tokens: 500
};
```
Use lower temperature for predictable output:
```javascript
temperature: 0.0 // deterministic / factual
temperature: 0.2 // documentation, summaries, QA
temperature: 0.7 // creative writing
temperature: 1.0 // highly varied output
```
Use max_tokens to keep responses bounded and avoid unnecessary user-side cost.
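One way to apply such defaults consistently is to merge them with per-call overrides before passing the options to puter.ai.chat(). A sketch; withDefaults is my own helper name, not part of Puter.js:

```javascript
// Sketch: shared defaults merged with per-call overrides, so every
// call stays bounded unless explicitly overridden.
const defaults = {
  model: "gpt-5.5",
  temperature: 0.2,
  max_tokens: 500,
};

function withDefaults(overrides = {}) {
  // Spread order matters: later keys win, so overrides replace
  // defaults property by property.
  return { ...defaults, ...overrides };
}

// Usage:
// puter.ai.chat("Tell me about Mars", withDefaults({ temperature: 0.7 }));
```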
What Puter gives you
Puter’s browser-first OpenAI access is useful when you want to ship quickly without handling billing.
You get:
- GPT-5.x models, including GPT-5.5 and GPT-5.5 Pro
- Older OpenAI models such as GPT-4.1, GPT-4o, o1, and o3
- GPT-Image-2 and DALL-E image generation
- OpenAI TTS models, including gpt-4o-mini-tts
- Streaming
- Vision input
- Function calling
- Temperature control
- max_tokens control
What Puter does not replace
Puter is not a full replacement for every official OpenAI API workflow.
You may not get:
- Responses API support
- Prompt caching cost controls
- Files API support
- Backend-only usage without a browser session
- Direct OpenAI rate-limit headers
- OpenAI structured output mode and JSON schema enforcement
Use the official OpenAI API when you need backend execution, compliance controls, structured outputs, prompt caching, or direct OpenAI account management.
When to use Puter vs the official OpenAI API
Use Puter when:
- You are building a browser-based app.
- You want to avoid OpenAI billing exposure.
- You are prototyping and do not want to set up an OpenAI account.
- You are building a static site, browser extension, or hackathon demo.
- Your users are willing to sign in to Puter.
Use the official OpenAI API when:
- You need server-side calls.
- You need cron jobs, webhook handlers, queues, or batch processing.
- You need prompt caching.
- You need the Responses API, Files API, or structured outputs.
- You need compliance terms such as BAAs, SOC 2, or residency guarantees.
- Your users will not accept a Puter sign-in step.
Many projects can start with Puter, validate the product, then migrate to the official API when backend or compliance requirements appear.
For a paid production setup, see How to use the GPT-5.5 API.
Testing the integration in Apidog
Puter calls run in the browser, so you cannot test them like a normal backend API request. A practical setup is:
- Create a static HTML page that loads Puter.js.
- Accept the prompt from a query parameter.
- Use the page as your puter-prototype test target.
- Create a separate openai-prod environment for the official OpenAI API.
- Keep both environments in the same Apidog collection for migration planning.
Example local Puter test page:
```html
<!DOCTYPE html>
<html>
  <body>
    <pre id="output">Loading...</pre>

    <script src="https://js.puter.com/v2/"></script>
    <script>
      const params = new URLSearchParams(window.location.search);
      const prompt = params.get("prompt") || "Say hello";
      const output = document.querySelector("#output");

      puter.ai.chat(prompt, {
        model: "gpt-5.5"
      }).then(response => {
        output.textContent = response;
      }).catch(error => {
        output.textContent = error.message;
      });
    </script>
  </body>
</html>
```
Run it locally:
npx serve .
Then call it in the browser:
http://localhost:3000?prompt=Explain%20JWT%20in%20one%20paragraph
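Prompts containing spaces or punctuation must be URL-encoded before they go into the query string. A small helper (my own, assuming the query-parameter page above) keeps that consistent across test cases:

```javascript
// Sketch: build an encoded test URL for the query-parameter page.
// encodeURIComponent makes spaces and punctuation URL-safe.
function buildTestUrl(base, prompt) {
  return `${base}?prompt=${encodeURIComponent(prompt)}`;
}

// buildTestUrl("http://localhost:3000", "Explain JWT in one paragraph")
// → "http://localhost:3000?prompt=Explain%20JWT%20in%20one%20paragraph"
```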
Use Apidog to model the official OpenAI request you may migrate to later.
Download Apidog and create two environments:
- puter-prototype: your localhost page that runs Puter.js
- openai-prod: https://api.openai.com/v1
For broader API testing patterns, see API testing tool for QA engineers.
FAQ
Is Puter unlimited for developers?
Yes. The developer does not pay for usage through their own OpenAI account. Usage is charged to the signed-in user’s Puter balance.
Do I need an OpenAI account?
No. Puter handles the OpenAI relationship. Your app does not need an OpenAI API key.
Can I use this in production?
Yes, for browser-based apps. The key product question is whether your users are willing to sign in to Puter.
Does GPT-5.5 through Puter behave the same as the official API?
The model output should come from the same OpenAI model because Puter calls OpenAI on the user’s behalf. Latency may differ because there is an extra layer between your app and OpenAI.
Does Puter support prompt caching?
Puter does not expose OpenAI prompt caching pricing controls today. If prompt caching is important for your workload, use the official OpenAI API.
Can I use Puter from a backend service?
Not cleanly. Puter is browser-first and assumes a user session. Backend services should use the official OpenAI API.
For free server-side options, see How to use the GPT-5.5 API for free.
What model should I start with?
Use:
- gpt-5.5 for general chat and reasoning
- gpt-5.4-nano for high-volume classification
- gpt-5.5-pro for harder reasoning
- o3 for long reasoning chains
Will users be charged a lot?
Most chat-style usage costs cents per session at OpenAI-style rates. Image generation is usually more expensive than text. Use max_tokens, avoid unnecessary regeneration, and make cost-producing actions explicit in the UI.
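To build intuition, here is a rough per-request estimate at the GPT-5.5 rates quoted at the top of this article ($5 per million input tokens, $30 per million output tokens). Actual Puter billing may differ, so treat this as a ballpark only:

```javascript
// Sketch: rough per-request cost at the GPT-5.5 rates quoted above.
// Rates are USD per one million tokens; real Puter billing may differ.
function estimateCostUSD(inputTokens, outputTokens) {
  const INPUT_RATE = 5 / 1_000_000;   // $5 per 1M input tokens
  const OUTPUT_RATE = 30 / 1_000_000; // $30 per 1M output tokens
  return inputTokens * INPUT_RATE + outputTokens * OUTPUT_RATE;
}

// A typical chat turn of 1,000 input tokens and 500 output tokens
// comes out to roughly $0.02 — cents per session, as noted above.
```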
Can I generate images with Puter?
Yes. Use puter.ai.txt2img() with gpt-image-2 or DALL-E models. The user pays from their Puter balance.
For the official paid API guide, see How to use the GPT-Image-2 API.
Wrapping up
Puter.js is a practical way to add GPT-5.5, image generation, vision, function calling, streaming, and TTS to browser-based apps without managing an OpenAI key or paying for user traffic yourself.
Use Puter for prototypes, hackathon builds, static sites, browser extensions, and free public apps. Use the official OpenAI API for backend workloads, compliance requirements, prompt caching, the Responses API, Files API, or strict structured outputs.
Build and compare your requests in Apidog, test the migration path, and choose the integration model that fits your app.