Anthropic’s Claude models are strong choices for coding, agentic workflows, and long-context reasoning, but the official API cost can block small projects fast. Puter.js changes the billing model: you call Claude from the browser without an Anthropic API key, and usage is billed to the signed-in Puter user instead of your developer account.
This guide shows how to wire Claude into a browser app with Puter.js, choose a model, stream responses, maintain chat state, and understand when you should switch to the official Anthropic API.
TL;DR
- Puter.js lets browser apps call Claude without an Anthropic API key, backend, or developer-side billing.
- The end user signs in to Puter and covers their own usage.
- Supported models include Opus 4.7, Opus 4.6, Opus 4.6 Fast, Opus 4.5, Opus 4.1, Opus 4, Sonnet 4.6, Sonnet 4.5, Sonnet 4, and Haiku 4.5.
- Add one `<script>` tag, then call `puter.ai.chat()`.
- Streaming, system prompts, and multi-turn conversations are supported.
- Use Apidog to benchmark prompts against the official Anthropic API when you plan a migration.
How the Puter billing model works
With the official Anthropic API, you usually do this:
- Create an Anthropic account.
- Store an API key.
- Proxy requests through your backend.
- Pay for every user’s tokens.
With Puter.js, the flow changes:
- Your frontend loads Puter.js.
- The user signs in to Puter.
- Your app calls `puter.ai.chat()`.
- Usage is charged to the user's Puter account.
For you as the developer, that means:
- No API key in your repo
- No Anthropic billing account required
- No backend required for basic browser apps
- No shared usage cap across your whole user base
The main constraint: Puter.js is browser-first. If you need cron jobs, backend workers, Discord bots, batch processing, or server-side API routes, use the official Anthropic API instead.
Step 1: Add Puter.js
For a static page or quick prototype, use the CDN script:
<script src="https://js.puter.com/v2/"></script>
A minimal HTML file looks like this:
<!DOCTYPE html>
<html>
<body>
<script src="https://js.puter.com/v2/"></script>
</body>
</html>
If you are building with Vite, Webpack, or another bundler, install the package instead:
npm install @heyputer/puter.js
Then import it:
import { puter } from '@heyputer/puter.js';
Use the CDN for the fastest setup. Use the npm package when you want bundling, TypeScript support, or a production frontend build.
Step 2: Choose a Claude model
Puter exposes Claude models using Anthropic-style model IDs.
| Model ID | When to use |
|---|---|
| claude-opus-4-7 | Latest flagship; deepest reasoning and complex agentic work |
| claude-opus-4-6 | Prior flagship; strong coding and reasoning |
| claude-opus-4.6-fast | Lower-latency Opus variant |
| claude-opus-4-5 | Stable choice for production agents |
| claude-opus-4-1 | Legacy stable option |
| claude-opus-4 | Original Opus 4 baseline |
| claude-sonnet-4-6 | Default daily driver for most apps |
| claude-sonnet-4-5 | Prior Sonnet version; still useful for general tasks |
| claude-sonnet-4 | Sonnet 4 baseline |
| claude-haiku-4-5 | Fast option for classification and high-volume simple tasks |
Practical defaults:
- Use claude-sonnet-4-6 for most app features.
- Use claude-haiku-4-5 for fast classification, tagging, routing, or lightweight summaries.
- Use claude-opus-4-7 for complex code review, multi-step planning, and long-form reasoning.
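These defaults can be captured in a small helper. The task labels below are our own convention for illustration, not part of the Puter.js API:

```javascript
// Map a coarse task type to a default Claude model ID.
// The "classify" / "reason" labels are illustrative, not a Puter concept.
function pickModel(task) {
  switch (task) {
    case "classify": // tagging, routing, lightweight summaries
      return "claude-haiku-4-5";
    case "reason":   // code review, multi-step planning, long-form reasoning
      return "claude-opus-4-7";
    default:         // everyday app features
      return "claude-sonnet-4-6";
  }
}
```

You would then call `puter.ai.chat(prompt, { model: pickModel("classify") })` instead of hard-coding a model ID at every call site.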
Step 3: Make your first Claude call
Here is the smallest working example:
<!DOCTYPE html>
<html>
<body>
<script src="https://js.puter.com/v2/"></script>
<script>
puter.ai.chat(
"Explain quantum computing in simple terms",
{ model: "claude-sonnet-4-6" }
).then(response => {
puter.print(response.message.content[0].text);
});
</script>
</body>
</html>
Open the file in a browser. Puter handles the call and prompts the user to sign in or create a Puter account if needed.
The response shape mirrors Anthropic’s message format:
response.message.content[0].text
For simple text responses, read the first content block. For more complex responses, iterate over all blocks:
for (const block of response.message.content) {
if (block.type === "text") {
console.log(block.text);
}
}
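If you do this in several places, the loop above can be wrapped in a small helper that joins every text block into one string; this is a sketch built on the Anthropic-style response shape described here, so adjust it if the shape differs in your version of Puter.js:

```javascript
// Collect the text from every text-type content block into one string.
// Non-text blocks (if any) are skipped.
function extractText(message) {
  return message.content
    .filter(block => block.type === "text")
    .map(block => block.text)
    .join("");
}

// Usage: const text = extractText(response.message);
```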
Step 4: Stream long responses
For essays, code generation, and chat UIs, stream the response instead of waiting for the full answer.
const response = await puter.ai.chat(
"Write a detailed essay on the impact of artificial intelligence on society",
{
model: "claude-sonnet-4-6",
stream: true
}
);
for await (const part of response) {
  if (part?.text) {
    puter.print(part.text);
  }
}
In a real chat UI, append each streamed chunk to the current message:
const output = document.querySelector("#assistant-message");
const stream = await puter.ai.chat(
"Generate a checklist for securing an Express.js API",
{
model: "claude-sonnet-4-6",
stream: true
}
);
for await (const part of stream) {
if (part?.text) {
output.textContent += part.text;
}
}
Example HTML:
<div id="assistant-message"></div>
Step 5: Build a multi-turn conversation
For chat, pass an array of messages instead of a single string.
const messages = [
{
role: "user",
content: "I am building a Next.js app with Postgres."
},
{
role: "assistant",
content: "Got it. What do you need help with?"
},
{
role: "user",
content: "How should I structure the migrations folder?"
}
];
const response = await puter.ai.chat(messages, {
model: "claude-opus-4-7"
});
console.log(response.message.content[0].text);
To keep the conversation going, store the transcript and append each new turn:
const messages = [];
async function sendMessage(userText) {
messages.push({
role: "user",
content: userText
});
const response = await puter.ai.chat(messages, {
model: "claude-sonnet-4-6"
});
const assistantText = response.message.content[0].text;
messages.push({
role: "assistant",
content: assistantText
});
return assistantText;
}
Claude reads the full message array on each call, so keep the transcript trimmed if your app has very long conversations.
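One simple trimming strategy is to keep the leading system message (if any) plus only the most recent turns. This is a sketch; the cap of 20 messages is an arbitrary choice, and you may prefer token-based trimming or summarization instead:

```javascript
// Keep the leading system message (if present) plus the last maxTurns
// messages, so old turns stop being resent on every call.
function trimTranscript(messages, maxTurns = 20) {
  const system = messages[0]?.role === "system" ? [messages[0]] : [];
  const rest = messages.slice(system.length);
  return [...system, ...rest.slice(-maxTurns)];
}
```

Call it right before each `puter.ai.chat(trimTranscript(messages), { model })` so the array you store stays complete while the request stays bounded.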
Step 6: Add a system prompt
Use a system message to define behavior, tone, constraints, and output format.
const messages = [
{
role: "system",
content: "You are a senior backend engineer. Reply in numbered bullets, never more than five."
},
{
role: "user",
content: "How do I prevent SQL injection in a Node app?"
}
];
const response = await puter.ai.chat(messages, {
model: "claude-sonnet-4-6"
});
console.log(response.message.content[0].text);
Good system prompts are specific:
const systemPrompt = `
You are a TypeScript code reviewer.
Focus on correctness, security, and maintainability.
Return:
1. Critical issues
2. Suggested improvements
3. A corrected code snippet when useful
Keep the answer concise.
`;
Then pass it at the top of the message list:
const messages = [
{ role: "system", content: systemPrompt },
{ role: "user", content: "Review this function: ..." }
];
Step 7: Compare models with the same prompt
The fastest way to pick a model is to run the same prompt across multiple Claude variants.
const models = [
"claude-haiku-4-5",
"claude-sonnet-4-6",
"claude-opus-4-7"
];
const prompt = "Refactor this React component to use hooks: ...";
for (const model of models) {
const start = performance.now();
const response = await puter.ai.chat(prompt, { model });
const elapsed = performance.now() - start;
console.log(`${model}: ${elapsed.toFixed(0)}ms`);
console.log(response.message.content[0].text);
console.log("---");
}
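The timing logic in that loop can be factored into a small helper so every model is measured the same way; the helper itself is generic, and the `puter.ai.chat` call in the usage comment assumes Puter.js is loaded on the page:

```javascript
// Run an async call and return both its result and elapsed milliseconds.
async function timed(fn) {
  const start = performance.now();
  const result = await fn();
  return { result, ms: performance.now() - start };
}

// Usage sketch (assumes puter is loaded):
// const { result, ms } = await timed(() =>
//   puter.ai.chat(prompt, { model: "claude-sonnet-4-6" })
// );
```

Averaging several runs per model gives a fairer comparison than a single call, since first requests often pay extra latency.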
You will usually see this pattern:
- Haiku: fastest; best for simple and high-volume tasks.
- Sonnet: best default for most app features.
- Opus: strongest for difficult prompts, deeper reasoning, and complex code tasks.
To benchmark Puter’s browser path against the official Anthropic API in Apidog, keep both providers in the same collection and switch environments.
What you get with Puter.js
Puter.js gives you:
- Claude model access from the browser
- Multi-turn conversations
- System prompts
- Streaming responses
- No developer-side API key
- No developer-side Anthropic billing
- Browser-first production deployment path
Depending on the current Puter version, you may not get every official Anthropic API feature, such as:
- Native tool use / function calling
- Vision input
- Anthropic prompt caching controls
- Server-side execution without a browser user session
- Direct Anthropic rate-limit headers
For deeper tool workflows, the official Anthropic API or MCP server testing in Apidog gives you more control.
When to use Puter vs the official Anthropic API
Use Puter when:
- You are building a browser-based app.
- You do not want to manage an Anthropic API key.
- You are shipping a free public tool and want to avoid developer-side billing exposure.
- You are prototyping before committing to official API usage.
- Your users can sign in to Puter.
Use the official Anthropic API when:
- You need backend calls.
- You need cron jobs, workers, or batch processing.
- You need prompt caching controls.
- You need advanced tool use, vision input, or Files API support.
- You need compliance, contracts, or regional requirements.
- Your users will not accept a Puter sign-in flow.
A common path is:
- Prototype in the browser with Puter.
- Validate prompts and UX.
- Benchmark model behavior.
- Migrate to the official Anthropic API when you need backend control.
The message shape is similar, so the migration is manageable.
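As a sketch of what that migration looks like, the same transcript array maps onto the official Messages API request: system messages move into the top-level `system` field, and `max_tokens` becomes required. The API key handling and the `max_tokens` value of 1024 here are placeholders, and model IDs on the official API may differ from Puter's:

```javascript
// Build an official Anthropic Messages API request from a Puter-style
// transcript. System messages move to the top-level `system` field.
function buildAnthropicRequest(messages, model, apiKey) {
  const system = messages
    .filter(m => m.role === "system")
    .map(m => m.content)
    .join("\n");
  const body = {
    model,
    max_tokens: 1024, // required by the official API; tune per use case
    messages: messages.filter(m => m.role !== "system"),
  };
  if (system) body.system = system;
  return {
    url: "https://api.anthropic.com/v1/messages",
    method: "POST",
    headers: {
      "x-api-key": apiKey, // placeholder; keep real keys server-side
      "anthropic-version": "2023-06-01",
      "content-type": "application/json",
    },
    body: JSON.stringify(body),
  };
}
```

You would send the returned request with `fetch` from your backend, never from the browser, since the official API key must stay server-side.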
For the GPT equivalent, see How to use the GPT-5.5 API.
Testing the integration in Apidog
Puter calls run in the browser, so you usually do not test them like backend API requests. A practical workflow is:
- Create a small static page that loads Puter.js.
- Accept the prompt through a query parameter or form input.
- Use that page for browser-based Puter testing.
- Use Apidog to test the official Anthropic API surface.
- Keep both paths documented in the same project so migration is easier later.
Example static test page:
<!DOCTYPE html>
<html>
<body>
<pre id="output"></pre>
<script src="https://js.puter.com/v2/"></script>
<script type="module">
const params = new URLSearchParams(window.location.search);
const prompt = params.get("prompt") || "Say hello from Claude.";
const output = document.querySelector("#output");
const response = await puter.ai.chat(prompt, {
model: "claude-sonnet-4-6"
});
output.textContent = response.message.content[0].text;
</script>
</body>
</html>
Then run it locally and test prompts like:
http://localhost:5173/?prompt=Explain%20JWT%20authentication
Download Apidog and create two environments:
- puter-prototype: your local static page that uses Puter.js
- anthropic-prod: https://api.anthropic.com/v1
This lets you keep prompt tests, request examples, and migration notes in one place.
FAQ
Is this truly unlimited?
Unlimited from the developer side, yes. The end user has whatever balance is available in their Puter account. New Puter accounts include starter credit, and users can top up if they need more.
Do I need an Anthropic account?
No. Puter handles the Anthropic relationship. Your app does not need an Anthropic API key.
Can I use this in production?
Yes, for browser-based apps. The key product decision is whether your users are willing to sign in to Puter.
Does Claude through Puter behave the same as the official API?
The model output is expected to be the same because Puter calls Anthropic on the user’s behalf. Latency may be slightly different because Puter adds an extra layer between your app and Anthropic.
What about prompt caching?
Puter does not expose Anthropic’s prompt caching pricing controls today. If you rely on prompt caching for large stable prompts, use the official Anthropic API.
Can I use Puter for a Discord bot or backend service?
Not cleanly. Puter is browser-first and assumes a logged-in user session. For backend services, use the official Anthropic API.
Which model should I default to?
Use claude-sonnet-4-6 by default. Move to claude-opus-4-7 for harder reasoning tasks and claude-haiku-4-5 for fast, high-volume classification.
Will users be charged a lot?
Most chat-style usage costs cents per session at Anthropic-style rates. Casual users can run many conversations on starter credit before they need to top up.
Wrapping up
Puter.js is a practical way to add Claude to a browser app without managing Anthropic keys, billing, or backend infrastructure. Add the script, choose a model, call puter.ai.chat(), and let the signed-in user cover their own usage.
Use Puter for prototypes, hackathon projects, static sites, browser extensions, and free public apps. Use the official Anthropic API when you need backend execution, prompt caching, compliance controls, or advanced API features.
Build and benchmark your requests in Apidog, compare Puter with the official API, and choose the path that matches your deployment model.
