Hassann

Originally published at apidog.com

How to Test the ChatGPT API with Apidog: Auth, Streaming, Tools, and CI

The ChatGPT API changes quickly: request/response contracts shift, streaming fails differently from non-streaming, function calling adds JSON Schema validation, and every bad test still costs tokens. If you debug only with a Python REPL or a curl loop, you spend time and money finding issues that should be caught before production.


This guide shows how to build a reusable ChatGPT API test workflow in Apidog: authentication, basic chat completions, streaming SSE, tool calls, error handling, rate-limit checks, mocks, and CI scenarios. The goal is a project you can run before every prompt or model change.

TL;DR

  • Add https://api.openai.com/v1 as an Apidog environment.
  • Store OPENAI_API_KEY as a secret variable.
  • Apply Bearer auth at the folder level.
  • Save one /chat/completions request and reuse it across models.
  • Test streaming responses in Apidog’s SSE response view.
  • Validate tool_calls arguments against your expected schema.
  • Mock ChatGPT responses for frontend development.
  • Save the full workflow as a test scenario and run it in CI.

Why test the ChatGPT API?

OpenAI’s API surface can change across models, endpoints, and response modes. Common sources of regressions include:

  • The deprecated function_call field vs. the tool_calls array
  • Strict mode for tool schemas
  • Reasoning models such as o1 and o3 rejecting parameters like temperature
  • Structured output via response_format: { "type": "json_schema" }
  • Streaming tool-call deltas arriving in partial chunks
  • /v1/responses overlapping with /v1/chat/completions

If your app calls the API directly without a contract test layer, prompt changes can break production behavior silently. An Apidog project gives you a repeatable request collection: replay the same request, inspect the response, assert the shape, and fail early when the contract changes.

Step 1: Create an OpenAI environment in Apidog

Create a new project in Apidog. Open Environment Management and add an environment named OpenAI Prod.

| Variable | Value |
| --- | --- |
| baseUrl | https://api.openai.com/v1 |
| OPENAI_API_KEY | sk-proj-... (stored as Secret) |
| defaultModel | gpt-5.5 |

Mark OPENAI_API_KEY as a secret. This keeps it masked in shared workspaces and prevents it from being written into exported collections. Teammates can use the same variable name while supplying their own key.

Step 2: Apply Bearer auth at the folder level

Create a folder named ChatGPT.

In the folder settings:

  1. Open Auth.
  2. Select Bearer Token.
  3. Set the token to:
{{OPENAI_API_KEY}}

Every request in the folder now inherits the Authorization header.

This keeps request bodies clean and makes key rotation a single environment edit instead of a manual update across every request.
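
For reference, folder-level Bearer auth amounts to attaching one header to every request. The sketch below shows the equivalent header construction in plain JavaScript. `buildOpenAiHeaders` is a hypothetical helper name used for illustration; it is not part of Apidog or any OpenAI SDK.

```javascript
// Hypothetical helper: builds the headers that folder-level Bearer auth
// would attach to each request in the ChatGPT folder.
function buildOpenAiHeaders(apiKey) {
  if (typeof apiKey !== "string" || apiKey.trim() === "") {
    throw new Error("OPENAI_API_KEY is missing or empty");
  }
  return {
    Authorization: `Bearer ${apiKey.trim()}`,
    "Content-Type": "application/json",
  };
}

// Usage (assumes the key is exported in the shell environment):
// const headers = buildOpenAiHeaders(process.env.OPENAI_API_KEY);
```

Failing fast on an empty key turns a confusing 401 from the API into a clear local error.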

Step 3: Build a basic chat completion request

Inside the ChatGPT folder, create a request:

  • Method: POST
  • URL: {{baseUrl}}/chat/completions
  • Body type: JSON
{
  "model": "{{defaultModel}}",
  "messages": [
    {
      "role": "system",
      "content": "You are a senior backend engineer. Answer in under 100 words."
    },
    {
      "role": "user",
      "content": "What's the difference between idempotent and safe HTTP methods?"
    }
  ],
  "temperature": 0.2
}

Click Send.

Expected response:

  • Status: 200
  • Answer: choices[0].message.content
  • Token usage: usage.total_tokens
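
The three expectations above can also be checked in application code. This is a minimal sketch of such a shape check; `readChatCompletion` is a hypothetical name, and the function only inspects the fields this guide mentions plus `finish_reason`.

```javascript
// Minimal shape check for a non-streaming chat completion body.
// Returns the extracted fields, or throws with a descriptive message.
function readChatCompletion(body) {
  const choice = body?.choices?.[0];
  if (!choice) {
    throw new Error("Response has no choices[0]");
  }
  // content can be null when the model answers with tool calls instead.
  const content = choice.message?.content ?? null;
  const totalTokens = body.usage?.total_tokens;
  if (typeof totalTokens !== "number") {
    throw new Error("Response has no usage.total_tokens");
  }
  return { content, totalTokens, finishReason: choice.finish_reason ?? null };
}
```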

Save the request as:

chat-completion-basic

Troubleshooting:

  • 401: check that the selected environment is OpenAI Prod and the API key is set.
  • 429: you hit a rate limit. Add explicit rate-limit tests later in the workflow.
  • 404: check the model name.

Step 4: Test streaming responses with SSE

Streaming responses use text/event-stream, not normal JSON. Each event is a data: frame containing a partial delta.
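
To make the frame format concrete, here is a minimal sketch of a parser for a captured stream. It assumes the whole payload is already in memory and that each data: line holds one complete JSON object; a real client reading from the network must also buffer lines that are split across reads. `parseChatStream` is a hypothetical name.

```javascript
// Parses a raw text/event-stream payload from a streamed chat completion.
// Returns the assembled text, whether [DONE] was seen, and usage if present.
function parseChatStream(rawSse) {
  let text = "";
  let done = false;
  let usage = null;

  for (const line of rawSse.split(/\r?\n/)) {
    if (!line.startsWith("data:")) continue; // skip blank separator lines
    const payload = line.slice("data:".length).trim();

    if (payload === "[DONE]") {
      done = true; // sentinel, not JSON: never pass it to JSON.parse
      break;
    }

    const chunk = JSON.parse(payload);
    if (chunk.usage) usage = chunk.usage; // only sent when usage is requested
    const delta = chunk.choices?.[0]?.delta; // choices may be an empty array
    if (typeof delta?.content === "string") text += delta.content;
  }

  return { text, done, usage };
}
```

Checking for the [DONE] sentinel before calling JSON.parse avoids the most common streaming bug, and the optional chaining on choices tolerates chunks that carry only usage data.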

Duplicate chat-completion-basic, rename it to:

chat-completion-stream

Use this body:

{
  "model": "{{defaultModel}}",
  "stream": true,
  "messages": [
    {
      "role": "user",
      "content": "Stream the first 100 prime numbers, comma-separated."
    }
  ]
}

Send the request. Apidog displays the SSE stream as chunks arrive, so you can inspect the real frames instead of only the final assembled text.

Watch for these cases:

  • The stream ends with:
data: [DONE]
  • If your client tries to parse [DONE] as JSON, it will throw.
  • Streaming responses do not include usage by default.
  • If you need token counts, add:
"stream_options": {
  "include_usage": true
}
  • Tool-call deltas arrive in pieces and must be assembled by your client.
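
The last point is the one most clients get wrong, so here is a sketch of the assembly step. It assumes the chunks have already been parsed into objects and that each tool-call delta carries an index field identifying which call it extends. `assembleToolCalls` is a hypothetical name.

```javascript
// Accumulates streamed tool-call fragments into complete calls.
// `chunks` is an array of already-parsed stream chunks (objects).
function assembleToolCalls(chunks) {
  const calls = []; // slot position comes from each delta's `index` field

  for (const chunk of chunks) {
    const deltas = chunk.choices?.[0]?.delta?.tool_calls ?? [];
    for (const d of deltas) {
      if (!calls[d.index]) {
        calls[d.index] = { id: "", name: "", arguments: "" };
      }
      const call = calls[d.index];
      if (d.id) call.id = d.id;
      if (d.function?.name) call.name += d.function.name;
      if (d.function?.arguments) call.arguments += d.function.arguments;
    }
  }

  // Parse only after the stream has ended; partial JSON would throw.
  return calls
    .filter(Boolean)
    .map((c) => ({ ...c, parsedArguments: JSON.parse(c.arguments) }));
}
```

The arguments string is concatenated across chunks and parsed once at the end, because any individual fragment is usually not valid JSON on its own.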

Step 5: Test function calling and tool use

Tool calling is where many prompt changes break downstream code. The model returns a tool_calls array, and your app must parse function.arguments as JSON.

Create a request named:

chat-completion-tools

Use this body:

{
  "model": "{{defaultModel}}",
  "messages": [
    {
      "role": "user",
      "content": "What is the weather in Singapore right now?"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get current weather for a city.",
        "parameters": {
          "type": "object",
          "properties": {
            "city": {
              "type": "string"
            },
            "unit": {
              "type": "string",
              "enum": ["c", "f"]
            }
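          },
          "required": ["city", "unit"],
          "additionalProperties": false
        },
        "strict": true
      }
    }
  ],
  "tool_choice": "auto"
}

With strict set to true, OpenAI requires every property to be listed in required and additionalProperties to be false, so the schema above does both. A strict schema that omits either one is rejected before the model runs.

A valid response should contain: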

choices[0].message.tool_calls[0].function.name

with the value:

get_weather

The function.arguments field should be a JSON string that parses into an object similar to:

{
  "city": "Singapore",
  "unit": "c"
}

Add tests in the request’s Tests tab:

// Contract check: the model chose the expected tool.
pm.test("Tool was called", () => {
  const body = pm.response.json();
  const call = body.choices[0].message.tool_calls?.[0];

  pm.expect(call?.function?.name).to.eql("get_weather");
});

// function.arguments is a JSON string, not an object, so parse it first.
pm.test("Arguments parse as valid JSON", () => {
  const body = pm.response.json();
  const rawArgs = body.choices[0].message.tool_calls?.[0]?.function?.arguments;

  // Fail with a clear assertion instead of a TypeError when no tool was called.
  pm.expect(rawArgs).to.be.a("string");
  const args = JSON.parse(rawArgs);

  pm.expect(args.city).to.be.a("string");
});

Run the request. These tests now act as a contract for your tool-calling behavior.
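
If you want the same contract enforced in application code, a hand-rolled check is often enough for a schema this small. The sketch below mirrors the get_weather parameters (city as a string, unit as "c" or "f" when present). Flagging unknown keys is an extra assumption of this sketch, and `validateWeatherArgs` is a hypothetical name.

```javascript
// Hand-rolled check mirroring the get_weather parameters schema.
// Returns a list of problems; an empty list means the arguments conform.
function validateWeatherArgs(rawArguments) {
  let args;
  try {
    args = JSON.parse(rawArguments);
  } catch {
    return ["arguments is not valid JSON"];
  }
  if (args === null || typeof args !== "object" || Array.isArray(args)) {
    return ["arguments must be a JSON object"];
  }

  const problems = [];
  if (typeof args.city !== "string") {
    problems.push("city must be a string");
  }
  if ("unit" in args && !["c", "f"].includes(args.unit)) {
    problems.push('unit must be "c" or "f"');
  }
  // Assumption of this sketch: unknown keys are treated as a violation.
  for (const key of Object.keys(args)) {
    if (!["city", "unit"].includes(key)) {
      problems.push(`unexpected property: ${key}`);
    }
  }
  return problems;
}
```

Returning a list of problems instead of throwing on the first one makes failed test output easier to read when several fields drift at once.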

Step 6: Add explicit error and rate-limit tests

Create separate requests for predictable production failures.

| Scenario | How to trigger | Expected |
| --- | --- | --- |
| Invalid key | Set OPENAI_API_KEY to sk-bad in a Sandbox environment | 401 with error.code = "invalid_api_key" |
| Rate limit | Run the request repeatedly in Apidog’s collection runner | 429 with Retry-After header |
| Token cap exceeded | Send a prompt larger than the model context window | 400 with error.code = "context_length_exceeded" |
| Bad model name | Set "model": "gpt-99" | 404 |
| Schema violation | Use strict: true and malformed tool input | Model rejects the tool or returns plain text |
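
On the client side, the HTTP-level scenarios above can be mapped to named failure modes so that calling code branches on a label instead of raw status codes. This is a sketch only: the labels and the name `classifyOpenAiError` are made up for illustration, and the status and error.code pairs are the ones listed in this guide.

```javascript
// Maps an HTTP status and parsed error body to one of the scenarios above.
// The returned labels are hypothetical names chosen for this sketch.
function classifyOpenAiError(status, body) {
  const code = body?.error?.code ?? null;
  if (status === 401) return "invalid_key";
  if (status === 429) return "rate_limit";
  if (status === 400 && code === "context_length_exceeded") {
    return "token_cap_exceeded";
  }
  if (status === 404) return "bad_model_name";
  return "unclassified";
}
```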

Example test for a rate-limit response:

pm.test("Rate limit response is handled", () => {
  pm.expect(pm.response.code).to.eql(429);
  pm.expect(pm.response.headers.has("Retry-After")).to.eql(true);
});

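The Retry-After header matters: read it from the response instead of hardcoding a fixed backoff. Treat the value as a number of seconds and tolerate fractional values. HTTP also permits an HTTP-date in this header, so parse it defensively and fall back to your own backoff when it is missing or unparseable.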
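
A minimal sketch of that logic follows. The fallback policy (exponential backoff starting at one second, capped at 30 seconds) and the name `retryDelayMs` are assumptions for illustration, not values taken from OpenAI's documentation.

```javascript
// Converts a Retry-After header value into a wait time in milliseconds.
// Falls back to capped exponential backoff when the header is unusable.
function retryDelayMs(retryAfter, attempt, nowMs = Date.now()) {
  // Assumed fallback policy: 1s, 2s, 4s, ... capped at 30s.
  const fallback = Math.min(1000 * 2 ** attempt, 30000);

  if (retryAfter == null || String(retryAfter).trim() === "") return fallback;

  const value = String(retryAfter).trim();
  const seconds = Number(value);
  if (Number.isFinite(seconds) && seconds >= 0) {
    return Math.ceil(seconds * 1000); // integer or fractional seconds
  }

  // HTTP-date form, for example "Wed, 21 Oct 2015 07:28:00 GMT".
  const dateMs = Date.parse(value);
  if (!Number.isNaN(dateMs)) return Math.max(0, dateMs - nowMs);

  return fallback;
}
```

Passing nowMs as a parameter keeps the function deterministic in tests while defaulting to the current time in normal use.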

Step 7: Mock ChatGPT for frontend development

Frontend teams often need realistic responses before the final backend prompt is ready. Instead of spending OpenAI tokens during UI work, mock the API in Apidog.

In the ChatGPT folder:

  1. Right-click chat-completion-basic.
  2. Select Smart Mock.
  3. Enable the mock.

Apidog returns a synthetic response matching the OpenAI response shape, including:

  • id
  • object
  • created
  • model
  • choices
  • usage

The mock URL looks like:

https://mock.apidog.com/m1/<projectId>/chat/completions

It accepts the same request body as the real endpoint.
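
For frontend unit tests that should not hit any network at all, a local fixture with the same top-level fields can stand in for the mock server. This sketch is not what Apidog generates; `makeMockCompletion`, the placeholder id, and the zeroed usage numbers are all assumptions for illustration.

```javascript
// Builds a synthetic object with the same top-level fields as a
// chat completion response. All values are placeholders.
function makeMockCompletion(content, model = "mock-model") {
  return {
    id: "chatcmpl-mock-0001", // placeholder, not a real completion id
    object: "chat.completion",
    created: Math.floor(Date.now() / 1000), // Unix time in seconds
    model,
    choices: [
      {
        index: 0,
        message: { role: "assistant", content },
        finish_reason: "stop",
      },
    ],
    usage: { prompt_tokens: 0, completion_tokens: 0, total_tokens: 0 },
  };
}
```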

For streaming mocks, define an Advanced Mock script that writes SSE chunks:

data: { ... }\n\n

at a fixed interval, such as 50ms. This lets the frontend test token rendering, loading states, and stream termination without calling OpenAI.
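
The frame generation itself can be sketched in plain JavaScript. This is not Apidog's Advanced Mock scripting API, which this guide does not show; it only illustrates how the chunks are shaped and paced. `buildMockSseFrames` and `streamFrames` are hypothetical names, and `write` stands for whatever function your mock environment uses to emit data.

```javascript
// Splits text into word-sized pieces and wraps each one in a
// chat.completion.chunk frame, ending with the [DONE] sentinel.
function buildMockSseFrames(text, model = "mock-model") {
  const pieces = text.match(/\S+\s*/g) ?? [];
  const frames = pieces.map((piece) => {
    const chunk = {
      id: "chatcmpl-mock-0001", // placeholder id
      object: "chat.completion.chunk",
      model,
      choices: [{ index: 0, delta: { content: piece }, finish_reason: null }],
    };
    return `data: ${JSON.stringify(chunk)}\n\n`;
  });
  frames.push("data: [DONE]\n\n"); // terminator the client must not JSON-parse
  return frames;
}

// Emits frames at a fixed interval; `write` is any function that
// accepts a string. Resolves once every frame has been written.
function streamFrames(frames, write, intervalMs = 50) {
  return new Promise((resolve) => {
    let i = 0;
    const timer = setInterval(() => {
      if (i < frames.length) {
        write(frames[i++]);
      } else {
        clearInterval(timer);
        resolve();
      }
    }, intervalMs);
  });
}
```

Keeping frame construction separate from pacing lets the frontend tests reuse the same frames with a zero-delay writer.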

When the real backend prompt is ready, switch the frontend base URL back to:

https://api.openai.com/v1

The request body stays the same.

Step 8: Save everything as a CI test scenario

Use Apidog Test Scenarios to chain the requests and run them headlessly.

Build a scenario that:

  1. Calls chat-completion-basic
    • Assert status === 200
    • Assert usage.total_tokens > 0
  2. Calls chat-completion-stream
    • Assert the SSE stream finishes with [DONE]
  3. Calls chat-completion-tools
    • Assert the expected tool was called
    • Assert the arguments parse as valid JSON
  4. Calls each error scenario
    • Assert the expected status code
    • Assert important headers or error fields
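
Conceptually, a scenario is an ordered list of named checks whose combined result decides whether the build passes. The sketch below shows that idea against already-captured responses. It is not how Apidog implements scenarios, and `runChecks` is a hypothetical name.

```javascript
// Runs named checks in order and collects results. Each entry is
// { name, check }, where check throws an Error on failure.
function runChecks(checks) {
  const results = checks.map(({ name, check }) => {
    try {
      check();
      return { name, ok: true };
    } catch (err) {
      return { name, ok: false, error: err.message };
    }
  });
  return { passed: results.every((r) => r.ok), results };
}

// Usage in a CI script (captured is a hypothetical recorded response):
// const outcome = runChecks([
//   { name: "status is 200", check: () => { if (captured.status !== 200) throw new Error("status"); } },
// ]);
// process.exitCode = outcome.passed ? 0 : 1;
```

Running every check instead of stopping at the first failure gives a complete report per pipeline run.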

Run the scenario in CI:

apidog-cli run scenario.json --env "OpenAI Prod"

Wire this into the pull request pipeline for prompt changes. Each prompt update now runs against the live API before merge.

FAQ

Does this work with Azure OpenAI?

Yes. Change baseUrl to your Azure OpenAI resource URL, add the required api-version query parameter, and use the api-key header instead of Bearer auth. The request bodies are otherwise similar.
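
As a rough sketch of the differences, the helper below builds the URL and headers for an Azure OpenAI chat completions call. The resource name, deployment name, and api-version value are placeholders you supply from your own Azure resource, and `buildAzureRequest` is a hypothetical name. Note that Azure addresses a deployment in the URL path, so check the current Azure documentation for the api-version your resource supports.

```javascript
// Builds the URL and headers for an Azure OpenAI chat completions call.
// resourceName, deployment, and apiVersion are placeholders you supply.
function buildAzureRequest({ resourceName, deployment, apiVersion, apiKey }) {
  const url = new URL(
    `https://${resourceName}.openai.azure.com/openai/deployments/` +
      `${encodeURIComponent(deployment)}/chat/completions`
  );
  url.searchParams.set("api-version", apiVersion);
  return {
    url: url.toString(),
    // Azure uses the api-key header instead of Authorization: Bearer.
    headers: { "api-key": apiKey, "Content-Type": "application/json" },
  };
}
```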

Can I use this for o1 and o3 reasoning models?

Yes, but create a separate folder for reasoning models. These models may reject parameters such as:

  • temperature
  • top_p
  • presence_penalty
  • frequency_penalty

Use a stripped-down request body for that folder.
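
If the same code path serves both model families, the stripping can be done programmatically. This sketch removes exactly the four parameters listed above and nothing else; `stripSamplingParams` is a hypothetical name, and whether a given model rejects each parameter should be confirmed against the API itself.

```javascript
// Parameters this guide lists as potentially rejected by reasoning models.
const UNSUPPORTED_FOR_REASONING = [
  "temperature",
  "top_p",
  "presence_penalty",
  "frequency_penalty",
];

// Returns a shallow copy of the request body without those parameters.
// The original object is left untouched.
function stripSamplingParams(body) {
  const copy = { ...body };
  for (const key of UNSUPPORTED_FOR_REASONING) {
    delete copy[key];
  }
  return copy;
}
```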

How do I version prompts inside Apidog?

Use Apidog branches. Create one branch per prompt experiment, run the test scenario against the live API, compare output and token usage, then merge the branch that passes your checks.

What about /v1/responses?

Create a separate folder for /v1/responses. The auth and base URL stay the same, but the request body shape differs. Keeping both folders lets you compare behavior against the same prompts.
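
For the simplest case, the difference is that the message list moves from messages to input. The sketch below converts only model and messages and deliberately drops everything else, because other fields (tools, output format, token limits) are shaped differently and need per-field mapping from the API reference. `toResponsesBody` is a hypothetical name.

```javascript
// Converts a basic chat-completions body to a minimal /v1/responses body.
// Only `model` and `messages` are carried over; other fields are dropped
// on purpose because their shapes differ between the two endpoints.
function toResponsesBody(chatBody) {
  if (!Array.isArray(chatBody.messages)) {
    throw new Error("chatBody.messages must be an array");
  }
  return {
    model: chatBody.model,
    input: chatBody.messages.map(({ role, content }) => ({ role, content })),
  };
}
```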

Does Apidog charge per API call?

No. OpenAI charges per token. Apidog does not sit between your application and OpenAI for these requests.

Wrap up

The ChatGPT API will keep changing. Streaming behavior, tool schemas, model parameters, and response shapes can all break application code. A reusable Apidog project gives you a controlled workflow: request collection, mocks, assertions, and CI execution.

Download Apidog and import your existing OpenAI calls. Build the requests above once, then run them before every prompt, model, or endpoint change.
