DEV Community

Vaishali

Chat Completions vs OpenAI Responses API: What Actually Changed

While learning about structured outputs, I noticed something strange.
Almost every tutorial, course, and example I found was still using the Chat Completions API.

But the OpenAI documentation kept referencing something newer:
The Responses API.

At first I assumed it was just another wrapper around the same thing.
But the more I looked into it, the more it became clear:

The Responses API isn’t just a new endpoint.
It’s the direction OpenAI is pushing future AI applications.


πŸ€– A Quick Look at the Evolution

OpenAI APIs have gone through a few stages:

Completions API
↓
Chat Completions API
↓
Responses API

Each step moved the API closer to something easier to use inside real applications.

  • Completions β†’ simple text generation
  • Chat Completions β†’ conversation format
  • Responses API β†’ full AI system interface

The Responses API doesn't just rename endpoints β€” it simplifies how AI systems handle conversations, tools, and structured data.

It was built for modern capabilities like reasoning models, tool usage, and structured outputs.

Several small changes in the API design make it noticeably easier to build real applications.


🧩 Simpler Requests and Cleaner Responses

With the Chat Completions API, prompts are structured as message arrays.

Example:

from openai import OpenAI
client = OpenAI()

completion = client.chat.completions.create(
  model="gpt-5",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ]
)

print(completion.choices[0].message.content)

Two things stand out here:

  1. Requests require managing a messages array.
  2. Responses are nested inside a choices list.

Even when you only generate one response, you still have to access it like this:

completion.choices[0].message.content

The Responses API simplifies both sides.

from openai import OpenAI
client = OpenAI()

response = client.responses.create(
   model="gpt-5",
   instructions="You are a helpful assistant.",
   input="Hello!"
)

print(response.output_text)

Now:

  • requests use clearer fields like instructions and input
  • responses can be accessed directly with response.output_text

This removes unnecessary nesting and makes the API simpler to read and easier to work with.


πŸ” Handling Conversations Is Much Cleaner

With Chat Completions, you have to manually manage conversation history.

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"}
]

res1 = client.chat.completions.create(
    model="gpt-5",
    messages=messages
)

messages += [res1.choices[0].message]
messages += [{"role": "user", "content": "And its population?"}]

res2 = client.chat.completions.create(
    model="gpt-5",
    messages=messages
)

Every response has to be manually appended to the message history.

The Responses API introduces a much cleaner approach.

res1 = client.responses.create(
    model="gpt-5",
    input="What is the capital of France?",
    store=True
)

res2 = client.responses.create(
    model="gpt-5",
    input="And its population?",
    previous_response_id=res1.id,
    store=True
)

Here the API keeps track of context using:

previous_response_id

Instead of passing the entire conversation again, the model can continue reasoning from the previous response.


βš™οΈ Structured Outputs Are Cleaner Too

In Chat Completions, structured outputs are defined with response_format.

response = client.chat.completions.create(
  model="gpt-5",
  messages=[{"role":"user","content":"Jane, 54 years old"}],
  response_format={           # <--- Important
    "type": "json_schema",
    "json_schema": {
      "name": "person",
      "schema": {
        "type": "object",
        "properties": {
          "name": {"type": "string"},
          "age": {"type": "number"}
        }
      }
    }
  }
)

In the Responses API, this moves into a more intuitive structure:

response = client.responses.create(
  model="gpt-5",
  input="Jane, 54 years old",
  text={                       # <--- Important
    "format": {
      "type": "json_schema",
      "name": "person",
      "schema": {
        "type": "object",
        "properties": {
          "name": {"type": "string"},
          "age": {"type": "number"}
        }
      }
    }
  }
)

This makes structured output feel like a native capability of the API, rather than an add-on.
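Because the schema constrains the model to valid JSON, the text in `output_text` can be parsed directly with the standard library. A minimal sketch, using a sample string standing in for `response.output_text`:

```python
import json

# Sample text standing in for response.output_text from the request above
output_text = '{"name": "Jane", "age": 54}'

person = json.loads(output_text)
print(person["name"], person["age"])  # Jane 54
```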


πŸ›  Function Calling Is Simpler

Function calling also became cleaner in the Responses API.

In Chat Completions, functions are defined with an extra layer of nesting:

{
  "type": "function",
  "function": {
    "name": "get_weather",
    "description": "Determine weather in my location",
    "parameters": {
      "type": "object",
      "properties": {
        "location": {
          "type": "string"
        }
      }
    }
  }
}

The Responses API removes that unnecessary wrapper and simplifies the structure:

{
  "type": "function",
  "name": "get_weather",
  "description": "Determine weather in my location",
  "parameters": {
    "type": "object",
    "properties": {
      "location": {
        "type": "string"
      }
    }
  }
}

The schema now lives directly inside the tool definition itself, which makes function definitions easier to read and maintain.

Another small but important difference:

  • Chat Completions functions are non-strict by default
  • Responses API functions are strict by default

This means the model is more likely to follow the defined schema without extra validation logic.
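For comparison, here is a sketch of what an explicitly strict definition looks like. In strict mode, every property must appear in `required` and `additionalProperties` must be false:

```python
# Sketch of an explicitly strict tool definition (Responses API style).
# Strict schemas must list every property in "required" and set
# "additionalProperties" to False.
get_weather_tool = {
    "type": "function",
    "name": "get_weather",
    "description": "Determine weather in my location",
    "strict": True,  # the Responses API default, shown here explicitly
    "parameters": {
        "type": "object",
        "properties": {
            "location": {"type": "string"}
        },
        "required": ["location"],
        "additionalProperties": False,
    },
}
```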


🧠 Built-in Tool Usage

Another major difference is native tool support.

With Chat Completions, developers typically had to define and manage tools themselves.

Example:

import requests

def web_search(query):
    r = requests.get("https://api.example.com/search", params={"q": query})
    return r.json().get("results", [])

functions=[
  {
    "name": "web_search",
    "description": "Search the web",
    "parameters": {
      "type": "object",
      "properties": {
        "query": {"type": "string"}
      }
    }
  }
]
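And defining the function is only half the work: when the model returns a tool call, your code also has to parse the arguments and dispatch to the right function. A minimal sketch of that dispatch step, using a hard-coded tool call in place of a real model response:

```python
import json

def web_search(query):
    # Stub standing in for the real search function above
    return [f"result for {query}"]

# Hard-coded tool call, shaped like what Chat Completions returns
tool_call = {"name": "web_search", "arguments": '{"query": "capital of France"}'}

# Look up the requested function, parse its JSON arguments, and call it
available = {"web_search": web_search}
args = json.loads(tool_call["arguments"])
result = available[tool_call["name"]](**args)
print(result)  # ['result for capital of France']
```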

The Responses API introduces built-in tools that can be used directly.

Some examples available on the OpenAI platform include:

  • Web search
  • File search
  • Image generation
  • Code interpreter
  • Remote MCP servers
  • Skills

Instead of implementing these manually, you can simply specify the tool you want to use.

answer = client.responses.create(
    model="gpt-5",
    input="Who is the current president of France?",
    tools=[{"type": "web_search_preview"}]
)

print(answer.output_text)

The model can now use the tool inside the same request, making it easier to build tool-powered AI applications.


πŸ“ˆ Other Improvements

The Responses API also introduces several practical improvements:

  • Better performance with reasoning models
  • Lower costs through improved caching
  • Stateful context between requests
  • Built-in tool integrations
  • Future compatibility with upcoming models

These changes make it easier to build agent-like workflows without complex orchestration logic.


πŸ” So Should You Still Use Chat Completions?

Chat Completions still works and is widely used.

But OpenAI is clearly designing new models and features around the Responses API.

For new projects, the newer API often provides:

  • simpler requests
  • cleaner structured outputs
  • built-in tool support
  • better context management

🌱 The Takeaway

At first glance, the Responses API might look like a small change.
But it represents something bigger.

Earlier APIs treated LLMs like chat interfaces.

The Responses API treats them more like programmable systems β€” capable of reasoning, using tools, and maintaining context.

And that subtle change makes building AI systems much easier.
