Ash Inno
GPT-5.4 Complete Guide: What's New, API Access, and How to Use It

OpenAI just released GPT-5.4, and it's a significant leap forward. The new model posts an 83% win rate against industry professionals on knowledge work, uses 47% fewer tokens in tool-heavy workflows, and introduces native computer use capabilities that surpass human performance on certain benchmarks.

This guide combines everything you need to know: what GPT-5.4 is, how to access the API, and how to use it in your applications with working code examples.

What Is GPT-5.4?

GPT-5.4 is OpenAI's most advanced frontier model for professional work. It combines the coding excellence of GPT-5.3-Codex with enhanced reasoning, computer use, and tool integration into a single model.

[Image: GPT-5.4 benchmark results]

Key Improvements Over GPT-5.2

1. Factual Accuracy: False claims dropped 33% at the individual claim level. Full responses contain 18% fewer errors overall.

2. Token Efficiency: GPT-5.4 uses significantly fewer tokens to solve problems. In tool-heavy workflows with MCP Atlas benchmarks, token usage dropped 47% while maintaining accuracy.

3. Computer Use Capabilities: First general-purpose OpenAI model with native computer use:

  • Issues mouse and keyboard commands from screenshots
  • Automates browsers via Playwright
  • Navigates desktop environments through coordinate-based interactions
  • Achieves 75% success rate on OSWorld-Verified, surpassing human performance at 72.4%

4. Tool Search: Eliminates the need to load thousands of tool definitions into every request. The model looks up tool definitions on-demand.

Performance Benchmarks

| Benchmark | GPT-5.4 | GPT-5.2 |
| --- | --- | --- |
| GDPval (knowledge work) | 83.0% | 70.9% |
| SWE-Bench Pro (coding) | 57.7% | 55.6% |
| OSWorld-Verified (computer use) | 75.0% | 47.3% |
| BrowseComp (web research) | 82.7% | 65.8% |

Pricing

| Model | Input Price | Output Price |
| --- | --- | --- |
| GPT-5.4 | $2.50/M tokens | $15/M tokens |
| GPT-5.4 Pro | $30/M tokens | $180/M tokens |

Batch and Flex pricing available at 50% discount.

How to Access GPT-5.4 API

Getting started with GPT-5.4 API takes about 10-15 minutes. Here's the step-by-step process:

Step 1: Create OpenAI Account

Navigate to platform.openai.com and sign up. You'll need:

  • Email address
  • Password (minimum 8 characters)
  • Full name
  • Phone number for verification

Step 2: Set Up Billing

GPT-5.4 API uses pay-as-you-go pricing:

  1. Go to Settings > Billing
  2. Add payment method (Visa, Mastercard, or Amex)
  3. OpenAI performs a small authorization charge ($0.50-1.00) to verify the card

New accounts start with $5 credit (expires after 3 months) and $5/month usage limit. After first payment, limit increases to $120/month automatically.

Step 3: Generate API Key

  1. Navigate to platform.openai.com/api-keys
  2. Click "Create new secret key"
  3. Enter a descriptive name (e.g., "Development", "Production")
  4. Copy the key immediately; it will not be shown again

Key format: sk-proj- followed by alphanumeric string.

Step 4: Install OpenAI SDK

Python:

pip install openai

Node.js:

npm install openai

Step 5: Configure Environment

Store your API key in environment variables:

export OPENAI_API_KEY="sk-proj-abc123def456..."

Step 6: Make Your First Request

Python Example:

from openai import OpenAI
import os

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

response = client.chat.completions.create(
    model="gpt-5.4",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is GPT-5.4?"}
    ]
)

print(response.choices[0].message.content)

Node.js Example:

const OpenAI = require('openai');

const client = new OpenAI({
    apiKey: process.env.OPENAI_API_KEY
});

async function main() {
    const response = await client.chat.completions.create({
        model: 'gpt-5.4',
        messages: [
            { role: 'system', content: 'You are a helpful assistant.' },
            { role: 'user', content: 'What is GPT-5.4?' }
        ]
    });

    console.log(response.choices[0].message.content);
}

main();

Rate Limits

Default Tier 2 limits (after first payment):

  • 60 requests per minute
  • 150,000 tokens per minute
  • 1,000,000 tokens per day
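
To stay under the requests-per-minute ceiling, a client-side throttle helps. Here is a minimal sliding-window sketch (the 60 RPM figure comes from the limits above; actual limits vary by account tier):

```python
import time
from collections import deque

class RateLimiter:
    """Client-side sliding-window limiter (e.g., 60 requests/minute)."""

    def __init__(self, max_requests=60, window_seconds=60.0):
        self.max_requests = max_requests
        self.window = window_seconds
        self.timestamps = deque()

    def acquire(self):
        now = time.monotonic()
        # Drop timestamps that have fallen out of the window
        while self.timestamps and now - self.timestamps[0] >= self.window:
            self.timestamps.popleft()
        if len(self.timestamps) >= self.max_requests:
            # Sleep until the oldest request leaves the window
            time.sleep(self.window - (now - self.timestamps[0]))
        self.timestamps.append(time.monotonic())

limiter = RateLimiter(max_requests=60, window_seconds=60.0)
# limiter.acquire()  # call before each client.chat.completions.create(...)
```

This only smooths your own request rate; you still need retry logic (covered below) for the server-side 429s that slip through.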

How to Use GPT-5.4 API

Now let's dive into the advanced capabilities that make GPT-5.4 unique.

Computer Use API

GPT-5.4 can operate computers through screenshots, mouse commands, and keyboard input. This is useful for browser automation, data entry across applications, and testing workflows.

Basic Computer Use Setup:

from openai import OpenAI
import base64
import json
import pyautogui
import io

client = OpenAI()

def take_screenshot():
    screenshot = pyautogui.screenshot()
    buffer = io.BytesIO()
    screenshot.save(buffer, format='PNG')
    return base64.b64encode(buffer.getvalue()).decode()

def execute_command(command):
    action = command.get('action')

    if action == 'click':
        x, y = command.get('coordinate', [0, 0])
        pyautogui.click(x, y)
    elif action == 'type':
        pyautogui.write(command.get('text', ''), interval=0.05)
    elif action == 'keypress':
        pyautogui.press(command.get('key', ''))

    return take_screenshot()

# Start computer use workflow
screenshot = take_screenshot()

messages = [{
    "role": "user",
    "content": [
        {"type": "text", "text": "Navigate to gmail.com and check unread emails."},
        {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{screenshot}"}}
    ]
}]

response = client.chat.completions.create(
    model="gpt-5.4",
    messages=messages,
    tools=[{
        "type": "computer",
        "display_width": 1920,
        "display_height": 1080
    }],
    tool_choice="required"
)

# Parse and execute computer commands
for tool_call in response.choices[0].message.tool_calls:
    if tool_call.type == "computer":
        command = json.loads(tool_call.function.arguments)
        new_screenshot = execute_command(command)
        # Continue loop with new screenshot
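
The final comment glosses over the feedback loop. One way to continue is to append the assistant's tool calls plus a tool result carrying the new screenshot, then call the API again. The message shapes below are a sketch; the exact result format GPT-5.4's computer tool expects back is an assumption, not documented API:

```python
def append_tool_result(messages, assistant_message, tool_call_id, screenshot_b64):
    """Record the assistant's tool call and feed the resulting screenshot back.

    The data-URL result format is an assumption for this sketch; check
    OpenAI's docs for the authoritative shape.
    """
    messages.append(assistant_message)  # assistant turn containing tool_calls
    messages.append({
        "role": "tool",
        "tool_call_id": tool_call_id,
        "content": f"data:image/png;base64,{screenshot_b64}",
    })
    return messages

# Sketch of the loop: keep going until the model stops issuing commands
# while response.choices[0].message.tool_calls:
#     for tool_call in response.choices[0].message.tool_calls:
#         command = json.loads(tool_call.function.arguments)
#         new_screenshot = execute_command(command)
#         messages = append_tool_result(
#             messages, response.choices[0].message, tool_call.id, new_screenshot)
#     response = client.chat.completions.create(
#         model="gpt-5.4", messages=messages,
#         tools=[{"type": "computer", "display_width": 1920, "display_height": 1080}])
```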

Tool Search and Integration

Tool search loads tool definitions on demand instead of upfront; in the MCP Atlas benchmarks cited earlier, this cut token usage by 47%.

# Define available tools (lightweight list)
available_tools = [
    {"name": "get_weather", "description": "Get current weather for a location"},
    {"name": "send_email", "description": "Send an email to a recipient"},
    {"name": "calendar_search", "description": "Search calendar for events"}
]

response = client.chat.completions.create(
    model="gpt-5.4",
    messages=[
        {"role": "user", "content": "What's the weather in Tokyo and send it to my team?"}
    ],
    tools=available_tools,
    tool_choice="auto"
)

Vision and Image Processing

GPT-5.4 supports high-resolution image input: the original detail level preserves images at up to 10.24M pixels.

response = client.chat.completions.create(
    model="gpt-5.4",
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "image_url",
                "image_url": {
                    "url": "https://example.com/image.jpg",
                    "detail": "original"  # or "high" or "low"
                }
            },
            {"type": "text", "text": "Analyze this technical diagram."}
        ]
    }]
)

Long Context Workflows

GPT-5.4 supports up to 1M token context windows (experimental).

# Standard context (272K tokens)
with open('large_codebase.py', 'r') as f:
    code = f.read()

response = client.chat.completions.create(
    model="gpt-5.4",
    messages=[
        {"role": "user", "content": f"Review this codebase:\n{code}"}
    ],
    max_tokens=4000
)

# Extended context (1M tokens) - experimental
response = client.chat.completions.create(
    model="gpt-5.4",
    messages=[{"role": "user", "content": large_document}],  # large_document: any string up to ~1M tokens
    extra_body={
        "model_context_window": 1048576,
        "model_auto_compact_token_limit": 272000
    }
)
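
Before sending a large file, it helps to estimate whether it fits the window. A rough heuristic is about 4 characters per token for English prose; the 272K and 1M limits are the figures quoted above, and real tokenizer counts will differ:

```python
def rough_token_count(text):
    """Very rough token estimate: ~4 characters per token for English prose."""
    return len(text) // 4

def pick_context_mode(text, standard_limit=272_000, extended_limit=1_048_576):
    """Decide whether an input fits the standard or experimental extended window."""
    tokens = rough_token_count(text)
    if tokens <= standard_limit:
        return "standard"
    if tokens <= extended_limit:
        return "extended"  # opt in via the experimental extra_body settings
    return "too_large"  # split the input or summarize in stages
```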

Streaming Responses

stream = client.chat.completions.create(
    model="gpt-5.4",
    messages=[{"role": "user", "content": "Write a detailed explanation."}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

Error Handling and Retry Logic

from openai import RateLimitError
import time

def make_request_with_retry(messages, max_retries=3):
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(
                model="gpt-5.4",
                messages=messages
            )
        except RateLimitError as e:
            if attempt == max_retries - 1:
                raise
            wait_time = 2 ** attempt
            time.sleep(wait_time)

response = make_request_with_retry([
    {"role": "user", "content": "Hello, GPT-5.4!"}
])

Development Workflow Tips

When integrating GPT-5.4 into applications, having solid testing and debugging workflows accelerates development. Here are some approaches that work well:

Test Before Coding: Before writing integration code, validate your API requests visually. Tools like Apidog let you configure requests with headers, authentication, and body parameters, then inspect responses and generate code snippets in Python, Node.js, or cURL. This helps you understand the API structure before implementing.

Environment Management: Use environment variables to manage different API keys across development, staging, and production. This keeps credentials separate from request definitions and makes switching between environments straightforward.
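
One way to keep credentials separate per environment; the OPENAI_API_KEY_DEV naming is a convention for this sketch, not anything OpenAI requires:

```python
import os

def api_key_for(env):
    """Look up the API key for an environment (dev/staging/prod)."""
    var = f"OPENAI_API_KEY_{env.upper()}"
    key = os.environ.get(var)
    if key is None:
        raise KeyError(f"{var} is not set")
    return key

# client = OpenAI(api_key=api_key_for(os.environ.get("APP_ENV", "dev")))
```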

Automated Testing: Create test suites that cover success and error cases. Test authentication failures, rate limit handling, and response validation. Mock GPT-5.4 responses during frontend development to avoid token costs.
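
Mocking the client keeps test runs free of token costs. A sketch using unittest.mock, mirroring the response shape from the Chat Completions examples above:

```python
from unittest.mock import MagicMock

def make_mock_client(reply="This is a canned reply."):
    """Build a stand-in client whose chat.completions.create returns a fixed reply."""
    client = MagicMock()
    message = MagicMock()
    message.content = reply
    choice = MagicMock()
    choice.message = message
    response = MagicMock()
    response.choices = [choice]
    client.chat.completions.create.return_value = response
    return client

client = make_mock_client("Hello from the mock!")
response = client.chat.completions.create(model="gpt-5.4", messages=[])
# response.choices[0].message.content == "Hello from the mock!"
```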

Documentation: Keep API documentation synchronized with implementation. Auto-generate docs from tested requests so they stay current as you add features.

Cost Optimization Strategies

1. Use Cached Inputs: Repeated system prompts cost 90% less ($0.25 vs $2.50 per million tokens).

2. Optimize Prompts: Shorter prompts mean fewer input tokens. Be direct and remove filler.

3. Limit Output Tokens: Set the max_tokens parameter to cap response length and prevent rambling responses.

4. Use Batch Processing: 50% discount for non-real-time workloads processed within 24 hours.

5. Cache Responses: For identical requests, cache responses to avoid redundant API calls.

Example Cost Calculation:

Processing 10,000 queries monthly:

  • Average input: 500 tokens per query
  • Average output: 200 tokens per response
  • Total: 5M input + 2M output tokens

Standard pricing: $12.50 + $30.00 = $42.50/month

With Batch pricing (50% off): $21.25/month
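
The arithmetic above can be wrapped in a small estimator, using the standard prices from the pricing table:

```python
def monthly_cost(queries, in_tokens, out_tokens,
                 in_price=2.50, out_price=15.00, batch_discount=0.0):
    """Estimate monthly API cost in USD; prices are per million tokens."""
    total_in = queries * in_tokens / 1_000_000    # input tokens, in millions
    total_out = queries * out_tokens / 1_000_000  # output tokens, in millions
    cost = total_in * in_price + total_out * out_price
    return round(cost * (1 - batch_discount), 2)

print(monthly_cost(10_000, 500, 200))                      # 42.5
print(monthly_cost(10_000, 500, 200, batch_discount=0.5))  # 21.25
```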

Conclusion

GPT-5.4 delivers measurable improvements across knowledge work, computer use, and coding tasks. The combination of reduced hallucinations (33% fewer false claims), improved token efficiency (47% reduction in tool-heavy workflows), and native computer use capabilities (75% success rate on OSWorld-Verified) makes it suitable for production applications.

Getting Started Checklist:

  1. Create OpenAI account and add billing
  2. Generate API key and store securely
  3. Install OpenAI SDK
  4. Test basic requests
  5. Implement error handling and retry logic
  6. Add monitoring and cost tracking
  7. Gradually adopt advanced features (computer use, tool search, vision)

Start with basic chat completions, then layer in computer use, tool search, and vision as your use cases require. Monitor costs closely during initial deployment and optimize prompts and caching strategies.

