Learn how to log every OpenAI API call in production — inputs, outputs, tokens, latency, and costs. Complete guide with code examples for GPT-4o and LangChain.
Originally published at logwick.io/blog
OpenAI's dashboard shows you aggregate token usage and costs. What it doesn't show you is which specific calls failed, what prompts triggered unexpected responses, how latency varies by user, or which features are actually consuming your budget. For that you need your own logging.
Why OpenAI's built-in logging isn't enough
When a user reports that your AI feature gave them a bad response, the OpenAI dashboard can't tell you:
- What prompt triggered it
- What the model returned
- How long it took
- Whether the problem is reproducible
What you actually need in production:
Input/output logging — The exact prompt sent and response received for every call, so you can reproduce and debug issues.
Per-request token counts — Not just total usage, but tokens per call, per user, per feature — so you know what's expensive.
Latency tracking — How long each call takes. GPT-4o can vary from 500ms to 30 seconds depending on prompt length and load.
Error logging — Rate limit errors, timeouts, and content policy rejections need to be tracked separately from successful calls.
User attribution — Which of your users triggered each call, so you can debug customer-specific issues and allocate costs.
What to log on every OpenAI API call
At minimum, every log entry should capture these fields:
{
  "agent": "gpt-4o",
  "action": "email_draft",
  "status": "success",
  "input": "Draft a follow-up email...",
  "output": "Subject: Following up...",
  "tokens": 312,
  "latency_ms": 1842,
  "cost_usd": 0.0021,
  "user": "user@example.com"
}
The action field is important — it's the business context for the call. Not just "gpt-4o was called" but "an email was drafted". This lets you filter logs by feature and understand which parts of your product are working.
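One convention that helps (the names here are just examples, not anything Logwick requires): define your action names in one place so they stay consistent across the codebase and filterable later.

// Illustrative convention: one stable action name per product feature
export const ACTIONS = {
  EMAIL_DRAFT: 'email_draft',
  THREAD_SUMMARY: 'thread_summary',
  TICKET_REPLY: 'ticket_reply'
}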
Basic logging with fetch
The simplest approach — make your OpenAI call, then immediately fire a log. The log call is fire-and-forget so it never blocks your response.
const start = Date.now()

// Your existing OpenAI call — unchanged
const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: userPrompt }]
})

const output = response.choices[0].message.content
const latency = Date.now() - start

// Log immediately after — fire and forget
fetch('https://logwick.io/api/v1/logs', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer ' + process.env.LOGWICK_API_KEY,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    agent: 'gpt-4o',
    action: 'email_draft',
    status: 'success',
    input: userPrompt,
    output: output,
    tokens: response.usage.total_tokens,
    latency_ms: latency,
    user: currentUser.email,
    // Rough blended estimate ($5/M tokens) — see the cost section below
    // for the exact input/output split
    cost_usd: response.usage.total_tokens * 0.000005
  })
}).catch(() => {}) // never throws, never blocks

return output
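One practical caveat: prompts and completions can get large. A small helper keeps log payloads bounded (the 4,000-character cap here is an arbitrary choice, not a Logwick limit):

// Trim long strings before logging — the 4000-char cap is arbitrary
function truncate(text, max = 4000) {
  return text && text.length > max ? text.slice(0, max) + '…' : text
}

Run input and output through it before JSON.stringify and your logs stay cheap to store and fast to search.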
Building a reusable wrapper
If you're making OpenAI calls in multiple places, a wrapper function avoids repeating the logging code everywhere:
// lib/ai.js
import OpenAI from 'openai'

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY })
const LOGWICK_KEY = process.env.LOGWICK_API_KEY

export async function chat(messages, { action, user, model = 'gpt-4o' } = {}) {
  const start = Date.now()
  // Flatten the conversation into one string for the log's input field
  const prompt = messages.map(m => m.content).join('\n')
  try {
    const response = await openai.chat.completions.create({ model, messages })
    const output = response.choices[0].message.content
    log({ action, user, model, status: 'success',
          input: prompt, output,
          tokens: response.usage.total_tokens,
          latency_ms: Date.now() - start })
    return output
  } catch (err) {
    // Log the failure too, then rethrow so callers can handle it
    log({ action, user, model, status: 'error',
          input: prompt, output: err.message,
          latency_ms: Date.now() - start })
    throw err
  }
}

function log(data) {
  if (!LOGWICK_KEY) return // logging is optional — never break the app over it
  fetch('https://logwick.io/api/v1/logs', {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer ' + LOGWICK_KEY,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ agent: data.model, ...data })
  }).catch(() => {}) // fire and forget
}
Usage anywhere in your app:
import { chat } from './lib/ai'
const reply = await chat(messages, { action: 'email_draft', user: req.user.email })
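One caveat if you stream responses: token usage isn't included by default. The OpenAI API only reports it when you opt in via stream_options, where it arrives on the final chunk. A sketch of a streaming variant, reusing the log helper above (prompt and user assumed in scope):

// Streaming variant: usage arrives only on the final chunk when
// stream_options.include_usage is set
const start = Date.now()
const stream = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages,
  stream: true,
  stream_options: { include_usage: true }
})

let output = ''
let usage = null
for await (const chunk of stream) {
  output += chunk.choices[0]?.delta?.content ?? ''
  if (chunk.usage) usage = chunk.usage // final chunk: empty choices, usage set
}

log({ action: 'email_draft', user, model: 'gpt-4o', status: 'success',
      input: prompt, output,
      tokens: usage?.total_tokens,
      latency_ms: Date.now() - start })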
Using the Logwick SDK (fastest approach)
Install the SDK:
npm install logwick
Then wrap your existing OpenAI call — nothing else changes:
import { LogwickClient } from 'logwick'

const logwick = new LogwickClient({ apiKey: process.env.LOGWICK_API_KEY })

// One line wraps your existing call
const result = await logwick.openai(
  () => openai.chat.completions.create({
    model: 'gpt-4o',
    messages: [{ role: 'user', content: prompt }]
  }),
  { action: 'email_draft', user: req.user.email }
)

// result is the normal OpenAI response object
const reply = result.choices[0].message.content
The wrapper automatically captures timing, token usage, the input prompt, the output, and logs errors.
Tracking costs per user and feature
To track spend precisely, compute per-call cost from the usage object. GPT-4o pricing as of 2026 is $2.50 per million input tokens and $10.00 per million output tokens:
function calculateCost(usage) {
  // GPT-4o: $2.50/M input tokens, $10.00/M output tokens
  const inputCost = (usage.prompt_tokens / 1_000_000) * 2.50
  const outputCost = (usage.completion_tokens / 1_000_000) * 10.00
  return inputCost + outputCost
}

logwick.fire({
  agent: 'gpt-4o',
  action: 'email_draft',
  tokens: response.usage.total_tokens,
  cost_usd: calculateCost(response.usage),
  user: currentUser.email
})
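For example, a call that uses 1,200 prompt tokens and 400 completion tokens costs (1,200 / 1M) × $2.50 + (400 / 1M) × $10.00 = $0.003 + $0.004 = $0.007.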
By logging user and action alongside cost_usd, you can answer: which customer costs the most to serve, and which feature is most expensive.
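As a sketch of the analysis this enables — assuming the /logs endpoint returns a JSON array of entries shaped like the example near the top of this post — you can aggregate spend per user in a few lines:

// Hypothetical aggregation over fetched log entries
// (response shape assumed: [{ user, action, cost_usd, ... }])
const res = await fetch('https://logwick.io/api/v1/logs?from=2026-05-01', {
  headers: { 'Authorization': 'Bearer ' + process.env.LOGWICK_API_KEY }
})
const logs = await res.json()

const costPerUser = {}
for (const entry of logs) {
  costPerUser[entry.user] = (costPerUser[entry.user] ?? 0) + (entry.cost_usd ?? 0)
}
console.table(costPerUser)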
Logging errors and timeouts
The OpenAI errors you'll hit most often fall into three categories: rate limits (429), timeouts, and content policy rejections. Log all of them:
try {
  const response = await openai.chat.completions.create({ ... })
  logwick.fire({ status: 'success', ...data })
} catch (err) {
  logwick.fire({
    agent: 'gpt-4o',
    action: action,
    status: 'error',
    input: prompt,
    output: err.message,
    latency_ms: Date.now() - start,
    user: user,
    metadata: {
      error_type: err.constructor.name,
      error_code: err.status
    }
  })
  throw err
}
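If you want those three buckets normalized in your metadata, a small classifier helps. The instanceof checks below use real error classes from the openai SDK, but the content-policy test is a heuristic on the error message, not an official API:

import OpenAI from 'openai'

// Map an OpenAI SDK error to one of the three buckets above
function classifyError(err) {
  if (err instanceof OpenAI.RateLimitError) return 'rate_limit' // HTTP 429
  if (err instanceof OpenAI.APIConnectionTimeoutError) return 'timeout'
  // Heuristic: content policy rejections come back as 400s whose
  // message mentions the policy — adjust to taste
  if (err.status === 400 && /policy/i.test(err.message ?? '')) return 'content_policy'
  return 'other'
}

Log the result as metadata.error_type and you can chart error rates per category.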
Logging LangChain + OpenAI calls
If you're using LangChain, use a callback handler to log every LLM call automatically:
import { LLMChain } from 'langchain/chains'
import { LogwickCallbackHandler } from 'logwick'

const handler = new LogwickCallbackHandler(logwick, {
  user: req.user.email
})

// llm and prompt are your existing model and prompt template
const chain = new LLMChain({
  llm,
  prompt,
  callbacks: [handler]
})

const result = await chain.call({ input: userQuery })
// Logwick has already logged the call — no extra code needed
Viewing and searching your logs
Once you're logging, you can query via API:
# Get all errors since a given date
curl "https://logwick.io/api/v1/logs?status=error&from=2026-05-01" \
  -H "Authorization: Bearer sk-lw-your-key"

# Stream logs in real time via SSE
curl -N "https://logwick.io/api/v1/logs/stream?status=error" \
  -H "Authorization: Bearer sk-lw-your-key"

# Get stats for the last 7 days
curl "https://logwick.io/api/v1/stats?days=7" \
  -H "Authorization: Bearer sk-lw-your-key"
Or if you use Claude Desktop, connect the Logwick MCP server and ask in plain English: "Show me all failed email_draft calls from yesterday" or "How much did we spend on GPT-4o this week?"
Get started free
Logwick is free for up to 5,000 logs/month — no credit card required. Sign up and have your first OpenAI call logged in under 3 minutes.
- Free: 5,000 logs/month, 7-day retention
- Pro: $29/month, 100,000 logs, 90-day retention, webhooks
- npm: npm install logwick
- Docs: logwick.io/docs