Pavel Espitia

Posted on Jun 22

Why I Log response.model on Every Claude Call (and You Should Too)

#ai #typescript #productivity #security

It is a one-line habit that has saved me more debugging time than any clever abstraction: I log which model actually answered every Claude request. Not which model I asked for. Which one responded. In 2026, with model fallbacks, fast-changing model strings, and routing logic, those are not always the same thing. Here is why the gap exists and what it has caught.

The request model and the response model can differ

You send model: "claude-fable-5". You assume Fable 5 answered. But the response object tells you what actually served the request:

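import Anthropic from "@anthropic-ai/sdk";

// The client reads ANTHROPIC_API_KEY from the environment by default.
const client = new Anthropic();

const response = await client.messages.create({
  model: "claude-fable-5",
  max_tokens: 16000,
  thinking: { type: "adaptive" },
  messages: [{ role: "user", content: prompt }],
});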

console.log("requested fable-5, served:", response.model);

Most of the time they match. The interesting cases are when they do not.
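
To make the comparison happen on every call rather than only where I remember to log it, the check can live in a thin wrapper. This is a minimal sketch: `servedModelRecord`, `withServedModelLog`, and the request/response shapes are hypothetical names for illustration, and the only thing assumed about the SDK is that the response carries a `model` string.

```typescript
// Hypothetical minimal shapes; a real SDK request and response carry many more fields.
interface ModelRequest { model: string }
interface ModelResponse { model: string }

interface ServedModelRecord {
  requested: string;
  served: string;
  matched: boolean;
}

// Pure comparison: prefix match, so a served ID that extends the requested one still counts.
function servedModelRecord(requested: string, served: string): ServedModelRecord {
  return { requested, served, matched: served.startsWith(requested) };
}

// Wraps any send function so every call reports requested vs. served model.
function withServedModelLog<Req extends ModelRequest, Res extends ModelResponse>(
  send: (req: Req) => Promise<Res>,
  log: (record: ServedModelRecord) => void,
): (req: Req) => Promise<Res> {
  return async (req: Req): Promise<Res> => {
    const res = await send(req);
    log(servedModelRecord(req.model, res.model));
    return res;
  };
}
```

In practice `send` would be a small function that forwards to `client.messages.create`, and `log` would hand the record to whatever structured logger the codebase already uses.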

The Fable 5 safeguard fallback

The reason this matters most in 2026 is Fable 5's safeguards. Fable 5 has hard guardrails in cybersecurity, biology, chemistry, and health. If your prompt trips one, the request does not just refuse. It silently falls back to Opus 4.8 to produce a safe answer.

For my security work, this is a sharp edge. I run contract-analysis prompts that can look adversarial to a safeguard. If one trips, I am quietly getting Opus 4.8 output while believing I am getting Fable-tier reasoning. The quality difference on a hard audit is exactly the thing I paid double for, and it vanished without an error.

The only way to know is to read response.model:

if (!response.model.startsWith("claude-fable-5")) {
  logger.warn(
    { requested: "claude-fable-5", served: response.model },
    "Fable 5 request fell back, likely a safeguard trip",
  );
}

Without that log, a fallback is invisible until I notice the analysis got worse and have no idea why.
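
A single warning is easy to lose in a busy log, so it can help to count fallbacks as well. The sketch below keeps an in-memory tally per requested model so a rising fallback rate can be alerted on. `FallbackTracker` and its methods are hypothetical, and the prefix check simply mirrors the `startsWith` test above rather than any documented matching rule.

```typescript
// Hypothetical in-memory tracker; a real setup would more likely export these as metrics.
class FallbackTracker {
  private total = new Map<string, number>();
  private fallbacks = new Map<string, number>();

  // Record one call; returns true when the served model does not match the request.
  record(requested: string, served: string): boolean {
    this.total.set(requested, (this.total.get(requested) ?? 0) + 1);
    const fellBack = !served.startsWith(requested);
    if (fellBack) {
      this.fallbacks.set(requested, (this.fallbacks.get(requested) ?? 0) + 1);
    }
    return fellBack;
  }

  // Fraction of calls for this requested model that were served by another model.
  rate(requested: string): number {
    const total = this.total.get(requested) ?? 0;
    if (total === 0) return 0;
    return (this.fallbacks.get(requested) ?? 0) / total;
  }
}
```

Calling `tracker.record("claude-fable-5", response.model)` after each request and checking `tracker.rate("claude-fable-5")` periodically is enough to notice when a prompt family starts tripping a fallback more often than it used to.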

Routing logic and config drift

The other source of the gap is my own code. I have routing that picks a model based on task type and a config that sets defaults. It is easy for those to drift: a config change points "deep audit" at the wrong model, or a routing bug sends classification to Opus when it should hit Haiku. Logging the served model surfaces this immediately.

logger.info(
  // response.usage is the full usage object, so log it under that name
  { task: taskType, requested: chosenModel, served: response.model, usage: response.usage },
  "llm call",
);

When my bill spiked one week, this log told me in thirty seconds that a routing change had sent a high-volume path to Opus 4.8 instead of Haiku 4.5. The fivefold price difference between the two accounted for the entire spike. Without the served-model log, I would have spent an afternoon guessing.
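
The thirty-second answer comes from grouping those log lines by task and served model. A minimal sketch of that aggregation, assuming the log lines have already been parsed into objects; the `LlmCallLog` shape and the function name are hypothetical and would need to match whatever your logger actually emits.

```typescript
// Hypothetical shape of one parsed "llm call" log line.
interface LlmCallLog {
  task: string;
  requested: string;
  served: string;
  inputTokens: number;
  outputTokens: number;
}

interface TaskModelTotal {
  task: string;
  served: string;
  calls: number;
  inputTokens: number;
  outputTokens: number;
}

// Sum calls and tokens per (task, served model), largest token volume first.
function totalsByTaskAndModel(logs: LlmCallLog[]): TaskModelTotal[] {
  const totals = new Map<string, TaskModelTotal>();
  for (const entry of logs) {
    // JSON-encode the pair so task or model names containing separators cannot collide.
    const key = JSON.stringify([entry.task, entry.served]);
    const current = totals.get(key) ?? {
      task: entry.task,
      served: entry.served,
      calls: 0,
      inputTokens: 0,
      outputTokens: 0,
    };
    current.calls += 1;
    current.inputTokens += entry.inputTokens;
    current.outputTokens += entry.outputTokens;
    totals.set(key, current);
  }
  return Array.from(totals.values()).sort(
    (a, b) => b.inputTokens + b.outputTokens - (a.inputTokens + a.outputTokens),
  );
}
```

A high-volume task showing up at the top of this list next to an expensive model is the routing bug in one glance.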

It is also your migration safety net

When a new model ships and I bump a model string, the served-model log is how I verify the change took effect everywhere. The migration guidance even recommends asserting on it:

const r = await client.messages.create({
  model: "claude-opus-4-8",
  max_tokens: 64,
  messages: [{ role: "user", content: "ping" }],
});
console.assert(r.model.startsWith("claude-opus-4-8"), `got ${r.model}`);

A stray hardcoded model string in some helper I forgot about shows up the moment the served model does not match what I expect.
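
One caveat: `console.assert` only prints a message when the condition fails; it does not throw, so a CI step will still pass. For a smoke test that has to fail loudly, a throwing variant is a small step. This is a sketch with hypothetical helper names, where each `PingResult` is assumed to come from sending a cheap "ping" request through one call site and recording `response.model`.

```typescript
interface PingResult {
  site: string;   // hypothetical label for the call site, e.g. a module name
  served: string; // response.model observed for a request made through that call site
}

// Throws instead of only printing, so a test runner or CI step actually fails.
function assertServedModel(expectedPrefix: string, served: string): void {
  if (!served.startsWith(expectedPrefix)) {
    throw new Error(`expected a model starting with "${expectedPrefix}", got "${served}"`);
  }
}

// Lists the call sites whose served model does not match the expected prefix.
function findStaleCallSites(expectedPrefix: string, results: PingResult[]): string[] {
  return results
    .filter((result) => !result.served.startsWith(expectedPrefix))
    .map((result) => result.site);
}
```

Running `findStaleCallSites` over every code path that talks to the API gives a list of the helpers that still carry an old model string after a migration.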

Log the usage while you are at it

Since you are already logging the model, log response.usage in the same line. It gives you input_tokens, output_tokens, and the cache fields. That single structured log line becomes your cost dashboard: which task type, which model, how many tokens, how much was served from cache. Three fields you will want eventually, captured for free now.

const { input_tokens, output_tokens, cache_read_input_tokens } = response.usage;
logger.info({ model: response.model, input_tokens, output_tokens, cache_read_input_tokens }, "llm");
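
Turning those fields into dashboard numbers takes a couple of small functions. The sketch below is hedged in two ways: it assumes `input_tokens` counts only uncached input, so the total input is the sum of the three input fields, and the price table is a hypothetical structure you would fill in from your own pricing page. No real prices appear here.

```typescript
// Usage fields as named in the text; the cache fields are treated as optional here.
interface Usage {
  input_tokens: number;
  output_tokens: number;
  cache_creation_input_tokens?: number | null;
  cache_read_input_tokens?: number | null;
}

// Hypothetical price entry, in dollars per million tokens.
interface ModelPrice {
  inputPerMTok: number;
  outputPerMTok: number;
}

// Share of all input tokens that were read from cache.
// Assumption: input_tokens excludes cached tokens, so the three input fields add up to the total.
function cacheReadShare(usage: Usage): number {
  const read = usage.cache_read_input_tokens ?? 0;
  const written = usage.cache_creation_input_tokens ?? 0;
  const totalInput = usage.input_tokens + written + read;
  return totalInput === 0 ? 0 : read / totalInput;
}

// Rough cost of uncached input plus output only; cache reads and writes are
// typically billed at different rates and are deliberately left out.
function roughCostUsd(usage: Usage, price: ModelPrice): number {
  return (usage.input_tokens * price.inputPerMTok + usage.output_tokens * price.outputPerMTok) / 1e6;
}
```

Looking up the `ModelPrice` by `response.model` rather than by the requested model keeps the estimate honest when a fallback or routing bug has changed which model is actually billing you.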

The habit

It costs one line and a structured logger. It catches silent model fallbacks, config drift, routing bugs, cost spikes, and incomplete migrations. The unifying theme is that what you asked for and what you got are different facts, and only one of them is in your code. The other is in the response. Read it, log it, and you will know things about your LLM usage that would otherwise only surface as a confusing bill or a quality regression with no explanation.

Top comments (2)

Alex Shev • Jun 22

Logging the actual response model is such a small habit, but it changes debugging completely. Without it, latency, quality, and cost regressions all get blamed on "the prompt" even when the real cause was routing or fallback behavior.

Mike Czerwinski • Jun 22

Logging response.model turns the smallest possible deterministic gate against autonomous-system drift into a habit that costs nothing — and the Fable 5 fallback example is the one that makes it sharp. The cases that crash are easy. The cases that quietly produce plausible output of the wrong quality are the ones that need this log most. Quality regression with no apparent cause is the signature of a silent substitution upstream, exactly where the harder failure mode lives.

The thing that turns it from "good practice" into structure is the assertion form: console.assert(r.model.startsWith("claude-opus-4-8")) doesn't depend on anyone remembering to read the log. That's the difference between "you'd notice eventually" and "the system tells you when reality drifted from intent." Same shape as audit logs in agent-security tooling, scaled down to a single field.

One extension worth naming: pair the served-model field with whatever spec the call was meant to satisfy. A served-model mismatch is one signal; a served-model mismatch on a call that was supposed to enforce a particular reasoning depth is the actually load-bearing alert. The 30-second debug story is real, but the same log structure becomes a drift detector against your own routing config over months, not minutes.