Sergey Inozemtsev

Python LLM: reasoning is disabled by default in llm-api-adapter

Reasoning improves LLM output quality, but it is expensive and often unnecessary. Worse: most providers enable it implicitly or hide it behind non-obvious parameters.

Result: developers pay for reasoning even when they don’t need it.


The Problem

Reasoning is handled inconsistently across providers:

  • OpenAI: often enabled implicitly.
  • Gemini: controlled via thinkingConfig.
  • Anthropic (Claude): may enforce minimum reasoning tokens (e.g. 1024).
  • Nano/Mini models: sometimes impossible to disable reasoning entirely.

This leads to:

  • hidden costs
  • provider-specific conditionals
  • easy-to-miss misconfiguration

The Approach: Off by Default

As of llm-api-adapter v0.2.3, reasoning is disabled by default.

If it is not explicitly enabled:

  • no reasoning tokens are used
  • no extra cost is incurred
  • existing code keeps working

Costly features should be opt-in, not opt-out.


Enabling Reasoning Explicitly

When reasoning is actually required, it can be enabled via a single unified parameter:

  • String levels: "none" | "low" | "medium" | "high"
  • Numeric values: 256, 512, 1024, 2048, etc.

The adapter:

  • maps the value to provider-specific fields
  • applies correct formats per API
  • respects provider minimums
  • prevents invalid configurations
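The normalization the adapter performs can be sketched roughly as follows. This is a hypothetical reconstruction, not the library's actual code: the level-to-budget table and the clamping behavior are assumptions chosen to match the bullet points above.

```python
# Hypothetical sketch: map a unified reasoning level to a token budget.
# The real adapter's tables and rules may differ.
LEVEL_BUDGETS = {"none": 0, "low": 512, "medium": 1024, "high": 2048}

def normalize_reasoning(level, provider_minimum: int = 0) -> int:
    """Accept "none"/"low"/"medium"/"high" or a raw token count and
    return a budget clamped to the provider's floor when reasoning is on."""
    if isinstance(level, str):
        budget = LEVEL_BUDGETS[level]  # raises KeyError on invalid levels
    else:
        budget = int(level)
    if 0 < budget < provider_minimum:
        budget = provider_minimum  # respect minimums like Claude's 1024
    return budget
```

With a scheme like this, `reasoning_level="low"` against a provider with a 1024-token floor would be raised to that floor rather than producing an invalid request.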

Installation

pip install llm-api-adapter

Example

from llm_api_adapter.universal_adapter import UniversalLLMAPIAdapter

messages = [
    {"role": "user", "content": "Explain quantum computing simply."}
]

# Pick a provider (same interface)
adapter = UniversalLLMAPIAdapter(
    organization="openai",
    model="gpt-5.1",
    api_key=openai_api_key,
)

# or
# adapter = UniversalLLMAPIAdapter(
#     organization="google",
#     model="gemini-2.5-pro",
#     api_key=google_api_key,
# )

# or
# adapter = UniversalLLMAPIAdapter(
#     organization="anthropic",
#     model="claude-sonnet-4-5",
#     api_key=anthropic_api_key,
# )

response = adapter.chat(
    messages=messages,
    reasoning_level="low",  # off by default, enabled explicitly
)

print(response.content)

Why This Matters

  • Lower and predictable costs
  • No accidental reasoning usage
  • Cleaner application code
  • Unified control across OpenAI, Claude, and Gemini

Repository

Source code and documentation: https://github.com/Inozem/llm-api-adapter
