Sergey Inozemtsev

Python LLM: reasoning is disabled by default in llm-api-adapter

Reasoning improves LLM output quality, but it is expensive and often unnecessary. Worse: most providers enable it implicitly or hide it behind non-obvious parameters.

Result: developers pay for reasoning even when they don’t need it.


The Problem

Reasoning is handled inconsistently across providers:

  • OpenAI: often enabled implicitly.
  • Gemini: controlled via thinkingConfig.
  • Anthropic (Claude): may enforce minimum reasoning tokens (e.g. 1024).
  • Nano/Mini models: sometimes impossible to disable reasoning entirely.

This leads to:

  • hidden costs
  • provider-specific conditionals
  • easy-to-miss misconfiguration

The Approach: Off by Default

As of llm-api-adapter v0.2.3, reasoning is disabled by default.

If it is not explicitly enabled:

  • no reasoning tokens are used
  • no extra cost is incurred
  • existing code keeps working

Costly features should be opt-in, not opt-out.


Enabling Reasoning Explicitly

When reasoning is actually required, it can be enabled via a single unified parameter:

  • String levels: "none" | "low" | "medium" | "high"
  • Numeric values: 256, 512, 1024, 2048, etc.

The adapter:

  • maps the value to provider-specific fields
  • applies correct formats per API
  • respects provider minimums
  • prevents invalid configurations
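The normalization the adapter performs can be sketched roughly as follows. This is a hypothetical reconstruction, not the library's actual code: the level-to-budget table and the clamping behavior are assumptions chosen to match the bullet points above.

```python
# Hypothetical sketch: map a unified reasoning level to a token budget.
# The real adapter's tables and rules may differ.
LEVEL_BUDGETS = {"none": 0, "low": 512, "medium": 1024, "high": 2048}

def normalize_reasoning(level, provider_minimum: int = 0) -> int:
    """Accept "none"/"low"/"medium"/"high" or a raw token count and
    return a budget clamped to the provider's floor when reasoning is on."""
    if isinstance(level, str):
        budget = LEVEL_BUDGETS[level]  # raises KeyError on invalid levels
    else:
        budget = int(level)
    if 0 < budget < provider_minimum:
        budget = provider_minimum  # respect minimums like Claude's 1024
    return budget
```

With a scheme like this, `reasoning_level="low"` against a provider with a 1024-token floor would be raised to that floor rather than producing an invalid request.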

Installation

pip install llm-api-adapter

Example

from llm_api_adapter.universal_adapter import UniversalLLMAPIAdapter

messages = [
    {"role": "user", "content": "Explain quantum computing simply."}
]

# Pick a provider (same interface)
adapter = UniversalLLMAPIAdapter(
    organization="openai",
    model="gpt-5.1",
    api_key=openai_api_key,
)

# or
# adapter = UniversalLLMAPIAdapter(
#     organization="google",
#     model="gemini-2.5-pro",
#     api_key=google_api_key,
# )

# or
# adapter = UniversalLLMAPIAdapter(
#     organization="anthropic",
#     model="claude-sonnet-4-5",
#     api_key=anthropic_api_key,
# )

response = adapter.chat(
    messages=messages,
    reasoning_level="low",  # off by default, enabled explicitly
)

print(response.content)

Why This Matters

  • Lower and predictable costs
  • No accidental reasoning usage
  • Cleaner application code
  • Unified control across OpenAI, Claude, and Gemini

Repository

Source code and documentation: https://github.com/Inozem/llm-api-adapter
