Stop paying for unnecessary reasoning

#api #llm #tooling #python

llm_api_adapter is a lightweight Python library that gives developers back control over LLM reasoning and cuts the cost of reasoning tokens they never asked for.

Reasoning in LLMs is powerful, but it is expensive, often unnecessary, and implemented differently across providers.

Here is how major APIs handle reasoning today:

OpenAI enables reasoning by default
Gemini hides it inside thinkingConfig
Claude requires a minimum of 1024 reasoning tokens
gpt-5-nano / gpt-5-mini cannot disable reasoning at all, only reduce it to the minimum

As a result, developers often pay for reasoning even when they don’t need it, simply because the provider API turns it on implicitly or makes controlling it non-obvious.

v0.2.3 — Reasoning is now off by default

In llm_api_adapter v0.2.3, reasoning is disabled by default. If you do not need it, nothing changes in your code and you avoid extra costs. If you do need it, you can enable it with the reasoning_level parameter:

String values: "none", "low", "medium", "high"

Numeric values: 256, 512, 1024, 2048, etc.

The adapter handles everything else:

maps to provider-specific fields (reasoning_tokens, thinkingConfig, reasoning)
applies correct formats for each API
lowers reasoning to the minimum allowed for models that cannot disable it
prevents misconfiguration errors
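
To make the steps above concrete, here is a minimal sketch of what such a mapping could look like. It is not the library's actual implementation: the function names, the token budgets behind "low" / "medium" / "high", and the exact payload shapes are assumptions, modeled loosely on the providers' public request fields.

```python
from typing import Dict, Tuple, Union

# Hypothetical token budgets for the named levels (illustrative values only).
LEVEL_BUDGETS = {"low": 1024, "medium": 4096, "high": 16384}
CLAUDE_MIN_BUDGET = 1024  # minimum reasoning budget mentioned earlier in the post


def normalize(level: Union[str, int]) -> Tuple[str, int]:
    """Turn a string or numeric reasoning_level into (effort, token_budget)."""
    if isinstance(level, str):
        if level == "none":
            return "none", 0
        if level not in LEVEL_BUDGETS:
            raise ValueError(f"unknown reasoning level: {level!r}")
        return level, LEVEL_BUDGETS[level]
    if isinstance(level, bool) or not isinstance(level, int) or level < 0:
        raise ValueError(f"invalid reasoning level: {level!r}")
    if level == 0:
        return "none", 0
    # Numeric input: pick the smallest named level whose budget covers it.
    for name, budget in LEVEL_BUDGETS.items():
        if level <= budget:
            return name, level
    return "high", level


def build_reasoning_params(provider: str, level: Union[str, int] = "none",
                           can_disable: bool = True) -> Dict[str, dict]:
    """Build a provider-shaped reasoning fragment (assumed shapes, see above)."""
    effort, budget = normalize(level)
    if provider == "openai":
        if effort == "none" and not can_disable:
            effort = "minimal"  # lowest setting for models that cannot turn it off
        return {"reasoning": {"effort": effort}}
    if provider == "anthropic":
        if effort == "none":
            return {}  # leave the thinking block out entirely
        # Raise small budgets to the provider minimum instead of failing.
        return {"thinking": {"type": "enabled",
                             "budget_tokens": max(budget, CLAUDE_MIN_BUDGET)}}
    if provider == "google":
        return {"thinkingConfig": {"thinkingBudget": budget}}
    raise ValueError(f"unsupported provider: {provider!r}")
```

With this sketch, build_reasoning_params("anthropic", 256) raises the budget to 1024 rather than sending an invalid request, and build_reasoning_params("openai", "none", can_disable=False) falls back to the lowest effort. The default of "none" mirrors the off-by-default behaviour described above.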

Why this matters

Control by default — reasoning is never used unless requested
Lower cost — fewer reasoning tokens means cheaper responses
Unified interface across GPT, Claude, and Gemini
Cleaner code without provider-specific branches
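
A rough way to see the cost effect, assuming reasoning tokens are billed at the same rate as output tokens. The price and the token counts below are hypothetical placeholders, not real provider pricing or measurements.

```python
def response_cost(visible_tokens: int, reasoning_tokens: int,
                  price_per_million: float) -> float:
    """Cost in dollars when reasoning tokens are billed like output tokens."""
    return (visible_tokens + reasoning_tokens) * price_per_million / 1_000_000


PRICE = 10.0  # hypothetical dollars per million output tokens

# Same 300-token answer, with and without a 1024-token reasoning budget.
with_reasoning = response_cost(300, 1024, PRICE)
without_reasoning = response_cost(300, 0, PRICE)
print(f"with: ${with_reasoning:.5f}  without: ${without_reasoning:.5f}")
```

For short answers the reasoning budget can dominate the bill, which is why keeping it off unless requested matters most for high-volume, simple requests.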

Installation

pip install llm-api-adapter

Example Usage

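import os

from llm_api_adapter.universal_adapter import UniversalLLMAPIAdapter

# Read the key from the environment (the variable name is an assumption; use your own)
openai_api_key = os.environ["OPENAI_API_KEY"]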

gpt = UniversalLLMAPIAdapter(
    organization="openai",
    model="gpt-5.1",
    api_key=openai_api_key,
)

response = gpt.chat(
    messages=[
        {"role": "user", "content": "Explain quantum computing very simply."}
    ],
    reasoning_level="low",   # enable reasoning only when you need it
)

print(response.content)