DEV Community

Sergey Inozemtsev

Stop paying for unnecessary reasoning

llm_api_adapter is a lightweight Python library that gives developers back control over LLM reasoning and cuts the cost of reasoning tokens they never asked for.

Reasoning in LLMs is powerful, but it is expensive, often unnecessary, and implemented differently across providers.

Here is how major APIs handle reasoning today:

  • OpenAI enables reasoning by default
  • Gemini hides it inside thinkingConfig
  • Claude requires a budget of at least 1024 reasoning tokens whenever reasoning is enabled
  • gpt-5-nano / gpt-5-mini cannot disable reasoning at all, only reduce it to the minimum

As a result, developers often pay for reasoning even when they don’t need it, simply because the provider API turns it on implicitly or makes controlling it non-obvious.

v0.2.3 — Reasoning is now off by default

In llm_api_adapter v0.2.3, reasoning is disabled by default. If you do not need it, nothing changes in your code and you avoid extra costs. If you do need it, you can enable it with the reasoning_level parameter:

String values: "none", "low", "medium", "high"

Numeric values: 256, 512, 1024, 2048, etc.
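
To make the two accepted forms concrete, here is a minimal sketch of how a value like this could be normalized into a single token budget. It is not the library's actual code: the function name normalize_reasoning_level and the budgets assigned to "low", "medium", and "high" are hypothetical, chosen only for illustration.

```python
from typing import Union

# Hypothetical token budgets for the named levels.
# The library's real mapping may use different numbers.
LEVEL_BUDGETS = {"none": 0, "low": 512, "medium": 2048, "high": 8192}


def normalize_reasoning_level(level: Union[str, int, None]) -> int:
    """Return a reasoning-token budget, where 0 means reasoning is off."""
    if level is None:
        return 0  # mirrors the off-by-default behaviour described above
    if isinstance(level, bool):
        # bool is a subclass of int, so reject it explicitly
        raise TypeError("reasoning_level must be a string or an integer")
    if isinstance(level, int):
        if level < 0:
            raise ValueError("reasoning_level must not be negative")
        return level
    if isinstance(level, str):
        key = level.strip().lower()
        if key not in LEVEL_BUDGETS:
            raise ValueError(f"unknown reasoning_level: {level!r}")
        return LEVEL_BUDGETS[key]
    raise TypeError("reasoning_level must be a string or an integer")
```

Rejecting unknown strings up front is one way to get the "prevents misconfiguration errors" behaviour without waiting for a provider to return an error.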

The adapter handles everything else:

  • maps to provider‑specific fields (reasoning_tokens, thinkingConfig, reasoning)
  • applies correct formats for each API
  • lowers reasoning to the minimum allowed for models that cannot disable it
  • prevents misconfiguration errors
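
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the adapter's implementation: build_reasoning_params, the provider names, the effort thresholds, and the exact payload shapes are my own reading of the public provider APIs and should be checked against their current documentation. Only the 1024-token Claude minimum and the gpt-5-nano / gpt-5-mini restriction come from the article.

```python
from typing import Any, Dict

CLAUDE_MIN_BUDGET = 1024  # minimum cited in the article
CANNOT_DISABLE = {"gpt-5-nano", "gpt-5-mini"}  # models named in the article


def effort_from_budget(budget: int) -> str:
    """Translate a token budget into an effort label (hypothetical thresholds)."""
    if budget <= 512:
        return "low"
    if budget <= 2048:
        return "medium"
    return "high"


def build_reasoning_params(provider: str, model: str, budget: int) -> Dict[str, Any]:
    """Return the provider-specific request fragment for a reasoning budget."""
    if provider == "openai":
        if budget <= 0:
            # Assumption: "minimal" is the lowest effort on models that
            # cannot switch reasoning off, "none" disables it elsewhere.
            effort = "minimal" if model in CANNOT_DISABLE else "none"
        else:
            effort = effort_from_budget(budget)
        return {"reasoning": {"effort": effort}}
    if provider == "anthropic":
        if budget <= 0:
            return {}  # omit the thinking block entirely
        # Raise small budgets to the minimum instead of failing the request.
        return {"thinking": {"type": "enabled",
                             "budget_tokens": max(budget, CLAUDE_MIN_BUDGET)}}
    if provider == "google":
        return {"thinkingConfig": {"thinkingBudget": budget}}
    raise ValueError(f"unsupported provider: {provider!r}")
```

Keeping the clamping in one function means calling code can pass the same budget to every provider and never needs a provider-specific branch.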

Why this matters

  • Control by default: reasoning is used only when requested, or held to the provider's minimum on models that cannot disable it
  • Lower cost: fewer reasoning tokens mean cheaper responses
  • Unified interface across GPT, Claude, and Gemini
  • Cleaner code without provider-specific branches
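
The cost point is simple arithmetic. The sketch below assumes reasoning tokens are billed at the same rate as visible output tokens, which should be verified for each provider. The price of 10 per million output tokens and the token counts are invented for illustration, not real pricing.

```python
def response_cost(visible_tokens: int,
                  reasoning_tokens: int,
                  price_per_million_output: float) -> float:
    """Cost of one response when reasoning tokens are billed as output tokens."""
    billed_tokens = visible_tokens + reasoning_tokens
    return billed_tokens * price_per_million_output / 1_000_000


# Hypothetical numbers: a 200-token answer at a price of 10 per million tokens.
with_reasoning = response_cost(200, 1024, 10.0)
without_reasoning = response_cost(200, 0, 10.0)
print(f"with reasoning:    {with_reasoning:.5f}")
print(f"without reasoning: {without_reasoning:.5f}")
```

For short answers the reasoning budget can dominate the bill, which is why leaving it off by default matters most for high-volume, simple requests.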

Installation

pip install llm-api-adapter

Example Usage

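import os

from llm_api_adapter.universal_adapter import UniversalLLMAPIAdapter

# Assumption: the API key is stored in the OPENAI_API_KEY environment variable.
openai_api_key = os.environ["OPENAI_API_KEY"]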

gpt = UniversalLLMAPIAdapter(
    organization="openai",
    model="gpt-5.1",
    api_key=openai_api_key,
)

response = gpt.chat(
    messages=[
        {"role": "user", "content": "Explain quantum computing very simply."}
    ],
    reasoning_level="low",   # enable reasoning only when you need it
)

print(response.content)

Repository

https://github.com/Inozem/llm-api-adapter
