Robert Imbeault
I Think Therefore I Am… A Big Pain in the A$$

If you’ve tried to build anything serious on top of LLMs recently, you’ve probably run into this:

“Thinking” is supposed to make models better.
In practice, it makes your infrastructure worse.

Let’s break down where it actually hurts.


The illusion of “just turn on reasoning”

At a high level, you’d expect something simple:

  • Turn reasoning on → better answers
  • Turn reasoning off → cheaper, faster

Reality is messier.

  • Models sometimes don’t think when you ask them to
  • Models sometimes overthink trivial prompts, burning tokens for no gain
  • There’s no consistent behavior across providers

So now you’re not just building a product.
You’re debugging model psychology.


The fragmentation problem nobody talks about

Every provider decided to implement “thinking” differently.

  • OpenAI → effort levels (low, medium, high)
  • Anthropic → token budgets (explicit caps)
  • Google → both… depending on the model version

That’s just inputs.
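To make the divergence concrete, here is a sketch of the same "think harder" intent expressed three ways. The payload shapes roughly follow each provider's documented parameters, but field names vary across API and SDK versions, so treat these as illustrative:

```python
# Illustrative sketch: one intent, three request shapes.
# Field names approximate each provider's documented parameters
# and may differ across API/SDK versions.

def openai_style(prompt: str) -> dict:
    # OpenAI: a categorical effort level.
    return {"input": prompt, "reasoning": {"effort": "high"}}

def anthropic_style(prompt: str) -> dict:
    # Anthropic: an explicit token budget for thinking.
    return {
        "messages": [{"role": "user", "content": prompt}],
        "thinking": {"type": "enabled", "budget_tokens": 8192},
    }

def google_style(prompt: str) -> dict:
    # Google: depending on model version, a budget and/or a level-style knob.
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generation_config": {"thinking_config": {"thinking_budget": 8192}},
    }
```

Three functions, three shapes, zero overlap. That's the input side alone.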

Outputs are worse:

  • Some return dedicated thinking blocks
  • Others return reasoning summaries
  • Some mix reasoning into standard content structures

There is no standard. No shared schema. No predictable behavior.

So if you’re routing across models, you now need:

  • Input normalization
  • Output parsing per provider
  • Logic to reconcile different reasoning formats
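A minimal version of that reconciliation logic might look like the sketch below. The three raw response shapes are hypothetical stand-ins for the patterns described above (dedicated thinking blocks, a summary field, and reasoning mixed into content parts), not any vendor's exact schema:

```python
# Sketch of an output normalizer. The three raw shapes handled here are
# hypothetical stand-ins for the per-provider formats: a dedicated
# thinking block, a separate reasoning summary, or reasoning interleaved
# with ordinary content parts.

def normalize(shape: str, raw: dict) -> dict:
    """Collapse provider-specific responses into {reasoning, answer}."""
    if shape == "dedicated_block":
        # Reasoning arrives as its own typed block alongside the text.
        blocks = raw["content"]
        reasoning = "".join(b["thinking"] for b in blocks if b["type"] == "thinking")
        answer = "".join(b["text"] for b in blocks if b["type"] == "text")
    elif shape == "summary_field":
        # Reasoning arrives as a separate summary field.
        reasoning = raw.get("reasoning_summary", "")
        answer = raw["output_text"]
    else:
        # Reasoning is interleaved with content; split on a per-part flag.
        parts = raw["parts"]
        reasoning = "".join(p["text"] for p in parts if p.get("thought"))
        answer = "".join(p["text"] for p in parts if not p.get("thought"))
    return {"reasoning": reasoning, "answer": answer}
```

One function per shape today, a new branch every time a provider ships a new format tomorrow. That's the maintenance tax.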

This is where “simple API routing” stops being simple.


Billing is inconsistent too

Even cost modeling breaks.

  • Some providers expose reasoning tokens explicitly
  • Some hide it inside total usage
  • Some introduce provider-specific fields (looking at you, xAI)

Now you’re not just optimizing performance.
You’re building a cost translation layer.
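A cost translation layer ends up doing something like the sketch below. The usage dicts it accepts are hypothetical examples of the three patterns (explicit field, nested detail, hidden inside totals); the field names are assumptions, not any provider's guaranteed schema:

```python
# Sketch of extracting reasoning-token counts from inconsistent usage
# reports. The field names handled here are illustrative assumptions
# covering the three patterns: explicit fields, nested detail objects,
# and counts hidden inside totals.

def reasoning_tokens(usage: dict) -> int:
    # Pattern 1: an explicit top-level field (possibly under an alias).
    for key in ("reasoning_tokens", "thinking_tokens"):
        if key in usage:
            return usage[key]
    # Pattern 2: nested inside an output-token detail object.
    details = usage.get("output_tokens_details", {})
    if "reasoning_tokens" in details:
        return details["reasoning_tokens"]
    # Pattern 3: hidden in totals; infer it when the parts are reported.
    if {"total_tokens", "input_tokens", "visible_output_tokens"} <= usage.keys():
        return usage["total_tokens"] - usage["input_tokens"] - usage["visible_output_tokens"]
    return 0  # Unknown format: report none rather than guess.
```

Note the final fallback: when the format is unrecognized, the honest answer is "unknown", which is exactly why cost predictability erodes as you add providers.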


Model switching makes everything worse

Switching models mid-thread sounds great… until you try it.

Even within the same provider:

  • Different endpoints behave differently (yes, even inside OpenAI)
  • Input formats change
  • Output structures change
  • Reasoning formats change

Now add state:

  • What context do you carry over?
  • How do you preserve reasoning continuity?
  • How do you avoid exploding token usage?

This is where most teams either:

  1. Give up on portability, or
  2. Build a fragile pile of adapters that break every few weeks

What we realized building Backboard

The real problem isn’t reasoning.

It’s lack of abstraction.

Developers shouldn’t have to:

  • Learn 5 different “thinking” systems
  • Normalize 5 different response formats
  • Track 5 different billing models
  • Rebuild state every time they switch models

So we made a call:

Unify it.


What “unified thinking” actually means

Instead of exposing provider quirks, we abstract them into one model:

  • A single thinking parameter
  • Direct control over reasoning budget
  • Consistent behavior across models
  • Normalized input and output structures
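As a sketch of what that kind of abstraction looks like, here is one way to map a single thinking knob onto provider-specific settings. The level names and budget values are illustrative assumptions, not Backboard's actual API or any vendor's defaults:

```python
# Sketch of one unified thinking knob mapped to provider-specific
# parameters. The levels and budget values are illustrative assumptions,
# not any vendor's actual defaults.

EFFORT_TO_BUDGET = {"off": 0, "low": 1024, "medium": 4096, "high": 16384}

def to_provider_params(provider_style: str, thinking: str) -> dict:
    budget = EFFORT_TO_BUDGET[thinking]
    if provider_style == "effort_based":
        # Effort-level providers take the category directly.
        return {} if thinking == "off" else {"reasoning": {"effort": thinking}}
    # Budget-based providers get a concrete token cap instead.
    if budget == 0:
        return {"thinking": {"type": "disabled"}}
    return {"thinking": {"type": "enabled", "budget_tokens": budget}}
```

The caller sets `thinking="medium"` once; the adapter decides whether that becomes an effort level or a token budget. That single indirection is the whole trick.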

So you can:

  • Tune reasoning without caring about provider differences
  • Switch models without rewriting logic
  • Keep state intact across everything

And most importantly:

Stop thinking about thinking.


The uncomfortable truth

If you’re building on multiple LLMs and you haven’t hit these issues yet, you will.

The complexity is not obvious at the start.
It compounds as soon as you:

  • Add a second provider
  • Introduce reasoning
  • Try to optimize cost
  • Or maintain state across sessions

At that point, you’re not building your product anymore.
You’re building infrastructure.


Where this goes long term

Short term:

  • Abstractions like this save teams weeks or months of engineering time
  • They reduce cost volatility and debugging overhead

Long term:

  • The winning platforms won’t be the ones with the best models
  • They’ll be the ones that make models interchangeable and stateful

That’s the real unlock.


If you’re currently stitching together multiple providers, do a quick audit:

  • How many reasoning formats are you handling?
  • How portable is your state layer?
  • How confident are you in your cost predictability?

If the answer isn’t clean, you’re already paying the tax.

This is what we're working on at Backboard.io :)
