DEV Community

Ben Halpern

What do you think about building when AI models get cheaper?

With Gemini 3.1 Flash-Lite launching today, my mind goes to things I wouldn't have thought to build because of the expense.

However, when the most inexpensive models get better and cheaper, it tends to unlock ways of thinking about features we wouldn't have explored before cheap AI-driven tools were possible.

This is kind of abstract but is there a way you think about this sort of thing?

Top comments (24)

Nikola Brežnjak

What @heckno said, and then expand it to any (all!?) hardware devices that I own.

The bigger question is: will SaaS eventually (soon?) go away, with everyone (almost everyone?) building the tools they need, specifically tailored to them, in-house?

Ben Halpern

Oh yeah good call

Pascal CESCATO

I don’t think SaaS disappears.

Cheaper AI makes building more accessible, yes. But building something tailored in-house still requires architecture, long-term maintenance, security, evolution, and governance. AI can reduce implementation effort, but it doesn’t remove structural complexity.

In the short term, more teams might build internal tools. In the long term, the real constraint won’t be model cost — it will be software engineering capacity and organizational maturity.

SaaS solves that by externalizing complexity. That doesn’t go away just because tokens are cheaper.

Nikola Brežnjak

I agree that it won't be 100%. No way, of course. But what % do you think it actually will be? I'm thinking (boldly, even) that the split may actually be as big as 50/50.

Pascal CESCATO

I think there might be quite a few disappointments.

AI makes it easier to build something quickly — even to design something that looks convincing. But without solid structure, that often doesn’t hold up over time.

Building is one thing. Maintaining, evolving, securing, scaling, documenting, and integrating into real workflows is another. AI reduces friction at the start, but it doesn’t remove long-term responsibility.

So I’m not sure the split will be 50/50. I suspect many teams will try building in-house, then rediscover why SaaS exists in the first place.

heckno

I'm curious about stuff that can plug into my smart watch data, where I can pipe in basic questions about the trends and get answers. It's not complicated stuff, but I want to be able to do it in high volume.
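Not anyone's actual tooling, but a minimal Python sketch of that idea: summarize a watch's step-count export locally, then turn the trend into a question you could pipe to any model. The data shape and prompt wording here are hypothetical.

```python
from statistics import mean

def trend_prompt(daily_steps: list[int]) -> str:
    """Summarize a step-count series and build a question for an LLM.

    `daily_steps` is a hypothetical export from a watch app; the
    prompt text is only an illustration.
    """
    half = len(daily_steps) // 2
    earlier, later = mean(daily_steps[:half]), mean(daily_steps[half:])
    direction = "up" if later > earlier else "down"
    return (
        f"My average daily steps moved {direction} from {earlier:.0f} "
        f"to {later:.0f}. What simple habit change would you suggest?"
    )

print(trend_prompt([4000, 4200, 3900, 5200, 5600, 5800]))
```

At high volume, the cheap part is exactly this local summarization; only the final question needs a model call.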

Ben Sinclair

I don't want to build anything when a hobby that was free is now a subscription. I'll probably use local models when we get better tooling, but the way the world is going, that might not happen. The end game appears to be "your entire computer is a subscription", i.e. thin clients coming back round with everything offloaded to data centres, at which point I'm just not interested any more. I'll get a new interest. Maybe I'll learn to love kayaking.

Pascal CESCATO

Lower costs definitely broaden the perspective. But it’s also worth asking whether there’s a real need in the first place — or whether we’re creating one just to take advantage of cheaper AI.

FrancisTRᴅᴇᴠ (っ◔◡◔)っ

There are many factors going into this:

  1. Is it getting cheaper because they have more efficient data centers? For my grad program, we had to research how data centers are currently inefficient and how researchers are finding solutions to that problem.
  2. Is it getting cheaper because there is "so much competition," to the point where we can use open-source AI? For example, there is Ollama, which you can simply download and use locally without an API key to the cloud.
  3. Is it getting cheaper because the AI model is using its tokens more efficiently?
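As a small illustration of point 2, here's a sketch of a one-shot call to a locally running Ollama server. It assumes Ollama is installed and serving on its default port (11434) and that you've already pulled the model you name; no API key involved.

```python
import json
from urllib import request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint

def build_payload(prompt: str, model: str = "llama3") -> bytes:
    # stream=False asks Ollama for one JSON object instead of a chunk stream
    return json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()

def ask_local_model(prompt: str, model: str = "llama3") -> str:
    """Send a one-shot prompt to a local Ollama server and return its reply."""
    req = request.Request(
        OLLAMA_URL,
        data=build_payload(prompt, model),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Marginal cost per call is just your own electricity, which changes the "is this feature too expensive?" calculation entirely.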

It's something I keep in mind as new models are published and become accessible for us to use. Of course, the main thing about AI is the environmental factor, because you have to build a lot of data centers to compute your model. I saw a video explaining that OpenAI simply "made it bigger" and the LLM just became smarter. That's pretty much the whole reason there are so many data centers, and so much RAM needed, just to use AI. I might be wrong, but let me know!

Thanks for sharing @ben!

Daniel Nwaneri

Cheaper inference shifts the constraint rather than removing it. Features where I was calling the model once, because five calls was expensive, I now call five times: parallel validation passes, redundant retrieval, cross-checking outputs against each other. The architecture changes when the per-call cost stops being the binding constraint.
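A rough sketch of what that cross-checking can look like: ask the same question several times and keep the majority answer. `ask` here is a hypothetical stand-in for any model call, not a specific provider's API.

```python
from collections import Counter
from typing import Callable

def majority_answer(ask: Callable[[str], str], prompt: str, n: int = 5) -> str:
    """Ask the same question n times and keep the most common answer.

    With cheap inference, the extra calls buy a rough consistency check
    that would have been too expensive per feature before.
    """
    votes = Counter(ask(prompt).strip().lower() for _ in range(n))
    answer, _ = votes.most_common(1)[0]
    return answer

# Demo with a flaky stand-in model that is right 3 times out of 5.
replies = iter(["Paris", "paris", "Lyon", "Paris ", "Marseille"])
print(majority_answer(lambda p: next(replies), "Capital of France?"))  # → paris
```

When a call costs a fraction of a cent, five-way voting is often cheaper than one wrong answer.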

What doesn't change: whether the use case produces real value. The most interesting builds with cheap inference aren't the ones that were blocked by cost; they're the ones that require volume to work at all. Anomaly detection across thousands of data points. Real-time consistency checking on long documents. Evaluation pipelines that run on every output rather than sampling.

The unlock isn't just cheaper features; it's features that only make sense at volume. Those were impossible before, not because any single call was too expensive but because the use case required a scale that cost made unviable.

Amara Graham

I had a slightly different take on this recently, and it was in the corporate setting: do people actually know about, or pay attention to, tokens or other expense calculations in their AI use? Is it obfuscated?

I've noticed the people who consider expenses with models tend to be the same people who tinker and likely used a personal card at some point. Others openly mention they work for insert big company here and have essentially unlimited spending on certain tools and models.

Sophia Devy

This is an interesting question because cheaper AI models don’t just reduce costs, they expand the space of what becomes practical to build.
When inference becomes inexpensive, ideas that once felt too costly or experimental suddenly become viable features, workflows, or entire products. However, while lower costs unlock creativity and experimentation, the real differentiator will still be product thinking, integration into real user problems, and long-term maintainability.

Cheaper models may democratize the ability to build, but turning those capabilities into reliable, useful systems will continue to require strong engineering and thoughtful design.

Dean Reed

I think cheaper models don’t just reduce cost; they change product design.

When inference is expensive, AI gets used for “special moments” in a product. When it’s cheap, it becomes part of the default interaction layer.

That shift unlocks ideas that were previously impractical: background reasoning, continuous agents, auto-generated UI, real-time personalization, etc.

So the intriguing question isn’t just what AI can do, but what becomes possible when AI is cheap enough to run everywhere in the product.

Robert Cizmas

Hi Ben,

We are excited to see genAI assistants evolve and get cheaper. We are offering our extension, Etiq AI, which lets MLEs and data scientists visualise and improve their code, in exchange for some feedback. It's compatible with all code AI assistants in Jupyter or VS Code. Let me know if you're interested.
