I was leading a project running a bunch of AI jobs.
The models weren't huge, but our compute bill kept growing.
Turns out the problem wasn't the models — it was how we were running them.
The real issue
Every job came with decisions like:
- A100 or 4090?
- Will this fit in VRAM?
- Which provider is available right now?
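Even just the VRAM question hides arithmetic that's easy to skip. A rough back-of-the-envelope check (weights only, fp16 assumed; the numbers are illustrative):

```python
# Back-of-the-envelope VRAM check: weights only, fp16/bf16 = 2 bytes per parameter.
# Ignores activations, KV cache, and framework overhead, which add several GB more.
params = 7e9           # a 7B-parameter model
bytes_per_param = 2    # fp16 / bf16
weights_gb = params * bytes_per_param / 1e9
print(f"~{weights_gb:.0f} GB of weights")  # ~14 GB: fits a 24 GB 4090, but it gets tight once overhead is added
```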
And every wrong decision had consequences:
- overpaying for hardware
- OOM crashes
- retrying jobs across providers
- time wasted debugging infra
We weren't building AI.
We were managing GPUs.
The shift
At some point I stopped trying to optimize setups and asked:
Why are we choosing GPUs at all?
Why does every dev need to think about hardware, providers, capacity, and pricing just to run a job?
What I built instead
I built Jungle Grid — a simple way to run AI workloads without dealing with GPUs.
Instead of picking hardware, you just describe the workload.
Inference example:

```bash
jungle submit --workload inference --model-size 7
```

Batch example:

```bash
jungle submit --workload batch --image python:3.11 --command python script.py
```
That's it.
- No GPU selection
- No provider guessing
- No infra setup
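Because it's just a CLI, scripting a batch of submissions is a few lines. A minimal sketch in Python, assuming the `jungle` binary is on your PATH and using only the flags shown above (what the CLI prints on success is an assumption):

```python
import subprocess

# Model sizes (billions of parameters) for a batch of inference jobs.
model_sizes = [7, 13, 7]

for size in model_sizes:
    # Same command as the inference example above; hardware choice is left to Jungle Grid.
    result = subprocess.run(
        ["jungle", "submit", "--workload", "inference", "--model-size", str(size)],
        capture_output=True,
        text=True,
        check=True,  # raise if the CLI exits non-zero
    )
    print(result.stdout.strip())  # assuming the CLI prints a job id / confirmation
```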
What happens under the hood
- Workload classification
- GPU selection across providers
- Routing based on cost / latency / reliability
- Automatic retries + failover
- Lifecycle tracking
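To make the routing and failover steps concrete, here's an illustrative sketch of the general idea; it's not Jungle Grid's actual code, and the provider data and weights are made up. Candidate GPUs are scored on cost, latency, and reliability, and the job falls over to the next candidate when one fails:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class GpuOffer:
    provider: str
    gpu: str
    vram_gb: int
    usd_per_hour: float
    est_latency_s: float   # expected queue / startup latency
    reliability: float     # historical success rate, 0..1

def score(offer: GpuOffer) -> float:
    # Lower is better: weigh cost and latency, penalize flaky providers.
    # The weights are illustrative, not tuned.
    return offer.usd_per_hour + 0.01 * offer.est_latency_s + 5.0 * (1 - offer.reliability)

def route(offers: list[GpuOffer], min_vram_gb: int, run_job: Callable[[GpuOffer], str]) -> str:
    # Drop GPUs that can't hold the model, then try the best-scored candidate first.
    candidates = sorted((o for o in offers if o.vram_gb >= min_vram_gb), key=score)
    last_error = None
    for offer in candidates:        # automatic failover: move on when a provider fails
        try:
            return run_job(offer)
        except RuntimeError as err:
            last_error = err
    raise RuntimeError(f"all providers failed: {last_error}")
```

In the real service the reliability number would come from lifecycle tracking of past jobs; here it's just a static field.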
There's also an API if you want to integrate it into your own services.
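For illustration only, a request might look something like this; the URL, field names, and response shape are placeholders that mirror the CLI flags, not the documented endpoint:

```python
import requests

# Placeholder endpoint and field names that mirror the CLI flags above.
# Treat every name here as hypothetical, not the documented Jungle Grid API.
resp = requests.post(
    "https://api.example.com/v1/jobs",
    json={"workload": "inference", "model_size": 7},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())  # assuming a JSON body with a job id and status
```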
What changed
- Most inference jobs now cost ~$0.01–$0.05
- No more failed runs due to wrong hardware
- No more time wasted debugging infra
But the biggest win is focus.
We went from:
"Will this run?"
to:
"What should we build next?"
Takeaway
The hard part isn't running AI.
It's all the decisions before execution.
Remove those — and everything gets simpler.
If you're running AI workloads, how are you handling GPUs today?