DEV Community

Nočnica Mellifera for RudderStack

How to waste half a million dollars

Originally posted in 2020, but we're talking about the real costs of app hosting again, so it feels relevant.

Every startup I’ve worked for had the same mission statement. It might not have been the mission statement etched into a plaque on the wall, but it existed just the same. Here it was:

Make a product so good that people who don’t work for you are excited to tell their friends. Charge people for the product. Make money.

The ‘how’ is missing from this statement, but it’s not the point of this article. I want to talk about how teams get distracted. How some of your best, brightest, most effective people can spend all day working on things that won’t help this mission at all.

How we go from ‘make money’ to ‘run servers’

No one starts a company with a goal of getting really good at running servers. They start with a product or service that they want to deliver to customers.

Hey, for this whole article I’m pretending that platform/hosting startups don’t exist. Please let me live. Also, this example is centered around a startup, but it applies equally to orgs of all sizes.

Let’s create a company called GoCo. They have a fantastic product that will improve the lives of other software companies and want to take it to the world. The first version of the product will be hosted on something like Heroku. Heroku handles all the updating, patching, and general concerns of making servers serve. Once the product is built GoCo delivers demos, gets some first users, and generates interest at meetups and conferences.

And for a time it is good.

Then as the product grows, and the user base grows, the CFO of GoCo says:

“We’re spending hundreds of dollars on Heroku, we need to set up our own servers to reduce that bill”

One of the most capable engineers at GoCo, let’s call her Grace, spends a couple of weeks getting the application running on low-cost virtual machines. This is cheaper than a platform, and sure enough, the bill goes down.

Money saved, right?

Even right here our theoretical startup has made an error. They saw a reduction of a bill and said they saved money. What everyone forgot was that they paid Grace the engineer for several weeks of work to make this changeover happen. Did the cash saved even equal what they paid, in cash, for Grace to do the work? Probably not, but it gets worse because in that same time Grace could have built something for the product.
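To put a rough shape on that question, here is a minimal break-even sketch. Every figure in it is a hypothetical assumption (the story only says "hundreds of dollars" and "a couple weeks"), and the function name is made up for illustration.

```python
def months_to_break_even(annual_cost_of_engineer, weeks_spent, monthly_savings):
    """Months of reduced hosting bills needed to repay the engineer time spent migrating."""
    if monthly_savings <= 0:
        raise ValueError("monthly_savings must be positive")
    # Cost of the engineer's time, prorated over a 52-week year.
    migration_cost = annual_cost_of_engineer * weeks_spent / 52
    return migration_cost / monthly_savings


# Hypothetical figures, not from the article:
# a $150,000/year fully loaded engineer, 3 weeks of work, $300/month saved.
months = months_to_break_even(150_000, 3, 300)
print(f"Break-even after about {months:.0f} months")
```

With these made-up numbers the migration takes more than two years to pay for itself, and that is before counting the features Grace didn't build in the meantime.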

What could your best engineers build if they weren’t trying to save you a few hundred bucks on servers?

By tasking Grace with saving on operational costs, GoCo deprived itself of the new features and optimizations that could have been delivered in that time. Missing features don’t show up on a balance sheet, but if we don’t deliver cool features and a better product, there is no way the startup is going to make money.

The #1 best way for a company to save money is to fire everyone and go out of business. And that’s not in anyone’s mission statement.

It gets worse: eventually servers become a full-time job

Sooner or later GoCo’s servers go down. That discount server farm they use only guarantees that it won’t catch on fire or be physically penetrated, but patching, updating, and configuring servers is GoCo’s job. And Grace, who is just as smart and capable as can be, isn’t even a full-time operations engineer and is probably evaluated on how many features she ships and not how well she patches servers.

GoCo’s executives sit down for a ‘post-mortem:’ the servers went down during a critical conference and everyone is upset. The solution at the time seems clear: GoCo needs an operations specialist or maybe a whole team.

There may be some discussion about what the salary for this full-time ops specialist will be, but there probably won’t be a discussion of how bringing on a new engineer can cost $50,000 in engineer time, bonuses, and start-up time. And there definitely won’t be a discussion of how suddenly one of our highest-paid employees is someone whose job has absolutely nothing to do with the mission.

I’ve seen executives complain about the cost of customer support, about the cost of team-building events, and of course about the cost of software services. I’ve never seen one complain that some portion of the engineering team no longer delivers great features but instead delivers great servers.

No one ever says ‘what if we got rid of all these servers?’

A year later and GoCo is thriving. Customers are excited, and marketing is talking about a GoCoCon in 2021. The new operations team has great reviews, and in all-staff meetings they celebrate delivering “Five Nines” or 99.999% uptime.
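For a sense of what that celebration is about, here is a quick sketch of how little downtime each "nines" target allows per year. The function name is illustrative, and the calculation ignores leap years.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(uptime_percent):
    """Maximum minutes of downtime per year that still meet the given uptime percentage."""
    return MINUTES_PER_YEAR * (1 - uptime_percent / 100)


for target in (99.9, 99.99, 99.999):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% uptime allows {minutes:.1f} minutes of downtime per year")
```

Five nines works out to a little over five minutes of downtime in a whole year, which is a lot of engineering effort spent on something other than the product.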

And again, and I’m sorry if I’m hammering the same point, no one ever sits down and says ‘What would Heroku or another Platform-as-a-Service cost us compared to what our entire operations team costs us?’ Because I would submit that a team of 3-4 operations engineers is costing GoCo half a million dollars a year.

  • Salary
  • Benefits
  • Stock options
  • Hiring and EOY bonuses
  • Communications overhead
  • Office Overhead

I think $500,000 is a conservative guess for these costs.
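As a rough sanity check on that guess, the cost categories listed above can be added up per engineer. This is only a sketch: every dollar amount is an assumption chosen for illustration, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class EngineerCost:
    """Hypothetical yearly cost components for one operations engineer, in dollars."""
    salary: float
    benefits: float
    stock_options: float
    bonuses: float   # hiring and end-of-year bonuses
    overhead: float  # communications and office overhead combined

    def total(self):
        return (self.salary + self.benefits + self.stock_options
                + self.bonuses + self.overhead)


def team_cost(engineers):
    """Sum the yearly cost of every engineer on the team."""
    return sum(e.total() for e in engineers)


# Assumed figures for illustration only; none of them come from the article.
one_engineer = EngineerCost(salary=110_000, benefits=20_000, stock_options=10_000,
                            bonuses=10_000, overhead=10_000)
ops_team = [one_engineer] * 3
print(f"Estimated yearly cost of the ops team: ${team_cost(ops_team):,.0f}")
```

Under these assumptions a three-person team comes to $480,000 a year, so a fourth engineer or higher salaries would push the total well past half a million.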

And I think most managers need to go back and take a look at Platform-as-a-Service as a way to host their products. A comparison with a low-cost EC2-style service should not be based on the relative service bill, but on what you’ll spend on the engineers you’ll need and how your entire team will be affected by losing sight of the mission.

Top comments (6)

Riccardo Bernardini

This is basically a variation on the old theme of "buy or make." You need something (say, a wireless interface) for your product; do you make it yourself or do you just buy it "key-turn ready"?

The real answer usually depends on the volume of your product: beyond a given threshold, the cost of designing your interface, producing it, testing it, certifying it and so on may be outweighed by the savings you obtain by not paying the wireless interface producer's margin.

I do not have actual figures at hand, but my feeling is that for running your own server farm to be worthwhile, you need a huge amount of traffic.

Nočnica Mellifera

It comes up a lot around Kubernetes that, for it to be worth doing it yourself, you need to plan to compete on performance and reliability. If customers will choose your service because it's faster, then it makes sense, whatever your volume, to build it yourself.

Jon Lauridsen

I wholly agree with the core point, which I read to be about how easily opportunity cost gets neglected in software. The many “half-a-day”s lost waiting for the ops engineer to be available for questions, the half-a-days lost waiting for a configuration change, the training of new employees in your custom scripts, etc. etc. The bill may be high for a managed solution, but the hidden costs are deep for doing it yourself.

Nathan Kallman

This seems like an exact repost of an article posted under /heroku in mid 2020 which seems to be deleted now... the evidence I can find of it is a tweet from then:

Mind explaining the delete and repost under RudderStack?

Pavel Gurkov

I don't see any "other side" in the article. Sure, 4 ops people cost half a mil. How many people would you hire to make the magic cloud run? How much would the service cost? How much is lost in communications overhead with Heroku, or Amazon, or Google, or Microsoft when they respond to your ticket about half of your stuff not working within an hour, as that's their SLA, but you're bleeding money right now?

Pavel Gurkov

You're kinda exaggerating in both ways: first, someone wants to buy their own server because hundreds of dollars are spent on Heroku (did they check how much servers actually cost?); second, a $500 Heroku job, if we're talking compute, can probably be done with one server. Okay, 2 for redundancy. You just don't need 4 people maintaining two boxes. This doesn't help illustrate your point.
