Any app that lets users upload files like profile pictures, documents, videos, or receipts needs a file processing pipeline. This pipeline handles tasks such as resizing images, converting file formats, extracting data, and compressing videos.
Most engineering teams think of this as just another feature. In reality, it becomes a continuous infrastructure layer that runs around the clock, quietly consuming cloud resources and engineering effort.
Think of it like building your own power plant. You get full control, but you’re also responsible for everything: the machines, the fuel, maintenance, and fixing problems even in the middle of the night when something breaks.
The problem is not that file processing is expensive. The problem is that its true cost is almost never calculated.
The expenses are usually spread across different parts of your cloud bill: compute usage, storage, bandwidth, and other services, including downstream delivery costs when processed images and videos are served to users.
Because these costs are scattered, they’re easy to miss. By the time teams notice the impact on the budget, the costs have often been growing for years.
This guide isn’t about how to build a file processing pipeline.
Instead, it focuses on what it really costs to run one and how to explain those costs when deciding whether to change your approach.
Before diving deeper, here are the key takeaways from this analysis of file processing compute costs.
Key Takeaways
File processing (image resize, video encode, OCR) is infrastructure, not just a feature.
The real cost includes compute, memory, storage, orchestration, and monitoring.
Engineering time to maintain the pipeline is often the highest hidden cost.
Calculating cost per operation helps teams understand the true expense.
A build vs. buy comparison should focus on long-term cost and scalability.
To understand where these costs come from, we need to break down the infrastructure that powers a typical file processing pipeline.
Deconstructing the Compute Cost Stack
The total cost of file processing isn’t just a single number. It’s made up of at least five distinct cost layers that build on top of each other.
Each layer adds its own expense, and together they create the real cost of running a file processing system.
To understand the full picture, you first need to break these layers apart and look at them individually. That’s the first step toward accurately measuring how much your file processing pipeline actually costs.
The diagram below shows the main layers that contribute to the total cost of a file processing pipeline.
Layer 1 — Core Processing: CPU & Compute Cycles
This is the most obvious cost: the compute power needed to process a file.
When a file is uploaded, the system may need to resize images, crop them, add watermarks, convert formats, or extract text. All of these tasks consume CPU resources. Video processing is even heavier: encoding 4K video can require 5–10× more CPU and memory than standard HD encoding. Running OCR on detailed documents adds even more processing work.
Another thing teams often overlook is that these workloads are not consistent.
For example, resizing a batch of small thumbnails uses very little compute. But transcoding the same number of short videos requires much more processing power.
Because of this, the types of files and operations in your pipeline directly affect your compute costs. And that mix rarely stays the same; it keeps changing as product features and user behaviour evolve.
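To make this concrete, here is a minimal sketch of how the workload mix drives monthly compute cost. The per-operation vCPU-hour figures and the $0.04 per vCPU-hour rate are illustrative assumptions, not benchmarks:

```python
# Hypothetical vCPU-hours consumed per operation, by type.
# These figures are illustrative assumptions, not measured benchmarks.
VCPU_HOURS_PER_OP = {
    "thumbnail_resize": 0.0005,  # light image work
    "hd_video_encode": 0.05,     # much heavier
    "4k_video_encode": 0.35,     # roughly 5-10x HD encoding
    "document_ocr": 0.01,
}

VCPU_HOUR_RATE = 0.04  # assumed on-demand price in USD per vCPU-hour


def monthly_compute_cost(op_counts: dict) -> float:
    """Estimate monthly compute spend from a mix of operation counts."""
    return sum(
        count * VCPU_HOURS_PER_OP[op] * VCPU_HOUR_RATE
        for op, count in op_counts.items()
    )


# The same number of operations, two very different bills:
light_mix = {"thumbnail_resize": 100_000}
heavy_mix = {"4k_video_encode": 100_000}
print(f"light mix: ${monthly_compute_cost(light_mix):,.2f}")
print(f"heavy mix: ${monthly_compute_cost(heavy_mix):,.2f}")
```

With these assumed rates, the same monthly volume of operations produces bills that differ by orders of magnitude, which is exactly why a shifting workload mix makes compute costs hard to forecast.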
Layer 2 — Memory & Storage I/O
Large files also need to stay in memory while they are being processed. For example, a high-resolution image that needs to be exported into multiple sizes may require several gigabytes of RAM to store intermediate versions during processing. Videos usually require even more memory.
Because of this, worker machines have to be sized for the most complex files, not the average ones. Cloud providers charge for the amount of RAM allocated per hour, even if that memory isn’t fully used all the time.
Another cost that teams often miss is storage I/O. Files need to be read from storage into the processing system and then written back after processing is finished. When this happens at a large scale, the read and write operations add noticeable cost, especially if the pipeline processes the same file multiple times.
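The reprocessing effect is easy to estimate. The sketch below assumes object-storage request pricing of $0.0004 per 1,000 reads and $0.005 per 1,000 writes (hypothetical figures; check your provider's actual rates):

```python
# Assumed object-storage request pricing in USD per 1,000 requests.
# These are illustrative; substitute your provider's published rates.
READ_PRICE_PER_1K = 0.0004
WRITE_PRICE_PER_1K = 0.005


def monthly_io_cost(files_processed: int, passes_per_file: int = 1) -> float:
    """Each processing pass reads the source and writes a result back,
    so reprocessing the same file multiplies both request counts."""
    reads = files_processed * passes_per_file
    writes = files_processed * passes_per_file
    return (reads / 1000) * READ_PRICE_PER_1K + (writes / 1000) * WRITE_PRICE_PER_1K


# 10M files processed once vs. three passes each
# (e.g. resize, then watermark, then format-convert as separate jobs):
print(f"single pass: ${monthly_io_cost(10_000_000):,.2f}")
print(f"three passes: ${monthly_io_cost(10_000_000, passes_per_file=3):,.2f}")
```

The absolute numbers look small, but they scale linearly with both volume and the number of passes, which is why pipelines that re-read the same file for each transformation quietly pay several times the minimum I/O cost.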
Layer 3 — Orchestration & Queueing Infrastructure
File processing doesn’t happen on its own. A production pipeline needs several supporting systems to keep everything running.
Typically, this includes a message queue to receive and distribute processing jobs, a group of worker servers that actually process the files, a load balancer to route requests, and a storage system where the processed files are temporarily kept before delivery.
Each of these components adds its own cost. Even when the system isn’t processing files, many of these services still need to stay running.
Another important point is that compute costs don’t scale in a simple, linear way. Processing 10,000 files in a short burst often costs considerably more than ten times what processing 1,000 files costs.
When traffic spikes, the system has to deal with queue limits, delays while new workers start, and retry logic when jobs fail. These orchestration challenges create scaling effects that are hard to predict in advance and often expensive to fix later.
Layer 4 — Idle Capacity & Over-Provisioning
User uploads rarely happen at a steady rate. Activity usually comes in spikes. A product launch, a viral post, a Black Friday sale, or a seasonal campaign can suddenly increase uploads many times above the normal level.
If you run your own pipeline, the infrastructure must be able to handle these peak moments. That means keeping enough worker servers ready for the highest possible load. But most of the time, often 90% of it or more, many of those servers sit idle while still generating cloud costs.
This isn’t a mistake in engineering. It’s simply how infrastructure works when the workload changes a lot.
The only choices are:
Under-provisioning: Fewer resources, which can cause failures or delays during traffic spikes.
Over-provisioning: Extra capacity that stays unused most of the time but still costs money.
Most teams end up over-provisioning to avoid outages, which means continuously paying for capacity that isn’t always used.
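The idle-capacity penalty is simple to quantify. This sketch uses a hypothetical fleet of 50 workers at an assumed $0.20/hour, busy only 10% of the time:

```python
def overprovisioning_cost(peak_workers: int, avg_utilization: float,
                          hourly_rate: float, hours: int = 730) -> tuple:
    """Monthly cost of a fleet sized for peak load, and the share of
    that bill that pays for idle capacity. 730 ~ hours in a month."""
    total = peak_workers * hourly_rate * hours
    idle = total * (1 - avg_utilization)
    return total, idle


# Hypothetical fleet: 50 workers sized for peak traffic, averaging
# 10% utilization, at an assumed $0.20/hour per worker.
total, idle = overprovisioning_cost(
    peak_workers=50, avg_utilization=0.10, hourly_rate=0.20
)
print(f"monthly fleet cost: ${total:,.2f}")
print(f"paid for idle capacity: ${idle:,.2f}")
```

Under these assumptions, 90% of the fleet bill buys nothing but headroom, which is the cost of insurance against traffic spikes when you run the infrastructure yourself.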
Layer 5 — Monitoring, Alerting & Operational Reliability
A file processing pipeline also needs visibility into how it’s running. Without monitoring, it becomes very hard to know when something breaks or slows down.
Teams usually add systems for logging pipeline activity, tracking metrics like queue size or processing time, setting up alerts when jobs fail, and building dashboards to see the overall health of the system.
All of this requires additional tools and infrastructure. Some teams use managed observability platforms, while others run their own monitoring stack. In either case, there is a cost involved, and that cost often grows as the pipeline becomes more complex.
Infrastructure costs are only part of the picture. The next layer of cost is less visible but often much larger: engineering time.
The Toil Multiplier: Engineering Time Is Your Largest Cost
Infrastructure costs are only the visible part of the problem. A much higher cost often sits beneath the surface: the engineering time required to build, maintain, and run a file processing pipeline.
This type of work is often called toil. It includes repetitive, manual, and reactive tasks such as maintaining systems, fixing failures, updating dependencies, and keeping infrastructure running.
Toil doesn’t directly create new product features. Instead, it focuses on keeping the infrastructure working, which means it quietly consumes valuable engineering time that could otherwise be spent building the product.
Development & Ongoing Maintenance
The initial development investment required to build a reliable processing pipeline is often significant. Even after the core system is built, the work doesn’t stop.
Teams still need to manage library updates: tools like ImageMagick and FFmpeg regularly release security patches and sometimes introduce breaking changes. Engineers also have to handle unexpected edge cases in file formats and update the pipeline when the product starts supporting new file types or processing requirements. Many teams run into common architectural pitfalls when building ingestion pipelines from scratch.
This means the work is not a one-time effort. Maintaining the pipeline becomes an ongoing responsibility: a recurring demand on senior engineering time, one of the most expensive resources in a company.
On-Call Burden & Incident Response
Processing pipelines don’t always run smoothly. Queues can get backed up, workers may crash, or a malformed file might trigger an unexpected error that spreads through the job queue. These kinds of issues are common in systems that operate at scale, and they often happen outside normal working hours, requiring engineers to step in and fix them.
The cost of being on-call isn’t just the time spent resolving incidents. It also includes the mental load of being responsible for a system that must always stay reliable. Interruptions from incidents can pull engineers away from product work, slow down development momentum, and, over time, can even affect engineer satisfaction and retention.
Performance Tuning & Cost Optimisation
A pipeline that works efficiently at 10,000 operations per day may become inefficient at 10 million operations per day. As usage grows, teams need to continuously optimise the system to keep costs and performance under control.
This often includes improving worker utilisation, setting up CDN strategies to avoid repeated processing, adjusting queue configurations, and choosing the right instance sizes. All of these require ongoing engineering effort.
Each optimisation project takes valuable senior engineering time, time that could otherwise be spent building product features that users actually care about.
The opportunity cost question every CTO should ask: What could two engineers build in a year if they weren’t maintaining the processing pipeline? When the decision is framed this way, the build-vs-buy choice often becomes much clearer.
Once these infrastructure and engineering costs are understood, the next step is turning them into a measurable number.
Calculating Your True Cost Per Operation
Most organisations have never calculated the cost per operation for their file processing pipeline. Yet this number is often the most useful way to understand the real financial impact of the system.
The basic method is simple. You add up the total costs involved in running the pipeline and divide that by the number of processing operations it handles.
The challenge is not the formula; it’s collecting the inputs. The costs are usually scattered across cloud services, infrastructure, and engineering time, so they require some digging to identify.
But once you calculate this number, it becomes a powerful metric. It provides a clear way to discuss the pipeline’s impact with finance and helps teams make more informed decisions about their infrastructure.
This can be represented with a simple formula: cost per operation = (compute + memory + orchestration + engineering toil + incident cost) ÷ total operations.
To calculate this for your own pipeline, collect the following inputs from your cloud provider’s billing dashboard and your team’s internal time tracking:
Compute: The average vCPU-hours used per operation type (such as resizing images, encoding video, or running OCR). This information is usually available in your cloud provider’s compute metrics.
Memory: The average GB-hours of RAM used per operation. You can typically find this in instance monitoring or infrastructure metrics.
Orchestration: The total monthly cost of supporting infrastructure (queues, worker servers, load balancers) divided by the total number of operations processed in that month.
Engineering Toil: The number of engineering hours spent each month maintaining the pipeline, multiplied by the fully loaded hourly cost of those engineers.
Incident Cost: The cost of on-call work and incident response, estimated from on-call schedules, logs, and postmortem reports.
Add all of these costs together and divide the total by the number of operations processed in a month. The result is a clear and defensible cost-per-operation figure that you can present to finance leadership.
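The five inputs above can be combined in a short calculator. All of the figures in the example call are illustrative assumptions so the arithmetic is visible; substitute your own billing and time-tracking data:

```python
def cost_per_operation(
    compute_cost: float,        # monthly vCPU-hour spend (USD)
    memory_cost: float,         # monthly GB-hour RAM spend (USD)
    orchestration_cost: float,  # queues, workers, load balancers (USD)
    toil_hours: float,          # monthly engineering maintenance hours
    hourly_eng_cost: float,     # fully loaded hourly engineer cost (USD)
    incident_cost: float,       # estimated on-call / incident spend (USD)
    operations: int,            # operations processed that month
) -> float:
    """Sum every cost layer for the month, divide by operations."""
    total = (
        compute_cost
        + memory_cost
        + orchestration_cost
        + toil_hours * hourly_eng_cost
        + incident_cost
    )
    return total / operations


# Illustrative inputs (assumptions, not benchmarks):
cpo = cost_per_operation(
    compute_cost=4_000,
    memory_cost=1_200,
    orchestration_cost=800,
    toil_hours=60,
    hourly_eng_cost=120,
    incident_cost=1_500,
    operations=2_000_000,
)
print(f"cost per operation: ${cpo:.5f}")
```

Note how the toil line (60 hours × $120) rivals the entire compute bill in this hypothetical month, which is the pattern the previous section predicted: engineering time is frequently the largest single input.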
Once you understand your true cost per operation, you can make a more informed build-vs-buy decision.
The Build-vs-Buy Economic Model: A 3-Year TCO View
Comparing the cost of building vs. buying at a single point in time can be misleading. The more useful approach is to look at the total cost over several years.
As your product grows, the number of files processed increases, infrastructure requirements expand, and the pipeline becomes more complex to maintain. At the same time, your engineering team grows, and operational needs become heavier.
Because of this, the real question isn’t just what it costs today, but how those costs add up over the next few years as the system scales and new requirements appear.
The most important insight in this comparison isn’t any single cost item. It’s the fundamental difference between the two cost models.
When you run file processing in-house, costs are variable and tend to grow over time. As usage increases, you process more files, add more infrastructure, and often need more engineering time to maintain the system.
A managed API shifts this model. Instead of managing infrastructure and operational complexity, the cost becomes usage-based and easier to predict.
For finance and procurement teams, this difference is often just as important as the total cost itself. Predictable, operational expenses are typically easier to plan, budget, and scale compared to infrastructure costs that fluctuate with system complexity and team involvement.
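The difference between the two cost curves can be sketched with a simple compounding model. The year-one costs and growth rates below are hypothetical placeholders, chosen only to show how a higher growth rate dominates over a three-year horizon:

```python
def three_year_tco(year1_cost: float, annual_growth: float) -> float:
    """Total cost over three years, growing at a fixed annual rate."""
    return sum(year1_cost * (1 + annual_growth) ** year for year in range(3))


# Hypothetical comparison: in-house costs grow with infrastructure,
# complexity, and headcount; managed-API costs grow with usage only.
in_house = three_year_tco(year1_cost=250_000, annual_growth=0.40)
managed = three_year_tco(year1_cost=180_000, annual_growth=0.15)
print(f"in-house 3-year TCO: ${in_house:,.0f}")
print(f"managed 3-year TCO:  ${managed:,.0f}")
```

The point of the sketch is not the specific totals but the shape: even when year-one costs are close, a steeper growth rate compounds, so a point-in-time comparison systematically understates the gap.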
To see how these cost models behave in real situations, consider the following scenario.
Case Study: When Scale Arrives Overnight
This situation happens often in both consumer apps and B2B SaaS products. A company launches a new feature, maybe user-generated video uploads, collaborative document annotation, or AI-powered image analysis, and the feature suddenly becomes extremely popular.
Within a few days, usage grows much faster than expected. In some cases, processing volume can jump to 10× the normal level in less than 72 hours.
The In-House Response
Engineers get alerts because the system is struggling. The job queue starts filling up, and the worker servers are already running at full capacity.
To handle the spike, the team has to quickly scale the system. They start adding more servers, changing queue limits, and closely watching the system for errors.
This process can take hours and usually needs senior engineers to jump in immediately. It also causes a sudden increase in cloud costs that wasn’t planned in the budget. If the new feature continues to get high usage, the infrastructure has to be permanently scaled up, which means the higher cost becomes the new normal.
In the end, the team may spend two or three days dealing with infrastructure issues, right when they should have been focused on improving and supporting the new feature.
The Managed API Response
Processing volume suddenly increases, but the infrastructure automatically handles the extra load. The engineering team doesn’t need to wake up for alerts or manually scale the system.
Costs simply increase based on usage, which is expected and easier to plan for. Meanwhile, the team can stay focused on improving the product and supporting the feature that caused the sudden growth.
This example shows why risk tolerance is important when choosing between building and buying. Many teams also see the performance benefits of a specialised service when heavy processing tasks are offloaded instead of being handled inside the application stack.
An in-house pipeline assumes you can predict demand accurately and scale ahead of time. A managed API assumes that paying for usage is cheaper than handling the risks of over-provisioning infrastructure, hiring more engineers, and responding to incidents.
Situations like this are why engineering leaders need a clear framework for deciding whether to build or buy.
Strategic Decision Framework for Engineering Leaders
Not every organisation should move file processing to a managed API. The right choice depends on your product, team, and long-term goals.
To make the decision clearer, engineering leaders can start by asking four key questions that help evaluate whether building or buying makes more sense for their situation.
The following framework helps evaluate when building a pipeline makes sense and when a managed API is the better choice.
If most of your answers point toward using a managed solution, the next step isn’t choosing a vendor right away. The next step is to build a clear business case using real numbers.
Start by comparing your current cost per operation with the pricing of managed APIs. Also include the engineering time you would save if your team no longer had to maintain the processing pipeline.
Then put everything together in a 3-year total cost comparison and present it to your leadership team. This helps show the real financial impact of the decision.
Ready to calculate your real costs?
Filestack’s solutions architects can help you build a custom TCO analysis based on your actual workload, not generic estimates or assumptions.
Schedule a Custom TCO Analysis →
You can also download the Enterprise File Processing Evaluation Checklist to begin your internal evaluation.
Ultimately, the decision comes down to how your organisation wants to treat file processing infrastructure.
Conclusion: File Processing Is a Utility, Not a Feature
A helpful way for technical leaders to think about file processing is this: it’s a utility. Like electricity or network bandwidth, it’s essential infrastructure that your application depends on. But it’s usually not the reason users choose your product.
Optimising this infrastructure for cost, reliability, and scalability is a valid engineering challenge. At the same time, it’s important to recognise when maintaining it internally starts costing more than the control it provides.
The compute cost of file processing at scale is real, measurable, and often higher than teams initially expect. The framework in this article helps you estimate those costs more clearly. What you decide to do with that information becomes the strategic decision.
This article was published on the Filestack blog.