Piotr Paradiuk
Scaling Local Automations Without Losing Control

TL;DR

  • Automation systems must choose a mutability and control model.

  • Loose models are fast and flexible, but can override user intent and burn compute quickly under load.

  • There are semi-controlled alternatives that maintain intent and respect platform boundaries.

Intro

In this post, I explore the design of an automation engine, specifically one that operates on local entities.

The goal is to examine how different architectural choices affect user control, execution integrity, and system performance, especially at scale.

The Automation Gold Rush

We’re in the middle of an automation gold rush—or as I called it in Automation at Scale: Why ‘One Item at a Time’ Is Breaking Down, the automation computation race.

Platforms like Zapier, Make, Monday.com, and others are battling it out to win users with the promise of ever more automation tokens—executions per month. The result? An escalating optimization race, where new players must offer tens of thousands of automations per plan just to be competitive.

Some platforms are already falling behind. Take ClickUp, for example: its non-Enterprise plans offer between 1,000 and 5,000 automation runs per month, while Monday.com offers 25,000 at a similar price point.

[Figure: Automation cap comparison]

That’s great for customers — scale has never been more accessible.

For developers and platform architects, it means rising pressure to build smarter engines, more efficient schedulers, and cost-aware infrastructure.

Scope and Framing Matter

When designing systems, there’s a trap we all fall into:

Trying to solve everything at once — retries, APIs, audit logs, cross-region state, failovers.

Those all matter — but not at the start.

Think of the system like a massive whiteboard that doesn’t fit on one screen. To make progress, you need a minimap—a way to zoom in on the part that matters right now, without getting distracted by everything else.

In this post, we’ll focus on one specific layer of the system:

  • A self-contained, local-only automation engine.

  • Built to resolve conflicts between users and automations.

  • Designed to scale up cleanly — not get tangled in global queues.

  • External triggers? Future work. We're zoomed in for now.

Fast Track vs. Slow Track

From the start, we’ve drawn a clear line between local and external executions.

[Figure: Slow track vs. fast track]

This distinction isn't just technical—it's philosophical.

External executions come with significant overhead: failover policies, authentication layers, networking dependencies, rate limiting, retries, monitoring, and more. Designing them well requires a different focus.

In contrast, local executions are where the platform has full control: integrity is managed by the domain model itself, latency is minimal, and automation can provide immediate feedback to the user. These are the "fast track" paths — our focus.

Even if we later support external actions, the kernel should still operate independently — as a self-contained engine inside the platform.
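To make the split concrete, here's a minimal sketch of how it could be encoded as a discriminated union. All the names here (LocalAction, dispatch, and so on) are illustrative assumptions, not the platform's actual API:

```typescript
type LocalAction = { kind: "local"; itemId: string; patch: Record<string, unknown> };
type ExternalAction = { kind: "external"; endpoint: string; payload: unknown };
type Action = LocalAction | ExternalAction;

function dispatch(action: Action): void {
  switch (action.kind) {
    case "local":
      // Fast track: handled in-process by the kernel, minimal latency.
      console.log(`fast track: apply patch to ${action.itemId}`);
      break;
    case "external":
      // Slow track: needs retries, auth, rate limits; out of scope here.
      console.log(`slow track: enqueue delivery to ${action.endpoint}`);
      break;
  }
}

dispatch({ kind: "local", itemId: "task-42", patch: { status: "Done" } });
```

The type system then enforces the philosophy: the kernel can only ever be handed local work, and external actions have to go through a different door.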

Designing the Kernel

Let’s zoom in on the kernel — the beating heart of the system.
It listens for changes, matches relevant automations, and applies them safely.

It assumes that:

  • Both users and automations can mutate the same item.

  • These mutations may overlap or conflict.

  • Our job is to resolve that gracefully, without external queues.

Let's sketch a conceptual draft:

[Figure: Kernel sketch]

The highlighted section (in red) shows the primary “blood flow” of our automation engine.

Execution starts with a change — from the UI or an automation. A trigger listener catches it and hands it to the engine.

The engine finds matching automations and creates intents — lightweight plans for what will happen. These don't run right away. A scheduler steps in to merge, deduplicate, and throttle them in a short time window. This smooths out bursts and avoids waste.

Once ready, each intent goes through a state manager, which compares it to the current state. If the change is valid, it’s committed to local storage.

This is our execution core.
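Here's a self-contained sketch of that flow. Every name (Change, Intent, flush) and the 1.5-second window are assumptions for illustration, not the real engine:

```typescript
// Sketch of the execution core: listener -> intents -> scheduler -> commit.
type Change = { itemId: string; field: string; value: string; source: "user" | "automation" };
type Intent = { itemId: string; field: string; value: string };

const store = new Map<string, Record<string, string>>(); // local storage
const matchers = [
  // "When status becomes New, set it to Working on it"
  (c: Change): Intent | null =>
    c.field === "status" && c.value === "New"
      ? { itemId: c.itemId, field: "status", value: "Working on it" }
      : null,
];

let pending: Intent[] = [];
let timer: ReturnType<typeof setTimeout> | null = null;

// 1. Trigger listener: a change arrives and produces lightweight intents.
function onChange(change: Change): void {
  for (const match of matchers) {
    const intent = match(change);
    if (intent) pending.push(intent);
  }
  // 2. Scheduler: wait a short window so bursts can be merged.
  if (!timer) timer = setTimeout(flush, 1500);
}

// 3. State manager: deduplicate, validate against current state, commit.
function flush(): void {
  timer = null;
  const merged = new Map<string, Intent>();
  for (const i of pending) merged.set(`${i.itemId}:${i.field}`, i); // last wins
  pending = [];
  for (const intent of merged.values()) {
    const item = store.get(intent.itemId) ?? {};
    item[intent.field] = intent.value;
    store.set(intent.itemId, item);
  }
}

onChange({ itemId: "task-1", field: "status", value: "New", source: "user" });
```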

Here’s how this flow looks in action, using a CodePen demo:

[Embedded CodePen demo]

Three Control Models

To further ground this discussion, let’s use the example of Monday.com, a task management system where automations revolve around “items” (essentially structured tasks). Think of items as interactive graph nodes for users: they carry state, history, and serve as the control points of the system.

We can imagine three models for how Monday-like automation might behave:

1. High-control – A strict, deterministic model where automation executions are immutable once started. Think: banking transactions. If a transfer begins, its execution path is locked—even if the automation definition changes later. The old run follows the old logic.

2. Loose-control – Automations simply mimic user actions. Everything is mutable. If two automations change the same item at the same time, the system doesn’t enforce order or consistency. Power and responsibility lie entirely with the user.

3. Semi-control – A hybrid. Some operations compete and override each other. Others are locked or atomic. The user can choose which logic to enforce where.
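For illustration, the three models could be expressed as a per-automation setting; the types and names below are hypothetical:

```typescript
// Illustrative only: the three control models as a per-automation setting.
type ControlModel =
  | { mode: "high" }                          // execution plan frozen at trigger time
  | { mode: "loose" }                         // fully mutable; last write wins
  | { mode: "semi"; lockedFields: string[] }; // user chooses what is protected

// Example: protect status transitions, leave everything else loose.
const onboarding: ControlModel = { mode: "semi", lockedFields: ["status"] };
```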

[Figure: Control scenarios]

It's worth considering: which of these models best fits real-world use? And how would the answer change if we scoped it to only local, fast-path operations?

What Monday.com Actually Does

Monday.com today leans toward Loose-control.

You get:

  1. Automations (available on all plans)

  2. Workflow Builder (available only on the Enterprise plan)

These tools look similar conceptually, but they differ in execution design. I’ve written a deeper breakdown in The Product Tension Between Automations and Workflows in Monday.com.

Key difference:

[Figure: Automations vs. Workflow Builder comparison]

Why the Delay Exists

Because the system is highly mutable — like a shared task list where multiple agents can reorder or change the status of any item at any time.

Each agent (automation or user) is allowed to queue an action. Automation actions are throttled and delayed, typically executing after ~1-2 seconds.

Some changes overwrite previous ones (e.g. setting status to “Done” replaces “Stuck”). Other changes add up, like moving a task repeatedly.

As a consequence, no one has absolute control — all changes are respected, just not immediately.

This makes the system intuitive for users — changes feel like they’re "live".

In this model, conflict resolution is simple: the last action wins. That's reasonable when all actions represent clear intent.
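Here's a small sketch of those two merge behaviors, overwriting vs. additive, inside the delay window. The QueuedAction shape and names are hypothetical:

```typescript
// Merge queued actions: "set" overwrites, "move" accumulates.
type QueuedAction =
  | { type: "set"; itemId: string; field: string; value: string; at: number }
  | { type: "move"; itemId: string; toGroup: string; at: number };

function resolve(actions: QueuedAction[]): QueuedAction[] {
  const lastSet = new Map<string, QueuedAction>();
  const additive: QueuedAction[] = [];
  for (const a of [...actions].sort((x, y) => x.at - y.at)) {
    if (a.type === "set") {
      // Overwriting change: only the most recent value survives.
      lastSet.set(`${a.itemId}:${a.field}`, a);
    } else {
      // Additive change: every move is kept and applied in order.
      additive.push(a);
    }
  }
  return [...lastSet.values(), ...additive];
}

// "Stuck" then "Done" on the same item: only "Done" remains.
resolve([
  { type: "set", itemId: "t1", field: "status", value: "Stuck", at: 1 },
  { type: "set", itemId: "t1", field: "status", value: "Done", at: 2 },
]);
```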

A Real-World Conflict Example

Let’s say we've got an automated scenario with the following flow:

  1. If status is New, change it to Working on it
  2. If status is Working on it, change it to Testing
  3. If status is Testing, change it to Stuck

But what if someone marks it “Done” manually during step 2?

It gets overridden. That creates confusion, especially when automation volume increases.

This is where loose-control systems show their limits.
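Here's a toy replay of that scenario (all names hypothetical). Automation reactions are queued, so a manual change slipped in mid-flight still gets overridden:

```typescript
// Each status change re-triggers matching rules via a queue.
const rules = new Map([
  ["New", "Working on it"],
  ["Working on it", "Testing"],
  ["Testing", "Stuck"],
]);

const queue: Array<() => void> = [];
const log: string[] = [];

function setStatus(next: string, by: "user" | "automation"): void {
  log.push(`${by}: status -> ${next}`);
  const to = rules.get(next);
  if (to) queue.push(() => setStatus(to, "automation")); // cascade continues
}

setStatus("Working on it", "automation"); // step 2 of the flow fires
setStatus("Done", "user");                // manual intervention
while (queue.length) queue.shift()!();    // queued cascade still runs

console.log(log[log.length - 1]); // "automation: status -> Stuck"; the "Done" is lost
```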

[Figure: Automation cycle]

You can see the override clearly in the activity log:

[Figure: Activity log]

Key takeaway: once an automation starts, the trigger cascade finishes even if you try to step in.

Where Loose Control Breaks Down

The loose model breaks down in two key cases:

1. Control workflows

Think of something like user onboarding. Once a task starts, you don’t want updates to interfere with it. The flow must be protected—locked or isolated—until it completes.

2. Execution caps

Let’s say our plan includes only 1,000 automation runs per month. If the automations fire instantly, reacting to every user action, they can burn through that quota surprisingly fast.

In highly mutable systems, where both users and automations can update the same entity rapidly, it’s worth considering scheduling as a way to preserve intent and avoid waste.

Toward a Semi-Control Model

Let’s evolve the kernel. In a semi-control model:

  • We add an isLocked flag to key automations.

  • When triggered, the item becomes read-only to users.

  • A lock icon appears to explain why.

  • Once automation completes, the lock is lifted.
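A minimal sketch of that lock lifecycle, assuming a simple isLocked flag; the rejection behavior is my assumption, not Monday.com's actual model:

```typescript
interface Item {
  id: string;
  status: string;
  isLocked: boolean;
}

function runProtected(item: Item, steps: Array<(i: Item) => void>): void {
  item.isLocked = true; // UI switches the item to read-only and shows a lock icon
  try {
    for (const step of steps) step(item);
  } finally {
    item.isLocked = false; // lock lifted once the automation completes
  }
}

function userEdit(item: Item, status: string): void {
  if (item.isLocked) {
    throw new Error(`Item ${item.id} is locked by a running automation`);
  }
  item.status = status;
}
```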

The semi-control scenario allows us to go further:

Not all automations need to run immediately; intentionally locking and deferring them can help us optimize performance.

For example, if we're 90% into our quota, we can slow execution, delay triggers, or group updates into a batch — as explored in Automation at Scale: Why ‘One Item at a Time’ Is Breaking Down.

In such cases, we can intentionally slow down execution — stretching it from 1–2 seconds to longer intervals — or even allow users to choose between real-time and scheduled execution.

This saves compute without breaking logic.
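As a sketch, quota-aware scheduling could be as simple as stretching the delay window as usage climbs; the thresholds below are made up for illustration:

```typescript
// Stretch the scheduler window as the monthly quota fills up,
// so more intents get merged per committed run.
function executionDelayMs(used: number, quota: number): number {
  const usage = used / quota;
  if (usage < 0.9) return 1_500;   // normal ~1-2s window
  if (usage < 0.97) return 30_000; // slow down near the cap
  return 5 * 60_000;               // batch mode: flush every five minutes
}

executionDelayMs(900, 1_000); // at 90% of a 1,000-run plan -> 30000
```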

Wrapping Up

I hope this was a fun and hands-on way to explore automation engine design.

Takeaway: We don’t need to solve for every edge case on day one.

Start with fast, local flows. Pick the right control model. Respect user intent.

Imagine if RFCs for platforms came with playful, interactive design snippets like these—not just showing what the system should do, but why it behaves the way it does.

Who knows, maybe the next generation of RFCs won’t just explain systems; they’ll let us experience them.

What We Learned

  1. Local-first kernels keep systems responsive and predictable.

  2. Semi-control gives us both speed and safety.

  3. Intent-driven scheduling reduces wasted work.

  4. The right control model is a balance — not too rigid, and not too loose.
