Glenn Bostoen
The Surprising Simplicity of Temporal Worker Pools on Cloud Run

If you've ever spent an afternoon debugging indentation errors in Google Workflows YAML only to discover the real problem was a cryptic ${} expression, you'll understand why we made the switch to Temporal. What we didn't expect was just how simple the deployment would be.

The problem we were solving

Our workflow orchestration setup had all the classic symptoms of YAML-based configuration debt:

  • Verbose definitions: Representing simple workflows as graphs required dozens of steps and connector definitions
  • Cold start delays: Every workflow step triggered a Cloud Run job, adding 35-70 seconds of spin-up time per execution
  • No local testing: Changes required deployment to validate. The feedback loop was measured in deploys, not seconds
  • Split infrastructure: Application code lived in one place, workflow definitions in another, and every change risked breaking both

The worst part? We couldn't even test locally. Every iteration meant committing, deploying, and hoping.
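To put a number on that cold-start tax, a quick back-of-the-envelope sketch using the figures above (the 10-step workflow size is an assumed example):

```python
# Each workflow step launched a Cloud Run Job, paying 35-70 s of spin-up.
COLD_START_RANGE_S = (35, 70)
STEPS = 10  # assumed size of a typical workflow

low = STEPS * COLD_START_RANGE_S[0]   # 350 s
high = STEPS * COLD_START_RANGE_S[1]  # 700 s

print(f"Cold-start overhead per run: {low / 60:.1f}-{high / 60:.1f} minutes")
```

Minutes of pure spin-up per run, before any real work happens.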

Why Temporal changes everything

Temporal flips the model on its head. Instead of declarative YAML that describes what should happen, you write actual code that describes how it happens:

from datetime import timedelta

from temporalio import workflow

@workflow.defn
class IndexingWorkflow:
    @workflow.run
    async def run(self, workspace_id: str):
        # This is just Python. Full IDE support.
        # Local debugging works out of the box.
        connections = await workflow.execute_activity(
            fetch_connections,
            workspace_id,
            start_to_close_timeout=timedelta(minutes=5),
        )

        for connection in connections:
            await workflow.execute_activity(
                index_connection,
                connection.id,
                start_to_close_timeout=timedelta(minutes=30),
            )

That's it. No separate YAML file. No mysterious DSL. Just code that your IDE understands, your debugger can step through, and your tests can cover. And it runs on your infrastructure. Temporal handles orchestration and state, but the actual work happens on compute you control.
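One concrete payoff of "just code": Temporal activities are ordinary Python functions, so their logic can be unit-tested with no orchestration running at all. A minimal sketch, using a hypothetical `fetch_connections` that takes an injected lookup instead of a real datastore:

```python
import asyncio

# Hypothetical activity logic: a plain async function, no Temporal machinery.
async def fetch_connections(workspace_id: str, store: dict) -> list[str]:
    # The real activity would query a database or API; injecting `store`
    # keeps the logic testable in isolation.
    return sorted(store.get(workspace_id, []))

# Exercise it directly, exactly as a unit test would.
store = {"ws-1": ["drive", "slack"], "ws-2": ["notion"]}
result = asyncio.run(fetch_connections("ws-1", store))
print(result)  # ['drive', 'slack']
```

No deploy, no emulator, no YAML: the feedback loop is a test run.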

The pull-based architecture

The simplicity clicks once you see how Temporal workers differ from Cloud Run Jobs.

Cloud Run Jobs are push-based and ephemeral:

  • Something triggers them → they spin up → execute → shut down
  • Each invocation pays the cold start tax
  • No shared state between executions

Temporal Workers are pull-based and persistent:

  • Workers run on your infrastructure, not Temporal's
  • They maintain a long-polling connection to Temporal for orchestration
  • They pull tasks when they have capacity
  • Workers stay warm, eliminating cold starts

This pull-based model is exactly what Google designed Cloud Run Worker Pools for. It's a resource type announced at Google Cloud Next '25 specifically for continuous, non-HTTP, pull-based background processing.
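The pull model is easy to sketch in plain Python: a worker loops, blocks until work exists, and processes tasks at its own pace. This is a toy illustration of the shape of the loop, not the Temporal SDK:

```python
import queue
import threading

tasks: "queue.Queue[str]" = queue.Queue()
results: list[str] = []

def worker_loop() -> None:
    # Pull-based: the worker asks for work when it has capacity,
    # instead of being invoked by an external trigger.
    while True:
        task = tasks.get()   # long-poll analogue: blocks until work exists
        if task == "STOP":   # sentinel to end the toy loop
            break
        results.append(f"done:{task}")

for t in ("index-a", "index-b", "index-c"):
    tasks.put(t)
tasks.put("STOP")

thread = threading.Thread(target=worker_loop)
thread.start()
thread.join()
print(results)
```

Because the loop is always running, there is nothing to cold-start: the next task begins the moment the previous one finishes.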

Enter Cloud Run worker pools

Worker pools solve a real problem for Temporal deployments. Unlike Cloud Run Services (designed for HTTP workloads) or Jobs (designed for batch tasks), Worker Pools are purpose-built for exactly what Temporal workers do: continuously pull tasks from a queue.

Why Worker Pools are perfect for Temporal:

  • No HTTP endpoint required: Workers just poll Temporal. No need to expose ports or manage health check endpoints
  • Lower total cost: No load balancer, no HTTP endpoint overhead, just compute
  • Reduced attack surface: No public URL means fewer security concerns
  • Instance splitting: Deploy canary releases by allocating percentages of instances to different revisions

The deployment is even simpler than Services:

gcloud beta run worker-pools deploy worker \
  --image gcr.io/my-project/worker:latest \
  --region europe-west1

Or with Terraform:

resource "google_cloud_run_v2_worker_pool" "worker" {
  name         = "temporal-worker"
  location     = "europe-west1"
  provider     = google-beta
  launch_stage = "BETA"

  scaling {
    scaling_mode       = "AUTOMATIC"
    min_instance_count = 1
    max_instance_count = 5
  }

  template {
    containers {
      image = "gcr.io/my-project/worker:latest"
      resources {
        limits = {
          cpu    = "1"
          memory = "1Gi"
        }
      }
    }
  }
}

No minScale hacks. No unused HTTP endpoints. Just a container that runs your worker code.

Note: Worker Pools are currently in public preview. For production workloads, you can still use Cloud Run Services with --min-instances 1. The architecture is identical, just with a bit more overhead.
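If you take the Services fallback, the one flag worth knowing is CPU allocation: a polling worker does its work outside of HTTP requests, so the CPU must stay allocated between them. A sketch of the equivalent deployment (flag names taken from the standard gcloud run deploy surface; verify against current docs):

```shell
gcloud run deploy worker \
  --image gcr.io/my-project/worker:latest \
  --min-instances 1 \
  --no-cpu-throttling \
  --region europe-west1
```

Note that Services still expect the container to listen on $PORT, so the worker needs a trivial health endpoint running alongside the poller.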

The deployment is just another container

Here's the mental shift: you're not deploying workflows anymore. You're deploying an application that happens to execute workflows.

FROM python:3.11-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt

COPY . .

CMD ["python", "-m", "worker"]

Your worker code:

# worker.py
import asyncio
import os

from temporalio.client import Client
from temporalio.worker import Worker

from workflows import (
    IndexingWorkflow,
    GoogleDriveWorkflow,
)
from activities import (
    fetch_connections,
    index_connection,
    sync_drive,
)

async def main():
    # Temporal Cloud API-key auth requires TLS and an explicit namespace.
    client = await Client.connect(
        "namespace.tmprl.cloud:7233",
        namespace="namespace",
        api_key=os.environ["TEMPORAL_API_KEY"],
        tls=True,
    )

    worker = Worker(
        client,
        task_queue="indexing-queue",
        workflows=[
            IndexingWorkflow,
            GoogleDriveWorkflow,
        ],
        activities=[
            fetch_connections,
            index_connection,
            sync_drive,
        ],
    )

    await worker.run()

if __name__ == "__main__":
    asyncio.run(main())

Deploy (with your API key stored in Secret Manager):

gcloud beta run worker-pools deploy worker \
  --image gcr.io/my-project/worker:latest \
  --min-instances 1 \
  --max-instances 5 \
  --memory 1Gi \
  --cpu 1 \
  --region europe-west1 \
  --set-secrets TEMPORAL_API_KEY=temporal-api-key:latest

That's the entire deployment. No Terraform for workflow definitions. No separate infrastructure repo. Just your application container with workflow logic baked in.

Cost reality check

Let's be honest about the trade-offs:

Before (Google Workflows + Cloud Run Jobs)

  • Cloud Workflows: ~$2-3/month
  • Cloud Run job invocations: ~$35/month
  • Cold start compute waste: ~$40-50/month
  • Total: ~$80-90/month

After (Temporal Cloud + Cloud Run Worker Pools)

  • Temporal Cloud starter: ~€100/month (orchestration and state only)
  • Worker Pool (2 instances, always-on): ~$18-24/month (your compute, no load balancer or endpoint overhead)
  • Total: ~$120-140/month

Yes, it costs more. But here's what you get:

  • Developer time saved: 2-3 hours/month not fighting YAML
  • Execution speed: 73-78% faster workflows (no cold starts)
  • Local testing: Full workflow debugging before deployment
  • Real observability: See workflow graphs, execution history, parent-child relationships in real-time

At a loaded developer cost of €80/hour, the ROI turns positive immediately.
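The break-even claim holds up against the numbers above, taking midpoints of the quoted ranges and treating the € and $ figures as roughly comparable for a rough cut:

```python
# Monthly cost delta, midpoints of the ranges quoted above.
before = (80 + 90) / 2           # Workflows + Jobs: ~$85/month
after = (120 + 140) / 2          # Temporal Cloud + Worker Pool: ~$130/month
extra_cost = after - before      # ~45/month extra spend

hours_saved = 2.5                # midpoint of 2-3 hours/month
rate = 80                        # loaded developer cost, EUR/hour
time_value = hours_saved * rate  # ~EUR 200/month recovered

print(f"Extra spend: ~{extra_cost:.0f}/month, time recovered: ~{time_value:.0f}/month")
```

Roughly four times the extra spend comes back in developer time alone, before counting the faster executions.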

What simplicity actually looks like

Kill a running workflow mid-execution. Restart the worker. The workflow resumes exactly where it left off.

That's durability you'd have to build yourself with Google Workflows: tracking state in Cloud Storage, implementing retries, handling partial failures. With Temporal, it's the default behavior.
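The retry machinery you would otherwise hand-roll follows a standard exponential backoff. Temporal's documented default retry policy (1 s initial interval, 2.0 backoff coefficient, interval capped at 100x the initial) produces a schedule like this pure-Python sketch:

```python
def backoff_schedule(attempts: int,
                     initial: float = 1.0,
                     coefficient: float = 2.0,
                     max_interval: float = 100.0) -> list[float]:
    # Wait before retry n: initial * coefficient**n, capped at max_interval.
    return [min(initial * coefficient ** n, max_interval) for n in range(attempts)]

print(backoff_schedule(9))
# [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0, 100.0, 100.0]
```

With Google Workflows, this schedule, plus the state tracking around it, was ours to build and maintain; with Temporal it ships as the default.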

Debug a failing activity with your IDE's debugger. Set breakpoints. Inspect state. Validate fixes locally before deploying.

This is what simplicity means: removing the gap between "I think this will work" and "I know this works."

Making the switch

The migration path isn't all-or-nothing:

  1. Spike it: Implement one workflow in Temporal, run both systems in parallel for a week
  2. Measure: Compare execution times, reliability, developer experience
  3. Dark launch: Run Temporal workflows in production, keep Google Workflows as fallback
  4. Gradual rollout: 10% → 50% → 100% with rollback ready
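The 10% → 50% → 100% gate can be a deterministic hash on a stable id, so a given workspace always routes the same way for a given percentage. A sketch (`zlib.crc32` chosen because Python's built-in `hash()` is salted per process and would not be sticky across restarts):

```python
import zlib

def routed_to_temporal(workspace_id: str, rollout_pct: int) -> bool:
    # Stable bucket in [0, 100) derived from the id: the same workspace
    # always lands in the same bucket, so the rollout is sticky.
    bucket = zlib.crc32(workspace_id.encode()) % 100
    return bucket < rollout_pct

ids = [f"ws-{i}" for i in range(1000)]
share = sum(routed_to_temporal(w, 10) for w in ids) / len(ids)
print(f"~{share:.0%} of workspaces on Temporal at a 10% rollout")
```

Raising `rollout_pct` only ever adds workspaces to the Temporal side; none flap back and forth between systems mid-rollout.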

We kept Google Workflows YAML in git history (never delete, just remove from deployment) and maintained the rollback capability for 30 days. We never needed it.

The simplicity of Temporal isn't in having fewer moving parts. It's in having the right moving parts. A persistent worker pool on Cloud Run, code-native workflow definitions, and a managed orchestration layer that handles the hard stuff.

No more YAML debugging. No more cold start delays. No more "deploy to test" cycles.

Just workflows that work.

original post: https://gbostoen.dev/blog/temporal-cloud-run/
