Darren "Dazbo" Lester for Google Developer Experts

Posted on • Originally published at Medium on

How to Avoid an Unexpected Cloud Bill — Fully Automated

Do These Headlines Scare You?

Student hit with a $55,444.78 Google Cloud bill after Gemini API key leaked on GitHub
Huge bill
How a 4-Hour Overnight Fine-Tune for My Boyfriend Became a £11,550 Google Cloud Bill — and What I Learned
A huge BQ bill
“Got hit with a €50,000 ($58,000) bill from BigQuery after 17 test queries”
Another BQ bill

They Terrify Me!

One tiny mistake and you could find yourself owing thousands to your cloud provider.

One tiny mistake

The examples above show various scenarios of how this could happen:

  • You accidentally leak an API key on GitHub, and someone steals it.
  • You run a bunch of SQL on Google BigQuery, but you have no idea that these queries might be costing thousands.
  • You run some fine tuning on a foundation model, not realising this could cost you thousands overnight.

And there are many other ways this could happen:

  • You set up an autoscaling Google Cloud Run, GKE or GCE service to run your Internet-facing application. You don’t expect much traffic. But then, your application goes viral and gets millions of hits.
  • Or perhaps some script kiddie decides to make you their victim, and sets up a botnet to attack your site.

All of these scenarios could result in a huge bill. Some might be a result of making a mistake; or not having enough knowledge; or not building in enough security. But not everyone is an expert. And even for those who are, one little mistake shouldn’t carry the potential of this kind of life-changing bill.

On Google Cloud, Setting a Budget Won’t Save You

Google Cloud Billing budgets are used to track your spend and to send you alerts when your spend exceeds thresholds set by your budget.

From the Google Cloud documentation:

Budgets let you track your actual Google Cloud costs against your planned costs. After you’ve set a budget amount, you set budget alert threshold rules that are used to trigger email notifications. Budget alert emails help you stay informed about how your spend is tracking against your budget. You can also use budgets to automate cost control responses.

But here’s the kicker: budgets are not a hard limit. When a budget is exceeded, Google Cloud will not stop you from consuming cloud resources.

In fact, there’s no simple out-of-the-box way to set hard limits on your projects, based on spend. But you can do it programmatically.

Here I’ll share a solution that you can freely download and use yourself with very little effort.

The Google Cloud Billing Kill-Switch

My solution architecture looks like this:

Billing Killswitch — Architecture

It works like this:

  1. You create your budget and budget alerts. This is where we’d typically set up alerts to trigger when, for example, you consume 50%, 90%, and 100% of your budget.
  2. You configure your budget alerts to send a notification to a Pub/Sub topic.
  3. A Cloud Run Function subscribes to this topic. When the message arrives on the topic, the function is triggered.
  4. The function reads the message. If the message says that the budget has been exceeded, then the function retrieves all projects associated with that budget.
  5. Finally, the function uses the Cloud Billing API to disconnect these projects from the billing account.

No connected billing account = no more cost!

Application Deployment

My repo includes instructions for how to deploy the application. The deployment process does the following:

  1. Enables any necessary APIs in the application-hosting project.
  2. Creates a Pub/Sub topic that budget alerts will be sent to.
  3. Allows Cloud Billing to publish to the topic.
  4. Creates a service account for the Cloud Run Function called cf-billing-killswitch-sa@<your-project>.iam.gserviceaccount.com, and then grants this service account the minimum set of roles it requires.
  5. Deploys the Cloud Run Function itself, which will run under the service account.

Deployment Decision: Centralised or Distributed?

The main decision you need to make is: where will you deploy the Pub/Sub topic and the Cloud Run Function?

If you only intend to use this mechanism with one or two projects (which I will call “monitored projects”) then it’s fine to deploy the components directly into each of those projects.

But once you get beyond a couple of monitored projects, I would strongly recommend creating a FinOps-Admin project, where you will host the topic and function. That way, you only need to deploy the application once. And you simply connect your monitored projects by setting up budget alerts to go to this central Pub/Sub topic.

Deployment Steps

These are covered in the repo README. Briefly, the process is as follows…

First, clone the repo:

git clone https://github.com/derailed-dash/gcp-billing-killswitch.git
cd gcp-billing-killswitch

Then set up your .env file by substituting the required parameters. For example, if you’ve created a FinOps project, you would substitute <your-hosting-project> with its project ID.

export PYTHONPATH="src"
export DEV_GOOGLE_CLOUD_PROJECT="<your-test-project>"
export GOOGLE_CLOUD_PROJECT="<your-hosting-project>"
export GOOGLE_CLOUD_REGION="<your-region>"
export FUNCTION_NAME="cf-billing-killswitch"
export BILLING_ALERT_TOPIC="budget-alerts"
export BILLING_ACCOUNT_ID="<your-billing-account>"
export LOG_LEVEL="INFO" # Set to DEBUG for development
export SIMULATE_DEACTIVATION="false" # Set to true for simulation mode

# For testing purposes, you can find a budget ID like this:
# gcloud billing budgets list --billing-account=$BILLING_ACCOUNT_ID --project=$GOOGLE_CLOUD_PROJECT
export SAMPLE_BUDGET_ID="<your-test-budget-id>"

Then go ahead and run the deployment commands. The final step is the creation of the Cloud Run Function itself:

# Deploy the Cloud Run Function
gcloud functions deploy "$FUNCTION_NAME" \
  --gen2 \
  --runtime=python312 \
  --project="$GOOGLE_CLOUD_PROJECT" \
  --region="$GOOGLE_CLOUD_REGION" \
  --source=./src \
  --entry-point=disable_billing_for_projects \
  --trigger-topic="$BILLING_ALERT_TOPIC" \
  --service-account="${SERVICE_ACCOUNT_EMAIL}" \
  --set-env-vars LOG_LEVEL=$LOG_LEVEL,SIMULATE_DEACTIVATION=$SIMULATE_DEACTIVATION

We can review the deployed function in the Cloud Console:

Deployed Cloud Run Function

Since this is a 2nd generation Cloud Run Function, using a Pub/Sub trigger automatically results in the creation of an Eventarc trigger and an associated push subscription to forward messages from the topic to our function.

Eventarc subscription

If you’re interested in the details, the subscription is set up with the following parameters, which you could change if you want:

  • Message retention: 1 day. This is how long the subscription will keep a message that has not been acknowledged, before binning it.
  • Acknowledgement deadline: 600 seconds. This means that if the message is not acknowledged within 10 minutes, the subscription will retry invoking the function, based on the retry policy.
  • Retry policy: retry after exponential delay with maximum of 600 seconds. This means that the subscription will keep retrying, and the delay between retries grows with each retry until it reaches the maximum of 10 minutes. It will then retry every 10 minutes.
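
To make the retry behaviour concrete, here is a small sketch of a capped exponential backoff schedule. The 10-second starting delay and the doubling factor are my assumptions for illustration. Only the 600-second maximum comes from the subscription settings above, and the delays Pub/Sub actually uses may differ.

```python
def backoff_schedule(
    retries: int,
    min_delay: float = 10.0,   # assumed starting delay, in seconds
    max_delay: float = 600.0,  # maximum delay from the subscription settings
    factor: float = 2.0,       # assumed growth factor
) -> list[float]:
    """Return the delay in seconds before each retry, capped at max_delay."""
    delays = []
    delay = min_delay
    for _ in range(retries):
        delays.append(min(delay, max_delay))
        delay *= factor
    return delays


if __name__ == "__main__":
    # Delays grow with each retry until they hit the 600-second cap
    print(backoff_schedule(8))
```

Under these assumptions the delay reaches the cap by the seventh retry and stays there.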

Important: Simulate Mode

When you deploy the function, you pass an environment variable called SIMULATE_DEACTIVATION. Set this to true if you want to test the function without actually disabling billing for your project.
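
The flag is read as a plain string comparison. Here is a minimal sketch of that check, mirroring the expression used in the function code. The helper name simulation_enabled is hypothetical and exists only to make the behaviour easy to try out.

```python
import os


def simulation_enabled(env=os.environ) -> bool:
    """True only when SIMULATE_DEACTIVATION is 'true' (case-insensitive)."""
    return env.get("SIMULATE_DEACTIVATION", "false").lower() == "true"


if __name__ == "__main__":
    print(simulation_enabled({"SIMULATE_DEACTIVATION": "True"}))
    print(simulation_enabled({}))
```

Note the consequences of this check: if the variable is unset, or set to something like "yes" or "1", the function treats simulation mode as off and will disable billing for real.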

Creating Your Budget Alerts

This is where we configure Google Cloud to send alerts when we reach spend thresholds. We do this from within the Google Cloud Console.

Navigate to Billing → Budgets & Alerts.

Budgets & Alerts

From here, click on Create budget. Give your budget a meaningful name, and select the project(s) it will apply to. My recommendation is that each budget should only apply to one or a few related projects. (You can set up as many budgets as you like.)

Creating a new budget

Next, set the budget amount, and then the threshold rules, e.g.

Defining alert thresholds

Finally, define what happens when these thresholds are exceeded. The crucial part for our Cloud Run Function is that the messages MUST be sent to a Pub/Sub topic:

Sending notifications to our topic

Testing

For testing purposes, I’m starting by deploying the topic and function to a single project that will also be the monitored project. Now I can test the function by sending a fake budget alert message.

I’ve included a budget alert template in the repo, tests/budget_alert.json.template:

{
  "budgetDisplayName": "My Test Budget",
  "costAmount": 120.0,
  "budgetAmount": 100.0,
  "costIntervalStart": "2025-09-01T00:00:00Z",
  "alertThresholdExceeded": 1.0,
  "currencyCode": "GBP",
  "projectNumber": "TEST_PROJECT_NUMBER"
}

This command will create a test message from the template provided, substituting values from your environment variables:

export TEST_PROJECT_NUMBER=$(gcloud projects describe $DEV_GOOGLE_CLOUD_PROJECT --format="value(projectNumber)")

# CREATE TEST MSG by replacing placeholders in the template using values from env vars
sed "s/TEST_PROJECT_NUMBER/${TEST_PROJECT_NUMBER}/g" tests/budget_alert.json.template > tests/budget_alert.json
msg=$(cat tests/budget_alert.json)
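
If you prefer to build the test message without sed, the same substitution can be sketched in Python. The template text is copied from the one shown above. The function name render_budget_alert and the project number 123456789012 are made up for illustration.

```python
import json

# Same content as tests/budget_alert.json.template
TEMPLATE = """{
  "budgetDisplayName": "My Test Budget",
  "costAmount": 120.0,
  "budgetAmount": 100.0,
  "costIntervalStart": "2025-09-01T00:00:00Z",
  "alertThresholdExceeded": 1.0,
  "currencyCode": "GBP",
  "projectNumber": "TEST_PROJECT_NUMBER"
}"""


def render_budget_alert(template: str, project_number: str) -> str:
    """Replace the placeholder and confirm the result is still valid JSON."""
    rendered = template.replace("TEST_PROJECT_NUMBER", project_number)
    json.loads(rendered)  # raises ValueError if the substitution broke the JSON
    return rendered


if __name__ == "__main__":
    # Hypothetical project number; use your own test project's number
    print(render_budget_alert(TEMPLATE, "123456789012"))
```

Parsing the result before sending it catches a broken template early, rather than at the point where the function tries to decode the message.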

Now retrieve the budget ID for the budget we created earlier:

Get the budget ID

Important: if SIMULATE_DEACTIVATION is not set to true, sending this message will disconnect your project from its billing account!

Use the budget ID in the following command, which will send the test budget alert to your topic:

export SAMPLE_BUDGET_ID="<budget_id>"

gcloud pubsub topics publish $BILLING_ALERT_TOPIC \
  --project="$GOOGLE_CLOUD_PROJECT" \
  --message="$msg" \
  --attribute="budgetId=$SAMPLE_BUDGET_ID,billingAccountId=$BILLING_ACCOUNT_ID"

Now we can check that the function has triggered. Navigate to Cloud Run Functions in the Console:

Cloud Run Functions

Click on the service:

Service

Open the Logs view:

Logging

It works! Now it’s time to test with simulation mode turned off. Update the SIMULATE_DEACTIVATION environment variable to false and then redeploy the Cloud Run Function. You should now have a new revision that looks like this:

Simulation mode now turned off

Let’s send our test message again:

gcloud pubsub topics publish $BILLING_ALERT_TOPIC \
  --project="$GOOGLE_CLOUD_PROJECT" \
  --message="$msg" \
  --attribute="budgetId=$SAMPLE_BUDGET_ID,billingAccountId=$BILLING_ACCOUNT_ID"

And now I try to look at the logs for my function and…

No permission to view this project

And that’s because…

No billing account

RESULT! The billing account has been unlinked.

High Five!

Of course, we can still look at the logs in Google Cloud Logging:

Logs showing billing disabled

Re-Enabling Billing

From here I can easily re-enable billing by clicking on Link a billing account, and reattaching my billing account.
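
If you would rather script the re-link than click through the Console, something like the following sketch should work. It assumes the google-cloud-billing package is installed and that the caller has permission to link the project to the billing account. The helper names and the placeholder IDs are hypothetical. The API call is the same update_project_billing_info used in the function code, this time with a non-empty billing account name.

```python
def billing_account_resource(billing_account_id: str) -> str:
    """Build the resource name the Cloud Billing API expects."""
    return f"billingAccounts/{billing_account_id}"


def relink_billing(project_id: str, billing_account_id: str) -> None:
    """Reattach a project to a billing account (the reverse of the kill-switch)."""
    # Imported here so the helper above can be used without the library installed
    from google.cloud import billing_v1

    client = billing_v1.CloudBillingClient()
    info = billing_v1.ProjectBillingInfo(
        billing_account_name=billing_account_resource(billing_account_id)
    )
    client.update_project_billing_info(
        name=f"projects/{project_id}", project_billing_info=info
    )


if __name__ == "__main__":
    # Placeholder ID; prints the resource name without calling the API
    print(billing_account_resource("ABCDEF-123456-ABCDEF"))
```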

Interested in the Function Code?

Here it is:

"""
This Google Cloud Function is designed to automatically disable billing for any Google Cloud projects associated with an exceeded budget.

It is triggered by a Pub/Sub message, which is published by a Cloud Billing budget alert.
When a project's spending exceeds a defined threshold, the alert is sent, and this function is invoked.
The function parses the incoming Pub/Sub message to identify the associated project(s) and then uses the Cloud Billing API to detach the project from its billing account, effectively disabling billing.

The Pub/Sub message is expected to have the following format:
- **Message Payload (JSON):**
    - `costAmount` (float): The amount of cost that has been incurred.
    - `budgetAmount` (float): The budgeted amount.
- **Message Attributes:**
    - `billingAccountId` (str): The ID of the billing account.
    - `budgetId` (str): The ID of the budget.

**⚠️ Warning: This is a destructive action.**
Disconnecting a project from its billing account will stop all paid services.
"""
import base64
import json
import logging
import os
import functions_framework
import google.cloud.logging
from cloudevents.http.event import CloudEvent
from google.api_core import exceptions
from google.cloud import billing_v1
from google.cloud.billing.budgets_v1 import BudgetServiceClient

log_level = os.environ.get("LOG_LEVEL", "INFO").upper()
log_level_num = getattr(logging, log_level, logging.INFO)

# Configure a Cloud Logging handler and integrate it with Python's logging module
logging_client = google.cloud.logging.Client()
logging_client.setup_logging(log_level=log_level_num)

app_name = "billing-killswitch"
logger = logging_client.logger(app_name)

billing_client = billing_v1.CloudBillingClient()
budget_client = BudgetServiceClient()

@functions_framework.cloud_event
def disable_billing_for_projects(cloud_event: CloudEvent):
    """
    Cloud Function to disable billing for projects based on a Pub/Sub message from a billing alert.
    """
    logging.debug(f"Function {app_name} invoked from Pub/Sub message.")

    # The Pub/Sub message is base64-encoded
    message_data = base64.b64decode(cloud_event.data["message"]["data"]).decode("utf-8")
    message_json = json.loads(message_data)
    attributes = cloud_event.data["message"]["attributes"]

    logging.debug(f"Pub/Sub message attributes: {attributes}")
    logging.debug(f"Pub/Sub message data: {message_data}")

    budget_name = message_json["budgetDisplayName"]
    cost_amount = message_json["costAmount"]
    budget_amount = message_json["budgetAmount"]

    # Only disable billing if the cost has exceeded the budget
    if cost_amount <= budget_amount:
        logging.info(f"Function {app_name}, {budget_name}: "
                     f"{cost_amount} has not exceeded budget {budget_amount}. No action taken.")
        return

    # Get the budget ID and billing_account_id from the message attributes
    budget_id = attributes.get("budgetId", "")
    billing_account_id = attributes.get("billingAccountId", "")

    if not billing_account_id:
        logging.error(f"Function {app_name}: No billingAccountId found in message attributes.")
        return

    if not budget_id:
        logging.error(f"Function {app_name}: No budgetId found in message attributes.")
        return

    logging.info(f"Function {app_name}, {budget_name}: {cost_amount} has exceeded budget {budget_amount}.")

    try:
        # Use the budget ID to get the budget details
        full_budget_name = f"billingAccounts/{billing_account_id}/budgets/{budget_id}"
        budget = budget_client.get_budget(name=full_budget_name)
    except Exception as e:
        logging.error(f"Function {app_name}: Error getting budget details: {e}")
        return

    # The budget filter contains the projects the budget is scoped to
    if not budget.budget_filter or not budget.budget_filter.projects:
        logging.warning(f"Function {app_name}: {budget_name} is not scoped to any projects. No action taken.")
        return

    # Get all projects associated with this budget
    project_ids = [p.split("/")[1] for p in budget.budget_filter.projects]

    for project_id in project_ids:
        project_name = f"projects/{project_id}"
        if _is_billing_enabled_for_project(project_name):
            # billing might already be disabled
            logging.info(f"Function {app_name}, {budget_name}: Disabling billing for {project_id}...")

            # Check for simulation mode
            simulate_deactivation = os.getenv("SIMULATE_DEACTIVATION", "false").lower() == "true"
            if simulate_deactivation:
                logging.info(f"SIMULATION MODE: Billing would have been disabled for project {project_id} for budget {budget_name}.")
            else:
                _disable_billing_for_project(project_name)
        else:
            logging.info(f"Function {app_name}, {budget_name}: Billing is already disabled for project {project_id}.")

def _is_billing_enabled_for_project(project_name: str) -> bool:
    """Determine whether billing is enabled for a project.

    Args:
        project_name: Project to check, with the format 'projects/<project_id>'.

    Returns:
        Whether project has billing enabled or not.
        In the event that an exception occurs when retrieving the billing info,
        assume that this is a result of billing already being disabled.
    """
    try:
        logging.debug(f"Function {app_name}: Getting billing info for project '{project_name}'...")
        response = billing_client.get_project_billing_info(name=project_name)
        return response.billing_enabled
    except Exception as e:
        logging.warning(f"Function {app_name}: Unable to get billing info for project {project_name}. "
                        f"This could happen if the project is already disconnected from the billing account. "
                        f"Assuming billing is disabled. "
                        f"Error message: {e}")
        return False

def _disable_billing_for_project(project_name: str) -> None:
    """Disable billing for a project by removing its billing account.

    Args:
        project_name: Project to disable billing for, with the format 'projects/<project_id>'.
    """
    # Find more information about `updateBillingInfo` API method here:
    # https://cloud.google.com/billing/docs/reference/rest/v1/projects/updateBillingInfo
    try:
        # To disable billing set the `billing_account_name` field to empty
        project_billing_info = billing_v1.ProjectBillingInfo(billing_account_name="")
        billing_client.update_project_billing_info(name=project_name, project_billing_info=project_billing_info)
        logging.info(f"Function {app_name}: Successfully disabled billing for project {project_name}")
    except exceptions.PermissionDenied as e:
        logging.error(f"Function {app_name}: Failed to disable billing for {project_name}, check permissions: {e}")
    except Exception as e:
        logging.error(f"Function {app_name}: Error disabling billing for project {project_name}: {e}")

It’s pretty self-explanatory. I start by setting up the Cloud Logging client, passing in the logging threshold from my environment variable. (By default, the client will log everything at INFO and above.)

The function receives the billing alert message in the CloudEvent payload. The message data is base64-encoded JSON. We decode it and extract the budget limit and current spend. We then extract the budget ID and billing account ID from the message attributes.
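
To see the shape of that payload without deploying anything, here is a small sketch that builds the same nested structure the function reads from cloud_event.data, and then decodes it again. The helper names and the budget ID value are made up for illustration.

```python
import base64
import json


def make_event_data(payload: dict, attributes: dict) -> dict:
    """Build the dict shape the function expects in cloud_event.data."""
    encoded = base64.b64encode(json.dumps(payload).encode("utf-8")).decode("ascii")
    return {"message": {"data": encoded, "attributes": attributes}}


def parse_event_data(data: dict) -> tuple[dict, dict]:
    """Decode the base64 JSON body and return (payload, attributes)."""
    raw = base64.b64decode(data["message"]["data"]).decode("utf-8")
    return json.loads(raw), data["message"].get("attributes", {})


if __name__ == "__main__":
    event_data = make_event_data(
        {"budgetDisplayName": "My Test Budget", "costAmount": 120.0, "budgetAmount": 100.0},
        {"budgetId": "example-budget-id", "billingAccountId": "ABCDEF-123456-ABCDEF"},
    )
    payload, attributes = parse_event_data(event_data)
    print(payload["costAmount"], attributes["budgetId"])
```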

Remember that a budget alert can be triggered at different thresholds. We only need to disconnect projects if the current spend exceeds the budget. If so, we extract all the projects associated with this budget, iterate over them, and for each: disconnect the billing account. Simple!
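
The two decisions in that paragraph can be isolated as pure functions, which makes them easy to reason about. This is a sketch of the logic in the function code, not a replacement for it. The function names are hypothetical.

```python
def budget_exceeded(cost_amount: float, budget_amount: float) -> bool:
    """Mirror the function's check: act only when spend is strictly above budget."""
    return cost_amount > budget_amount


def project_ids_from_filter(projects: list[str]) -> list[str]:
    """Strip the 'projects/' prefix from each entry in the budget filter."""
    return [p.split("/")[1] for p in projects]


if __name__ == "__main__":
    print(budget_exceeded(50.0, 100.0))    # a 50% alert: no action
    print(budget_exceeded(120.0, 100.0))   # over budget: disconnect
    print(project_ids_from_filter(["projects/123", "projects/456"]))
```

One detail worth noticing: because the comparison is strict, an alert where spend is exactly equal to the budget does not disconnect anything. Billing is only disabled once the reported cost is above the budget amount.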

One Last Word of Caution

Budget alerts do not fire immediately after a spend threshold is crossed. Typically a budget alert will fire within about 20 minutes of a threshold being crossed, but it can sometimes take longer.
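
This delay means the kill-switch limits the damage rather than capping spend exactly at the budget. As a rough back-of-envelope sketch, you can estimate how much extra spend accrues while waiting for the alert. The spend rate below is a made-up figure, and the estimate ignores any further lag in cost reporting.

```python
def overspend_during_delay(spend_per_hour: float, delay_minutes: float) -> float:
    """Estimate spend accrued between crossing the threshold and the alert firing."""
    return spend_per_hour * delay_minutes / 60.0


if __name__ == "__main__":
    # Hypothetical runaway workload costing 90 per hour, with a 20-minute alert delay
    print(overspend_during_delay(90.0, 20.0))
```

With those hypothetical numbers, roughly 30 of additional spend accrues before the function is even triggered, so it is worth setting the budget somewhat below the amount you can truly tolerate.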

How to Generate Real Spend

To test the function using real spend to drive an alert, I needed a way to create spend. I’ve found that one of the fastest ways to do this is to create a couple of Veo3 videos from Vertex AI. For example, generating this video in Veo3 Flash costs about $0.60:

Generated with Veo3

Alternative Ways to Deploy the Cloud Run Function

Earlier I showed you how to deploy the Cloud Run Function with the gcloud functions deploy command, which implicitly sets up the Eventarc trigger. But with the rebranding of Cloud Functions to Cloud Run Functions, Google are transitioning towards treating gcloud run deploy as the primary deployment tool for Cloud Run and Cloud Run Functions.

If we want to use gcloud run deploy, the deployment process now looks like this:

# As before, make sure the service account variable is defined
export SERVICE_ACCOUNT_NAME="${FUNCTION_NAME}-sa"
export SERVICE_ACCOUNT_EMAIL="${SERVICE_ACCOUNT_NAME}@${GOOGLE_CLOUD_PROJECT}.iam.gserviceaccount.com"

# Create the Cloud Run Function
# Re-run this command for any changes to your function
gcloud run deploy $FUNCTION_NAME \
  --base-image=python312 \
  --project=$GOOGLE_CLOUD_PROJECT \
  --region=$GOOGLE_CLOUD_REGION \
  --source=./src \
  --function=disable_billing_for_projects \
  --no-allow-unauthenticated \
  --execution-environment=gen1 \
  --cpu=0.2 \
  --memory=256Mi \
  --concurrency=1 \
  --max-instances=1 \
  --service-account="${SERVICE_ACCOUNT_EMAIL}" \
  --set-env-vars LOG_LEVEL=$LOG_LEVEL,SIMULATE_DEACTIVATION=$SIMULATE_DEACTIVATION

# Create the Eventarc trigger, wiring the topic to the function
gcloud eventarc triggers create ${FUNCTION_NAME}-trigger \
  --project=$GOOGLE_CLOUD_PROJECT \
  --location=$GOOGLE_CLOUD_REGION \
  --destination-run-service=$FUNCTION_NAME \
  --destination-run-region=$GOOGLE_CLOUD_REGION \
  --event-filters="type=google.cloud.pubsub.topic.v1.messagePublished" \
  --transport-topic=projects/$GOOGLE_CLOUD_PROJECT/topics/$BILLING_ALERT_TOPIC \
  --service-account=$SERVICE_ACCOUNT_EMAIL

A few things to note about my Cloud Run Function deploy command:

  • By default, the gcloud run deploy command will deploy a second generation function. But our function is better suited to the first generation execution environment, i.e. it is a very small function that frequently spins up from 0. So I’ve set --execution-environment=gen1.
  • To make this a function, we use the parameter --function and pass it the function entrypoint.
  • A second generation function has a minimum CPU allocation of 1 vCPU. With 1st gen we can allocate fractional CPUs. I’ve allocated 0.2 CPUs here. Note that in order to specify fractional CPUs, you also need to set the concurrency to 1.
  • A second generation function has a minimum memory allocation of 512MB. With 1st gen we can go smaller. Here I’ve allocated 256MB.

Wrap Up

I hope that was useful and perhaps helps you sleep easier as you consume those Google Cloud resources!

If you use this killswitch, I’d love to hear about it. Please add a star to the GitHub repo. Feel free to raise issues and to contribute to the repo.

See you next time!

You Know What To Do!

  • Please share this with anyone that you think will be interested. It might help them, and it really helps me!
  • Please give me claps! (Just hold down the clap button.)
  • Feel free to leave a comment 💬.
  • Follow and subscribe, so you don’t miss my content.
