Scaling down periodic tasks on Google Compute Engine.

DrBearhands ・8 min read

In this tutorial I'm going to explain how to run any periodic or event-driven task on Google Cloud Platform, without having to pay for resources while your task is not running. As the title says, I've settled on using Compute Engine for this. This tutorial can also serve as an introduction to Compute Engine and Google Cloud Platform in general.

Motivation (Optional)

I'm building a website (jobsort) that scrapes software engineering vacancies from a bunch of job boards and derives symbolic information from them using AI. While the webserver needs to be always-on, the scraper and AI systems are only used periodically. I want to avoid waste, out of both financial and moral concerns, so I want to scale this service down to 0 when it's not in use.

I found scaling down resources when a task is not running to be much harder and more time-consuming to learn than I expected, so I decided to share my findings by writing this tutorial.

Available tools (Optional)

Before diving into specifics I'd like to list the relevant tools available on Google Cloud Platform. If you're familiar with GCP you can skip this section.

  • App Engine Standard: a light-weight runtime that lets you scale up and down dramatically. It even scales down to 0, relaunching your app when requests come in. By itself, App Engine Standard was far too limiting for my requirements (no native libraries, high memory usage incorrectly identified as memory leaks, restricted programming language support...).
  • App Engine Flexible: essentially a few quality of life improvements over perpetually running a single container. Unfortunately it cannot scale down past 1 instance.
  • Kubernetes: Runs containers. Kubernetes can scale up and down automatically but requires at least one active node to run the auto-scaler. That said, if you're already using Kubernetes and paying the one-active-node price, Kubernetes with auto-scaling should work fine for you.
  • Compute Engine: Runs VMs. Compute Engine cannot really scale by itself but has a decent API that allows you to control your instances directly.
  • Cloud Functions: Lets you run specific functions rather than a whole webserver. Cloud Functions is too limiting for my requirements and I expect it will be for many others as well.

Furthermore, we can create Cron Jobs that call URLs on a schedule. To use them, we need an app that can listen for HTTP traffic while scaled down to 0 (sort of; we're just letting Google handle it for us).

I will assume that the task you wish to run is non-trivial and requires a bit more than App Engine Standard or Cloud Functions can offer by themselves. If not, just go with one of those.

Overview

To get around the limitations mentioned previously, we'll create a system that does the following:

  1. Call a task starter on App Engine Standard using a Cron Job. You could also use some other starting trigger but that won't be covered by this tutorial.
  2. Have the task starter start a new VM instance in Compute Engine.
  3. On startup, the VM instance will perform the desired task and shut itself down afterward.

That's the short of it. In the following sections I'll go into specifics and solutions to some problems you might run into.

I will link to Google's own documentation both for further reference and in case this information becomes outdated without me realizing it.

Step 1: Creating a VM Instance

First, we're going to create the VM Instance that runs our task.

Assuming you have selected the right project, enabled billing and enabled the Compute Engine API (see the quickstart if that's not the case), go to the VM instances page and create a new instance.

Pick whatever name, region, zone, machine type and boot disk makes sense for your situation. I'll be using a Linux VM (Debian GNU/Linux 9) in this tutorial. You can pick whichever OS you want but some parts of this tutorial will be OS-specific and you will have to adapt them to your OS of choice. You will not need incoming http(s) traffic for this tutorial, provided your task does not require it.

Step 2: Set up your VM

Google's quickstart now suggests SSH-ing into the VM. Before you do, however, there is a caveat you should know about, detailed in this Stack Overflow question. Essentially, connecting through the GCP console will create a user using your GCP account name, while connecting through your machine's terminal (e.g. using gcloud compute ssh) will use your machine's username. This makes sense, as each process uses whatever information is available to it, but not realizing how this works can lead to puzzling errors later on.

[Image: the SSH button in the GCP console]

That said, we're going to click the nice shiny SSH button on the GCP console anyway for this tutorial. In the window that pops up you can issue commands as you would on a terminal and the icon in the top-right corner lets you upload files/directories to your home directory. For more options see this page about file-transfers to VMs and the gcloud compute ssh documentation.

You should now set up your system so it can run the task you need it to. Give it a test run while ssh-ed into your VM to ensure everything works.

Step 3: Starting up and shutting down

Now that your VM instance is set up, it's time to create the startup script that will execute your task on startup. Click on your instance from the instances list in the Console and click EDIT near the top. Under the Custom metadata section, add a key called "startup-script" (no quotes). In the value textarea, write a script that will run your task. For Debian, your startup script might look something like this:

#! /bin/bash
sudo su <USERNAME> <<EOF
set -e
cd ~/<PROJECT DIRECTORY>
./<RUN_TASK.sh>
EOF
sudo shutdown now -h

Of course, all upper-case names within <> should be replaced with your project's details.

The startup script will be executed as root, rather than as the user with which you ssh-ed into the VM. This is why we execute the task as another user, using sudo su <USERNAME> <<EOF ... EOF. For more information on that specific command see this Stack Overflow question.

The sudo shutdown now -h line will shut down the machine after the task has been completed.

If you've previously chosen a different OS you might need to use a different command to shut down the instance; listing them all is beyond the scope of this tutorial.
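To make the placeholder substitution concrete, here is a minimal Go sketch that fills the three placeholders of the Debian script above and prints the result, ready to paste into the startup-script metadata field. The user, directory and script names in main are made up for illustration, and the helper names (scriptParams, renderStartupScript) are my own, not part of any GCP tooling.

```go
package main

import (
	"fmt"
	"os"
	"strings"
	"text/template"
)

// startupScript mirrors the Debian startup script from this tutorial,
// with the <...> placeholders turned into template fields.
const startupScript = `#! /bin/bash
sudo su {{.User}} <<EOF
set -e
cd ~/{{.ProjectDir}}
./{{.TaskScript}}
EOF
sudo shutdown now -h
`

// scriptParams holds the three placeholders (hypothetical helper type).
type scriptParams struct {
	User       string // the user you set the VM up with
	ProjectDir string // directory under that user's home
	TaskScript string // script that runs the task
}

// renderStartupScript fills in the placeholders and returns the script text.
func renderStartupScript(p scriptParams) (string, error) {
	tmpl, err := template.New("startup").Parse(startupScript)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, p); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	// All three values below are invented examples.
	script, err := renderStartupScript(scriptParams{
		User:       "scraper",
		ProjectDir: "jobsort-scraper",
		TaskScript: "run_task.sh",
	})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Print(script)
}
```

Note that set -e only applies inside the heredoc: if the task fails, the inner shell stops, but the outer script still reaches the shutdown line, so a failed run should not leave the VM running.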

Step 4: Starting the task on a schedule

The VM will now execute the task and shut itself down whenever it starts. We will now use App Engine Standard to start the VM based on Cron Job http requests.

The App Engine Standard app is just 2 files you need to copy, replacing names and messages to be more specific for your project. You can change filenames as you wish, so long as they reside in the same directory. Note that there should either be no other files in the same directory or you should use a .cloudignore file. Otherwise you might end up accidentally uploading things you don't want uploaded.

The file runner.go has the following code:

package main

import (
  "fmt"
  "net/http"
  "google.golang.org/appengine"
  "google.golang.org/api/compute/v1"
  "golang.org/x/oauth2/google"
)

func main() {
  http.HandleFunc("/", start_task)
  appengine.Main()
}

func start_task(w http.ResponseWriter, r *http.Request) {
  if r.Header.Get("X-Appengine-Cron") == "true" {
    ctx := appengine.NewContext(r)

    c, err := google.DefaultClient(ctx, compute.CloudPlatformScope)
    if err != nil {
      http.Error(w, fmt.Sprintf("Could not create OAuth2 client: %s", err), 500)
      return
    }

    computeService, err := compute.New(c)
    if err != nil {
      http.Error(w, fmt.Sprintf("Could not create compute service: %s", err), 500)
      return
    }

    // Replace the placeholders but keep the quotes: these must be Go strings.
    project := "<PROJECT_ID>"
    zone := "<PROJECT_ZONE>"
    instance := "<VM_INSTANCE_NAME>"

    if _, err := computeService.Instances.Start(project, zone, instance).Context(ctx).Do(); err != nil {
      http.Error(w, fmt.Sprintf("Failed to start instance: %s", err), 500)
      return
    }

    w.Write([]byte("Task started!"))
  }
}

Replace <PROJECT_ID>, <PROJECT_ZONE> and <VM_INSTANCE_NAME> with their corresponding values.
This file just creates a server that listens for HTTP requests originating from Cron Jobs and starts the VM when it receives one. As background knowledge, there are references about the compute API in general (which I wouldn't advise reading start-to-finish) and about starting an instance specifically (scroll down for the library used in the code above).
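One detail of runner.go worth knowing: a request without the X-Appengine-Cron header falls through the if and gets an empty 200 response. If you would rather reject such requests explicitly, the check can be pulled into a small wrapper. The sketch below uses only the standard library; cronOnly, fakeStart and statusFor are names I made up, and fakeStart merely stands in for the code that starts the VM.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// cronOnly wraps a handler and answers 403 Forbidden to any request that
// does not carry the X-Appengine-Cron header.
func cronOnly(next http.HandlerFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("X-Appengine-Cron") != "true" {
			http.Error(w, "Forbidden: not a Cron Job request", http.StatusForbidden)
			return
		}
		next(w, r)
	}
}

// fakeStart is a hypothetical stand-in for the handler that starts the VM.
func fakeStart(w http.ResponseWriter, r *http.Request) {
	w.Write([]byte("Task started!"))
}

// statusFor sends one in-memory request through the guarded handler and
// returns the resulting status code.
func statusFor(withCronHeader bool) int {
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	if withCronHeader {
		req.Header.Set("X-Appengine-Cron", "true")
	}
	rec := httptest.NewRecorder()
	cronOnly(fakeStart)(rec, req)
	return rec.Code
}

func main() {
	fmt.Println("without header:", statusFor(false)) // prints "without header: 403"
	fmt.Println("with header:", statusFor(true))     // prints "with header: 200"
}
```

In runner.go this would amount to registering cronOnly(start_task) instead of start_task. The httptest calls in statusFor also show one way to exercise the handler with the header set, without touching the conditional itself.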

runner.yaml contains the following:

runtime: go
api_version: go1
service: <MY SERVICE NAME>

handlers:
  - url: .*
    script: _go_app

Replace <MY SERVICE NAME> with whatever you wish to call this service, but remember that this service is not your task, it merely starts your task.

This file just tells gcloud that this is a Go service to be deployed in App Engine Standard and how requests to this service should be handled.
There's a reference for app.yaml files if you need more information about what this file does.

Provided you have installed the required development tools, you can test the service by temporarily commenting out the if r.Header.Get("X-Appengine-Cron") == "true" conditional, running dev_appserver.py runner.yaml and browsing to localhost:8080. This should now start your VM instance. Remember to uncomment the conditional afterwards: the header check is what keeps any old troll from starting your VM whenever they feel like it (Google strips the header from requests that originate outside your project). If everything works, deploy the runner:

gcloud app deploy runner.yaml

Step 5: Creating a Cron Job

Just one more thing to do: creating the Cron Job that calls the app you just deployed to App Engine Standard.
Create a cron.yaml file that looks like this:

cron:
- description: "<JOB DESCRIPTION>"
  url: /
  target: <MY SERVICE NAME>
  schedule: every 24 hours

The target should be the name you specified for your service inside runner.yaml.

Note that you have one cron.yaml file for your entire project, so ensure you're appending to any that already exist rather than creating a new one, or you might end up deleting all running Cron Jobs.
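If you end up with more than one Cron Job in that file, one possible way to let a single runner service start different VMs is to give each job its own url and look the instance up from the request path. The sketch below is only an illustration under that assumption: the paths and instance names are invented, and the actual Instances.Start call from runner.go is left out so the example stays standard-library only.

```go
package main

import (
	"fmt"
	"net/http"
)

// instanceByPath maps each Cron Job url to the VM it should start.
// Both the paths and the instance names are hypothetical.
var instanceByPath = map[string]string{
	"/start-scraper": "scraper-vm",
	"/start-ai":      "ai-vm",
}

// instanceFor returns the instance name registered for a request path.
func instanceFor(path string) (string, bool) {
	name, ok := instanceByPath[path]
	return name, ok
}

// startTask picks the VM from the request path. In the real runner, the
// Instances.Start call would replace the Fprintf line.
func startTask(w http.ResponseWriter, r *http.Request) {
	if r.Header.Get("X-Appengine-Cron") != "true" {
		http.Error(w, "Forbidden", http.StatusForbidden)
		return
	}
	instance, ok := instanceFor(r.URL.Path)
	if !ok {
		http.NotFound(w, r)
		return
	}
	fmt.Fprintf(w, "Would start %s\n", instance)
}

func main() {
	// Show which instance each path resolves to.
	for _, path := range []string{"/start-scraper", "/start-ai", "/other"} {
		name, ok := instanceFor(path)
		fmt.Printf("%s -> %q (known: %v)\n", path, name, ok)
	}
}
```

Each cron.yaml entry would then set its url to one of these paths while keeping the same target. Since runner.yaml routes every url to the Go app, no extra handler entries should be needed there.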

Now you just need to deploy the cron file:

gcloud app deploy cron.yaml

After the deployment is completed, find your Cron Job in the console and test it.

That's it! Your task will now be executed on a regular schedule without consuming resources unnecessarily.

Conclusion

It can take some effort to run a task on GCP without consuming resources unnecessarily. I personally believe this should be made easier. It took me nearly 3 days to figure everything out, and that was after various dead ends with Kubernetes and App Engine Standard/Flexible. Hopefully this tutorial made it much easier for you.

Discussion (4)

Achraf Boussaada

In my case I need to deploy two cron jobs, but I couldn't figure out how to set the url in cron.yaml and the handlers in the other file correctly. Also, I'm using a bucket to store the script that's going to be executed. Somehow the cron jobs could not differentiate between the two scripts and ended up using only one for both. Any idea how to organize my bucket or how to set the handlers the right way?

Keep up the good work.

DrBearhands Author

Just have 2 VMs and 2 different App Engine endpoints. I can't really help with the buckets as I haven't used that API yet. I'm curious, though, what your use case for buckets is; I never found any use for them (explicitly) myself.

Achraf Boussaada

Hi again,

My use case doesn't involve VMs. I need to send push notifications and update my database periodically. That's why I need two separate cron jobs, and I'm using a bucket to store the scripts responsible for doing the work. Is there another approach to do this? Or does my app.yaml file need to contain a script for both jobs? I read the documentation multiple times and couldn't fully understand how this works.

DrBearhands Author

This was about using VMs to scale down. I don't know every single feature of GCP from memory. Maybe you'll have more luck on a GCP slack channel?