If you ever need to run your own script, in any language, on a schedule or repeatedly with specific instructions, this one is for you.
Many applications, data visualization applications in particular, need some piece of code to run at fixed times, on fixed dates, or at fixed intervals. That code might make API calls, fetch data from a database, do some processing, or anything else you want.
Here, you’ll get to know an easy way to run such a script according to a predefined schedule.
Cron is a popular job scheduler for Unix-like operating systems that can schedule tasks of any kind. An important thing to keep in mind when using cron jobs is the cron expression.
A cron expression defines when a task should run. Different systems, such as Jenkins, Kubernetes CronJob, and Fargate scheduled tasks, use different variants of cron expressions, so check the documentation of the system you are using before writing one.
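To make the difference between variants concrete, here is a small sketch in Python (chosen only for illustration). It labels the fields of a classic five-field Unix expression and of the six-field form used by AWS scheduled rules, which adds a year field and requires a ? in either the day-of-month or the day-of-week position. The function name split_cron and the field labels are made up for this example, and the sketch assumes that any six-field input is the AWS form.

```python
# Field names for two common cron variants (illustrative, not exhaustive).
UNIX_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")
AWS_FIELDS = ("minutes", "hours", "day_of_month", "month", "day_of_week", "year")


def split_cron(expression: str) -> dict:
    """Label each field of a cron expression by its position.

    Accepts a five-field Unix expression or a six-field AWS expression,
    optionally wrapped as cron(...). Raises ValueError for other shapes.
    """
    text = expression.strip()
    # AWS schedule expressions are written as cron(<fields>); unwrap if present.
    if text.startswith("cron(") and text.endswith(")"):
        text = text[len("cron("):-1]
    parts = text.split()
    if len(parts) == len(UNIX_FIELDS):
        names = UNIX_FIELDS
    elif len(parts) == len(AWS_FIELDS):
        names = AWS_FIELDS  # assumption: six fields means the AWS variant
    else:
        raise ValueError(f"expected 5 or 6 fields, got {len(parts)}")
    fields = dict(zip(names, parts))
    # AWS rule: day-of-month and day-of-week cannot both be set; one must be "?".
    if names is AWS_FIELDS and "?" not in (fields["day_of_month"], fields["day_of_week"]):
        raise ValueError("AWS cron needs '?' in day_of_month or day_of_week")
    return fields


if __name__ == "__main__":
    print(split_cron("*/15 * * * *"))        # Unix: every 15 minutes
    print(split_cron("cron(0 12 * * ? *)"))  # AWS: 12:00 UTC every day
```

The same schedule usually cannot be copied verbatim from one system to another, which is why the sketch rejects a six-field expression that has no ? in it.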
Let’s jump into the topic.
We would all like to schedule recurring tasks as easily and efficiently as possible. Most complex projects need this for updating data, ingesting data, or similar use cases.
With AWS Fargate, it is possible to have orchestrated clusters on which you can run your tasks in Docker containers.
To follow along, you need these prerequisites:
- A Jenkins server, installed and running.
- A GitHub account with a GitHub repository.
- If you’re using a GitHub Enterprise account, first generate a private/public key pair (ssh-keygen) and add the credentials to the Jenkins server.
- Configure your GitHub account by adding your public key there.
- Docker set up on your computer. The Docker docs will give you more information on that.
- An AWS account.
- A Dockerfile for the script you want to schedule.
For anyone who doesn’t know about AWS Fargate, here is an introduction.
“AWS Fargate is a compute engine for Amazon ECS that allows you to run containers without having to manage servers or clusters.
With AWS Fargate, you no longer have to provision, configure, and scale clusters of virtual machines to run containers. This removes the need to choose server types, decide when to scale your clusters, or optimize cluster packing.
AWS Fargate removes the need for you to interact with or think about servers or clusters. Fargate lets you focus on designing and building your applications instead of managing the infrastructure that runs them.” — https://aws.amazon.com/fargate/
Before Fargate’s release, the only way to use Amazon ECS was to provision a cluster of EC2 instances and manage them yourself (software, updates, and configuration).
That meant paying for the cluster, scaling tasks, and configuring and maintaining an autoscaling setup so that containers never ran short of resources.
AWS Fargate hands all of this management overhead to AWS: you launch container-based services and pay only for the actual execution time. There is no need to worry about the underlying cluster, as AWS takes care of it.
The Elastic Container Service (ECS) is an AWS service that orchestrates Docker containers on your EC2 cluster. It is an alternative to Kubernetes, Docker Swarm, and other orchestration services, and it also handles scaling.
One key feature is that Amazon ECS lets you run batch workloads with managed or custom schedulers on Amazon EC2 On-Demand Instances, Reserved Instances, or Spot Instances.
When containers run on Amazon EC2 Spot Instances, you can get a discount of up to 90% compared to On-Demand prices. You may have many other reasons to use Amazon ECS as well.
In this process, we are going to use ECS together with AWS Fargate. Since the script we want to schedule is deployed with Docker, this is the easier route.
“Amazon Elastic Container Registry (ECR) is a fully-managed Docker container registry that makes it easy for developers to store, manage, and deploy Docker container images.” — https://aws.amazon.com/ecr/
Another reason to use ECR with ECS is that the two are integrated, which shortens the workflow we have to go through.
From here onwards, I’ll describe step by step how to schedule a script with CI/CD in AWS Fargate. If you have completed the prerequisites, you have a configured Jenkins server, a GitHub repo, and an AWS account.
- Create a repository in ECR with a descriptive name.
- While creating it, you can use the toggles to turn on tag immutability or scan on push.
- Create a new item on Jenkins as a freestyle project.
- Configure the Jenkins pipeline to get the source from the GitHub repo and push the Docker image to the repo in ECR. (Go to the Source Code Management section → select the Git option.)
If you have already configured Jenkins and GitHub with an SSH key pair as mentioned earlier, you will find the previously configured key in the Credentials dropdown.
You can specify the branch you want to get the code from, which comes in handy in CI/CD.
Go to ECR in the AWS console and click the repository that you created. Then click the View push commands button.
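As a rough guide to what those settings and commands amount to, here is a hedged Python sketch. The first function builds the arguments that boto3's ecr.create_repository call accepts for the two toggles mentioned earlier (tag immutability and scan on push). The second assembles the usual login, build, tag, and push sequence for an ECR registry. The function names, the account ID 123456789012, the region us-east-1, and the repository name scheduled-script are placeholders invented for this example; always prefer the exact commands that the console shows for your repository.

```python
from typing import Dict, List


def ecr_repository_kwargs(name: str, immutable_tags: bool = True,
                          scan_on_push: bool = True) -> Dict:
    """Keyword arguments for boto3's ecr.create_repository call."""
    return {
        "repositoryName": name,
        "imageTagMutability": "IMMUTABLE" if immutable_tags else "MUTABLE",
        "imageScanningConfiguration": {"scanOnPush": scan_on_push},
    }


def ecr_push_commands(account_id: str, region: str, repo: str,
                      tag: str = "latest") -> List[str]:
    """Shell commands that log in to ECR, then build, tag, and push an image."""
    registry = f"{account_id}.dkr.ecr.{region}.amazonaws.com"
    image = f"{registry}/{repo}:{tag}"
    return [
        # Authenticate the local Docker client against the ECR registry.
        f"aws ecr get-login-password --region {region} | "
        f"docker login --username AWS --password-stdin {registry}",
        # Build from the Dockerfile in the current directory.
        f"docker build -t {repo}:{tag} .",
        # Give the local image its full ECR name, then push it.
        f"docker tag {repo}:{tag} {image}",
        f"docker push {image}",
    ]


if __name__ == "__main__":
    # Placeholder values; replace them with your own account, region, and repo.
    for command in ecr_push_commands("123456789012", "us-east-1",
                                     "scheduled-script", tag="42"):
        print(command)
```

If tag immutability is turned on, pushing the same tag twice is rejected, so a unique tag per build (for example the Jenkins build number) is a safer choice than latest.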
- Next, give Jenkins the push commands that are listed for the ECR repo. (Go to the Build section → click Add build step → select Execute shell from the dropdown.)
- When you’re done configuring it, you can click Apply and then Save. Now the pipeline is done.
- Go to AWS ECS and create the cluster.
- From the choices offered, you can select Networking only.
- Then, give your cluster a name and set any other options you need, such as creating a new VPC, adding tags, and enabling Container Insights.
- Once you press the Create button, the cluster is created and you can view it.
- Create the task definition.
- Now configure the task definition. Here, you can specify IAM roles, task memory, and task CPU according to your task size.
- While configuring it, click the Add container button to configure the container as well.
- Here, the repository URL and image tag should be taken from the repository you created in ECR. Other settings, such as memory limits, port mappings, and environment variables, can also be configured here.
- The last step is creating a scheduled task and giving it the scheduling information. (Go to Clusters under Amazon ECS → select the cluster you created → click the Scheduled Tasks tab → click the Create button at the top of the table.)
Here you can choose the schedule rule type: either Run at a fixed interval or Cron expression, according to your needs. We have used a cron expression. The documentation has more information on cron expressions.
If you scroll down, you will see more settings for the target. Select Fargate as the launch type, and pick the task definition that you created from the dropdown menu.
VPC and security groups can be configured here as well.
- Once you finish the configuration, you can view your scheduled task under the Scheduled Tasks tab.
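The console steps above can also be written down as request payloads. The following Python sketch builds the arguments for three boto3 calls: ecs.register_task_definition for the task definition, and events.put_rule plus events.put_targets for the schedule, on the understanding that ECS scheduled tasks are backed by EventBridge rules. All function names, ARNs, subnet IDs, and security group IDs are hypothetical placeholders, and the 256 CPU units with 512 MiB of memory are just one small Fargate size chosen as a default.

```python
from typing import Dict, List


def fargate_task_definition(family: str, image: str, execution_role_arn: str,
                            cpu: str = "256", memory: str = "512") -> Dict:
    """Keyword arguments for boto3's ecs.register_task_definition call."""
    return {
        "family": family,
        "requiresCompatibilities": ["FARGATE"],
        "networkMode": "awsvpc",  # Fargate tasks require awsvpc networking
        "cpu": cpu,               # task-level CPU units, passed as a string
        "memory": memory,         # task-level memory in MiB, passed as a string
        "executionRoleArn": execution_role_arn,
        "containerDefinitions": [
            {"name": family, "image": image, "essential": True},
        ],
    }


def scheduled_rule(name: str, schedule: str) -> Dict:
    """Keyword arguments for boto3's events.put_rule call.

    schedule is either a rate, such as "rate(1 hour)", or an AWS cron
    expression, such as "cron(0 12 * * ? *)".
    """
    return {"Name": name, "ScheduleExpression": schedule, "State": "ENABLED"}


def fargate_target(rule: str, cluster_arn: str, task_definition_arn: str,
                   events_role_arn: str, subnets: List[str],
                   security_groups: List[str],
                   assign_public_ip: bool = False) -> Dict:
    """Keyword arguments for boto3's events.put_targets call."""
    return {
        "Rule": rule,
        "Targets": [{
            "Id": f"{rule}-target",
            "Arn": cluster_arn,          # the ECS cluster that runs the task
            "RoleArn": events_role_arn,  # role that lets the rule start the task
            "EcsParameters": {
                "TaskDefinitionArn": task_definition_arn,
                "TaskCount": 1,
                "LaunchType": "FARGATE",
                "NetworkConfiguration": {
                    "awsvpcConfiguration": {
                        "Subnets": subnets,
                        "SecurityGroups": security_groups,
                        "AssignPublicIp": "ENABLED" if assign_public_ip else "DISABLED",
                    }
                },
            },
        }],
    }
```

With boto3 installed and credentials configured, each dictionary can be unpacked into its call, for example boto3.client("events").put_rule(**scheduled_rule("nightly-script", "cron(0 12 * * ? *)")). Keeping the payloads as plain dictionaries makes them easy to review before anything is created in the account.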
So now you can make changes to your script and push them to the branch you configured in Jenkins. Then, once you build the image in Jenkins by clicking Build, the Docker image is pushed to ECR with the tag.
Once that’s done, the scheduled job runs according to the rules you specified. One more thing you can do is add a GitHub webhook so that the pipeline is fully automated.
Please share your results, observations, doubts, and ideas.
See you soon with another one!