DEV Community

Martin Nanchev for AWS Community Builders

The good, bad and ugly about Canary functions

  1. The Good

Synthetics monitoring went into preview on 25.11.2019 and was released in 2020.

Synthetic monitoring is a way to test a service's availability from the user's point of view. You schedule a script, with a minimum interval of 1 minute, to measure response time in milliseconds, check status codes, log in to a specific service, check whether video URLs are broken, and much more.

In my latest project I even plan to use it for DynamoDB, DocumentDB and RDS monitoring. Everything is possible, from simple GET, PUT and POST requests to crawling a website, testing for broken URLs, UI testing, A/B testing, etc.

In this article I will present a non-production example of a canary function that tests a RESTful API.

Let us begin with the REST API. First we define an async function loadBlueprint, which loads each URL from a list. The page object provides an interface to interact with a single tab in the headless Chromium browser. We define a list with one URL, hard-code the username and password, and use a login function to get a token from Cognito. The token is then used to sign the user's request to the RESTful API: we add an Authorization header whose value is "Bearer " (with a trailing space) followed by the token. The loadUrl function contains the logic that determines whether the service is healthy.

1.1. Function example
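
The flow described above can be sketched roughly as follows. This is a minimal illustration, not the original function: buildAuthHeaders is a hypothetical helper, and login and loadUrl are passed in as parameters (an assumption of this sketch) so that the orchestration stays independent of the real Cognito and browser calls.

```javascript
// Build the header that signs each request with the Cognito token.
// Note the space after "Bearer": scheme and token are two separate words.
function buildAuthHeaders(bearerToken) {
  return { Authorization: `Bearer ${bearerToken}` };
}

// Hypothetical orchestration of the canary run.
// `login` resolves to a token, `loadUrl` checks one URL; both are injected.
async function loadBlueprint({ urls, page, login, loadUrl }) {
  const token = await login();
  const headers = buildAuthHeaders(token);

  for (const url of urls) {
    // Sequential on purpose: a single page (tab) loads one URL at a time.
    await loadUrl(page, url, headers);
  }
}
```

Passing the helpers in also makes the function easy to exercise locally with stubs before it is deployed as a canary.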


The login function is simple. The Cognito client ID is specified, and the username and password are hard-coded. For production I would suggest using Secrets Manager. Canaries come with a version of the AWS SDK, so obtaining a secret is not a problem once you add the secretsmanager:GetSecretValue permission to the canary role. KMS permissions and a key resource-based policy may also be required, since best practice is to encrypt secrets.
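
As a rough sketch of the Secrets Manager suggestion, the credentials could be read as shown here. It assumes an AWS SDK for JavaScript v2 style client (getSecretValue(...).promise()); the function name, the secret ID and the JSON keys username and password are hypothetical and depend on how you store the secret.

```javascript
// `secretsClient` is assumed to be an AWS SDK v2 client, created e.g. with:
//   const AWS = require('aws-sdk');
//   const secretsClient = new AWS.SecretsManager();
// `secretId` and the secret's JSON layout are hypothetical.
async function getCanaryCredentials(secretsClient, secretId) {
  const result = await secretsClient
    .getSecretValue({ SecretId: secretId })
    .promise();

  if (!result.SecretString) {
    throw new Error(`Secret ${secretId} has no SecretString`);
  }

  // Assumes the secret is stored as JSON: {"username": "...", "password": "..."}
  const { username, password } = JSON.parse(result.SecretString);
  if (!username || !password) {
    throw new Error(`Secret ${secretId} must contain username and password`);
  }
  return { username, password };
}
```

The client is passed in as a parameter so the function does not care which SDK version the canary runtime ships; with a different SDK version only the call inside would change.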

We can execute multiple steps, such as GET, POST and PUT requests, to verify the service availability. This also tests the availability of the services the main service depends on (enabling services), such as the database. The loadUrl function performs a signed GET request against the URL, returns once the DOM content has loaded, and takes a screenshot (not necessary for a REST API, but a nice feature). It times out after 30 seconds, which is adjustable.
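
A loadUrl along these lines might look like the sketch below. It is an approximation under stated assumptions, not the original code: it uses the Puppeteer page methods setExtraHTTPHeaders, goto and screenshot, treats any non-2xx status as unhealthy, and writes the screenshot to a hypothetical path.

```javascript
// `page` is assumed to be a Puppeteer Page; `headers` carries the
// Authorization header. The screenshot path is a hypothetical example.
async function loadUrl(page, url, headers, timeoutMs = 30000) {
  // Attach the Authorization header to every request made by this tab.
  await page.setExtraHTTPHeaders(headers);

  // Resolve once the DOM content is loaded, or reject after timeoutMs.
  const response = await page.goto(url, {
    waitUntil: 'domcontentloaded',
    timeout: timeoutMs,
  });
  if (!response) {
    throw new Error(`No response received for ${url}`);
  }

  // Health rule of this sketch: only 2xx counts as healthy.
  const status = response.status();
  if (status < 200 || status > 299) {
    throw new Error(`Failed to load ${url}: HTTP ${status}`);
  }

  // Optional for a REST API, but useful when debugging a failed run.
  await page.screenshot({ path: '/tmp/loaded.png' });
  return status;
}
```

Throwing an error is what marks the run as failed, so any extra health rule (response time, body content) can be added as one more check that throws.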

2. The bad and ugly — COSTS:

2.1. Assumption: We will deploy one canary with 1 minute execution interval:

NumberOfExecutions [runs per hour * hours per day * days per month] = 60*24*30 = 43200

PricingOneRunFrankfurt = $ 0.0016

Alarm pricing = $ 0.1 per alarm

The total monthly cost for one canary is $69.22: 43200 runs * $0.0016 = $69.12, plus $0.10 for one alarm. I would suggest being careful with canaries, because the cost goes up with every canary you add.
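
The arithmetic above can be reproduced with a few lines, which also makes it easy to try other numbers. The function and parameter names are made up for this sketch; the prices are the ones quoted above and may have changed since.

```javascript
// Figures from the assumption above: 1-minute interval, 30-day month.
const runsPerMonth = 60 * 24 * 30; // 43200 runs
const pricePerRunUsd = 0.0016;     // price per run quoted above for Frankfurt
const alarmPriceUsd = 0.1;         // price per alarm quoted above

// Hypothetical helper: monthly cost for a number of identical canaries.
function monthlyCanaryCostUsd(canaries = 1, alarmsPerCanary = 1) {
  const perCanary = runsPerMonth * pricePerRunUsd + alarmsPerCanary * alarmPriceUsd;
  return canaries * perCanary;
}

console.log(monthlyCanaryCostUsd().toFixed(2)); // 69.22
```

With the same assumptions, ten such canaries come to about $692.20 per month, which is why the run interval deserves a second look before rollout.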

3. How it looks in the console:

The whole code is available under.


**Summary:** AWS canaries allow you to run tests and find issues with a system before your users do. They can surface a problem faster than CloudWatch and they show you what the impact is. The problem with CloudWatch monitoring alone is that when you receive a DiskQueueDepth alarm, you don't know how the spike is affecting the end users. Canaries give you better visibility into the application.

Be careful with the number of canaries and the interval between the runs.

IMPORTANT NOTES: A canary requires a role with permissions to access the AWS services it uses. In production I deployed the canaries in a VPC with DNS resolution and DNS support enabled, which allows me to determine the health of private endpoints.
