Francesco Ciulla for TinyStacks, Inc.

Posted on Sep 1, 2021 • Edited on Feb 14, 2023 • Originally published at blog.tinystacks.com

Service Discovery with AWS Cloud Map

One of the key challenges in a microservices architecture is discovering the current network location of services. In this article, we'll review why service discovery is so challenging, and take a hands-on peek at how AWS Cloud Map can simplify this complex task.

Article by Jay Allen

Why Service Discovery?

A microservices architecture breaks an application into a series of discrete, loosely coupled services. This stands in contrast to a monolithic architecture, in which all of the services required by an application are bundled into a single, large unit of deployment. Separating and decoupling services makes it easier to deploy small changes rapidly.

But this flexibility also injects complexity. Microservices architectures are often implemented using lightweight compute technologies, such as Docker containers or serverless functions (AWS Lambda, Azure Functions). A given microservice may be split across multiple execution units; e.g., a service hosted in Docker containers may run in multiple tasks across multiple cluster instances in Amazon ECS, each with a different IP address.

The complexity only gets worse when we consider the full application lifecycle. Services will need to work across different deployment stages (dev, test, stage, prod). Additionally, a service will likely have several versions running simultaneously for backwards compatibility.

All of this raises the question: How does a service's clients find the correct endpoint for the correct version? This is the problem that service discovery was created to solve.

Traditional Approaches to Service Discovery

Service discovery consists of providing either a static or dynamic method for a service's clients to connect to an instance of a service. There are two general approaches to service discovery:

Client-side discovery. A service's clients connect to a service registry, a database listing the most current information about the service. The client uses a logical naming scheme to look up the service by a known identifier, and the registry returns one or more DNS names or IP address endpoints where the service is hosted.
Server-side discovery. The client connects to a known server-side endpoint, such as a load balancer. The server is then responsible for resolving the request to a healthy, running instance of the service. AWS Elastic Load Balancing is one of the most well-known examples of such an approach.

While both approaches have benefits and drawbacks, client-side discovery generally involves fewer moving parts and server hops compared to server-side discovery.
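To make the client-side pattern concrete, here is a minimal sketch of a registry and a client that looks up a service by name and then picks an endpoint itself. The `ServiceRegistry` class and `pick_endpoint` function are hypothetical names invented for this illustration; a real registry would also need to be highly available and track instance health.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ServiceRegistry:
    """Toy in-memory registry mapping a logical service name to endpoints."""
    endpoints: dict = field(default_factory=dict)

    def register(self, name: str, host: str, port: int) -> None:
        self.endpoints.setdefault(name, []).append((host, port))

    def deregister(self, name: str, host: str, port: int) -> None:
        # Keep every endpoint except the one being removed.
        self.endpoints[name] = [
            e for e in self.endpoints.get(name, []) if e != (host, port)
        ]

    def lookup(self, name: str) -> list:
        # Return a copy so callers cannot mutate the registry's state.
        return list(self.endpoints.get(name, []))


def pick_endpoint(registry: ServiceRegistry, name: str) -> tuple:
    """Client-side discovery: ask the registry, then choose an endpoint."""
    candidates = registry.lookup(name)
    if not candidates:
        raise LookupError(f"no endpoints registered for {name!r}")
    return random.choice(candidates)
```

The important point is that the choice of endpoint happens in the client, not in an intermediary such as a load balancer.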

AWS Cloud Map

In the past, implementing client-side discovery has meant standing up yet another highly available, fault-tolerant service that clients can call. This can add significant time and cost to both application development and operational maintenance.

This is where AWS Cloud Map comes in. AWS Cloud Map is a client-side service registry and service discovery solution provided as a ready-to-use, highly available service. Rather than build your own client service registry, you can leverage AWS Cloud Map to register your application and its running instances, and then use either the AWS Cloud Map API or DNS lookup to resolve a service's name to a current active endpoint.

As with most AWS services, leveraging Cloud Map lets you leave the heavy lifting to AWS while you focus on what matters most to you: your application and the unique functionality that it provides to your users.

Creating an AWS Cloud Map Namespace and Service

Let's see how you can leverage Cloud Map in real life. This walkthrough will build upon my last article in which we stood up a Flask-based API in a Docker image on Amazon ECS using CodePipeline and CodeBuild.

To get started, log in to the AWS Management Console and, in the Services search bar, look for cloud map.

First, we need to create a Cloud Map namespace. A namespace is a label that groups a number of services together.

To create a namespace, click the Create namespace button.

You'll be asked to supply several values here. Let's step through each in detail:

Namespace name. This, along with the service name, is how your application will look up the endpoint for a service. Characters in your namespace name are restricted to a strict subset of ASCII characters. Additionally, if you plan to use DNS to perform service discovery, your namespace name must end in a top-level domain name.
Namespace description. Freeform text describing the purpose of your namespace. We'll leave this blank for now.
Instance discovery. There are three ways your applications can perform a service discovery lookup:
- API calls. Use the AWS CLI, a language-specific AWS API library (like Boto3 for Python), or REST calls over HTTP.
- API calls and DNS queries in VPC. Creates DNS entries local to an Amazon VPC, allowing lookup using DNS queries.
- API calls and public DNS queries. Creates public DNS records that can be resolved with calls to a public DNS server.

For our walkthrough, use a Namespace name of test-namespace. Leave the Namespace description field blank. For now, leave Instance discovery set to API calls. Once done, click Create namespace.
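If you'd rather script this step than click through the console, the sketch below shows roughly how it could look with Boto3. The helper name `create_api_only_namespace` is my own invention; it expects a client created with `boto3.client('servicediscovery')` and calls `create_http_namespace`, which creates a namespace that supports discovery through API calls only.

```python
import uuid


def create_api_only_namespace(client, name: str) -> str:
    """Request creation of an API-calls-only namespace; return the operation ID."""
    response = client.create_http_namespace(
        Name=name,
        # Idempotency token so a retried request does not create a duplicate.
        CreatorRequestId=str(uuid.uuid4()),
    )
    return response["OperationId"]
```

For the walkthrough, you would call it as `create_api_only_namespace(boto3.client('servicediscovery'), 'test-namespace')`. Namespace creation is asynchronous, so the call returns an operation ID rather than the namespace itself; the client's `get_operation` method can be used to check on its status.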

Your namespace should be available in a few moments. Once it's ready, click on the namespace's name to view its details page.

A namespace can contain multiple services. Let's add our Flask API service to it now by clicking the Create service button.

In this dialog, we have three options:

Service name. A friendly name that helps you identify the service in the AWS Management Console.
Service description. A freeform description of the service and the purpose it serves.
Health check configuration. The Cloud Map health check works similarly to the health checks used in Elastic Load Balancing. Once you create a service, you'll register application instances that belong to that service. If you have health checks enabled, AWS Cloud Map will only return instances that are registering as healthy. You have three options:
- No health check. A service instance is returned regardless of its health status.
- Route 53 health check. Utilizes Route 53's health check feature.
- Custom health check. Uses your own or a third-party health checker, which reports each instance's status to AWS Cloud Map.

For Service name, enter flask-app. Leave Service description empty and leave Health check configuration set to No health check. When done, click Create service.
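The same step can be sketched with Boto3. The helper name `create_discovery_service` is hypothetical, and `namespace_id` stands for the ID of your own namespace. Omitting any health check configuration from the `create_service` call corresponds to the No health check option.

```python
def create_discovery_service(client, namespace_id: str, name: str) -> str:
    """Create a Cloud Map service with no health check; return its service ID."""
    response = client.create_service(
        Name=name,
        NamespaceId=namespace_id,
    )
    # The auto-generated service ID is needed later to register instances.
    return response["Service"]["Id"]
```

The returned ID is the value you'll pass when registering instances with the service.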

Registering a Service Instance with AWS Cloud Map

You now have a namespace and a service. However, the service still doesn't have any running instances. Whenever you bring a new instance of your application online, you'll need to add it to your service so it can be returned in a query.

You may recall that, in my last article on CodePipeline and CodeBuild, we stood up a running Docker image in an Amazon ECS Fargate cluster. That stood up a service named ts-flask-test-service.

To register this as a service instance, we only need a few pieces of information:

The auto-generated service ID for our service, which appears on the service's details page in the console.
The IP address of the service and the port on which it's available.

Since this will occur dynamically when you start up a new instance of your application, you'll want to be able to add and remove instances programmatically. This can be done using the AWS CLI, a language-specific AWS SDK, or REST API calls made directly over HTTP.

For example, to add our running Docker instance to the service using the AWS CLI, we could use the following command:



aws servicediscovery register-instance --service-id "srv-3hxpwincbakdijl5" --instance-id "instance1" --attributes="AWS_INSTANCE_IPV4=35.166.44.63,AWS_INSTANCE_PORT=80"

(Note that the officially supported arguments in the attributes parameter string are case-sensitive and must all be capitalized.)
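The equivalent SDK calls might look like the following sketch. The helper names `register_endpoint` and `deregister_endpoint` are my own; both expect a client created with `boto3.client('servicediscovery')`. Attribute values must be strings, so the port is converted before it is sent.

```python
def register_endpoint(client, service_id: str, instance_id: str,
                      ip: str, port: int) -> str:
    """Register one running instance with a Cloud Map service."""
    response = client.register_instance(
        ServiceId=service_id,
        InstanceId=instance_id,
        Attributes={
            # Attribute names are case-sensitive; values must be strings.
            "AWS_INSTANCE_IPV4": ip,
            "AWS_INSTANCE_PORT": str(port),
        },
    )
    return response["OperationId"]


def deregister_endpoint(client, service_id: str, instance_id: str) -> str:
    """Remove an instance, e.g. when the application shuts down."""
    response = client.deregister_instance(
        ServiceId=service_id,
        InstanceId=instance_id,
    )
    return response["OperationId"]
```

An application could call `register_endpoint` during startup and `deregister_endpoint` in its shutdown hook so the registry tracks its lifecycle.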

What if you're using auto scaling with ECS? In this case, ECS will start and stop service task instances in response to service demand. Fortunately, you can configure your ECS service at creation time to integrate with Cloud Map. For example, the AWS CLI call aws ecs create-service supports the --service-registries parameter for associating an ECS service with an AWS Cloud Map service.
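As a rough illustration, here is how the request for an ECS service with a Cloud Map association might be assembled in Python. The helper name `build_ecs_service_request` and all argument values are hypothetical; the keys are parameters of the ECS `create_service` call, with `serviceRegistries` carrying the ARN of the Cloud Map service. Which namespace types ECS accepts here is worth confirming in the ECS documentation before relying on this.

```python
def build_ecs_service_request(cluster: str, service_name: str,
                              task_definition: str, subnets: list,
                              registry_arn: str, desired_count: int = 1) -> dict:
    """Assemble keyword arguments for the ECS create_service call."""
    return {
        "cluster": cluster,
        "serviceName": service_name,
        "taskDefinition": task_definition,
        "desiredCount": desired_count,
        "launchType": "FARGATE",
        # Fargate tasks use awsvpc networking, so subnets are required.
        "networkConfiguration": {
            "awsvpcConfiguration": {"subnets": list(subnets)},
        },
        # Associates the ECS service with a Cloud Map service.
        "serviceRegistries": [{"registryArn": registry_arn}],
    }
```

You would then pass the result along with `boto3.client('ecs').create_service(**request)`, after which ECS registers and deregisters tasks for you as they start and stop.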

Looking up a Service Instance

The last piece is for your clients - applications and other services - to call AWS Cloud Map to retrieve a list of available endpoints for the service. Using the AWS CLI, this can be accomplished with the call aws servicediscovery discover-instances. You simply call this with the names of the namespace and service for which you want a list of healthy instances:



aws servicediscovery discover-instances --namespace-name "test-namespace" --service-name "flask-app"

The result will be a list of healthy instances. In our case, we only see a single instance returned as there is only one instance available.
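A client usually wants a connectable address rather than the raw response. The sketch below turns a discover-instances response into a list of URLs; the helper name `endpoints_from_response` and the choice of an `http://` scheme are assumptions made for this example. It reads the `AWS_INSTANCE_IPV4` and `AWS_INSTANCE_PORT` attributes we set at registration time.

```python
def endpoints_from_response(response: dict) -> list:
    """Build URLs from the 'Instances' list of a discover-instances response."""
    endpoints = []
    for instance in response.get("Instances", []):
        attrs = instance.get("Attributes", {})
        ip = attrs.get("AWS_INSTANCE_IPV4")
        port = attrs.get("AWS_INSTANCE_PORT")
        # Skip instances that were registered without an IP address or port.
        if ip and port:
            endpoints.append(f"http://{ip}:{port}")
    return endpoints
```

Instances that lack either attribute are skipped rather than raising an error, which keeps the client tolerant of partially registered instances.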

How Routing Policies and Health Checks Influence Instance Lists

Which instances are returned when listing instances may vary depending on several variables you can set when creating your service with AWS Cloud Map.

The first factor is the routing policy. This setting is available when you are using private or public DNS namespaces for instance lookup. Two values are supported:

Weighted routing. A single instance is selected randomly, regardless of any considerations such as current traffic load.
Multivalue answer routing. DNS returns a list of up to eight healthy instances. (If you aren't using health checks, AWS Cloud Map returns the values for up to eight instances.)

The second factor is health checks. If a health check is defined and an instance is failing (e.g., because it has too many active connections), the instance will be marked as unavailable and will not be returned in AWS Cloud Map queries until it is once again healthy. If no health check is defined, all instances are assumed to be healthy.
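To see how these two factors interact, here is a small local simulation. It is not how AWS Cloud Map or Route 53 is implemented; the function names and the `healthy` flag on each instance are made up for the illustration. It only mirrors the behavior described above: unhealthy instances are dropped first, then the routing policy decides how many of the rest are returned.

```python
import random

MAX_MULTIVALUE_ANSWERS = 8


def healthy_only(instances: list) -> list:
    # With no health check defined, every instance is assumed healthy.
    return [i for i in instances if i.get("healthy", True)]


def weighted_answer(instances: list, rng=random) -> list:
    """Weighted routing: one randomly selected healthy instance."""
    candidates = healthy_only(instances)
    return [rng.choice(candidates)] if candidates else []


def multivalue_answer(instances: list, rng=random) -> list:
    """Multivalue answer routing: up to eight healthy instances."""
    candidates = healthy_only(instances)
    count = min(len(candidates), MAX_MULTIVALUE_ANSWERS)
    return rng.sample(candidates, count)
```

With ten registered instances of which two are failing, `weighted_answer` returns a single healthy instance and `multivalue_answer` returns all eight healthy ones.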

Discover Instances from the AWS SDK

You can also discover instances easily from programming languages that have an AWS SDK. Below is an example Python 3.9 script that retrieves a list of available service endpoints from AWS Cloud Map for the service above:



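import boto3

# Create a client for the AWS Cloud Map (servicediscovery) API.
client = boto3.client('servicediscovery')

# Look up the service's instances; the matches are under the 'Instances' key.
response = client.discover_instances(NamespaceName='test-namespace', ServiceName='flask-app')
print(response['Instances'])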

Using the AWS SDK, you can directly embed awareness of AWS Cloud Map into your clients with just a few lines of code.

Specifying Attributes on Instance Registrations

Earlier, I discussed how you will likely need to manage multiple versions and deployment stages for your service. It's likely you'll have several supported versions running at once across dev, test, stage, and prod.

Fortunately, this scenario can be supported very simply by using custom attributes. Let's return to our register-instance call from earlier and add a few attributes of our own design called stage and version:



aws servicediscovery register-instance --service-id "srv-3hxpwincbakdijl5" --instance-id "instance1" --attributes="AWS_INSTANCE_IPV4=35.166.44.63,AWS_INSTANCE_PORT=80,stage=dev,version=1.0.0"

We can then alter our discover-instances calls to filter on these attributes:



aws servicediscovery discover-instances --namespace-name "test-namespace" --service-name "flask-app" --query-parameters "stage=dev,version=1.0.0"

This will scope the results down to those instances specific to our desired deployment stage and version.
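The same filter is available from the SDK through the `QueryParameters` argument of `discover_instances`. In the sketch below, the helper name `discover_by_attributes` is my own, and the client is assumed to come from `boto3.client('servicediscovery')`.

```python
def discover_by_attributes(client, namespace: str, service: str,
                           **attributes) -> list:
    """Return only the instances whose custom attributes match exactly."""
    response = client.discover_instances(
        NamespaceName=namespace,
        ServiceName=service,
        # Attribute values are stored as strings, so convert before querying.
        QueryParameters={key: str(value) for key, value in attributes.items()},
    )
    return response["Instances"]
```

A client pinned to the dev stage of version 1.0.0 could then call `discover_by_attributes(client, 'test-namespace', 'flask-app', stage='dev', version='1.0.0')`.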

TinyStacks and AWS Cloud Map

I've discussed before how TinyStacks simplifies deploying applications on AWS. Here's yet another good example, as TinyStacks creates AWS Cloud Map namespaces and services as the simplest way to load balance traffic from API Gateway between container tasks on ECS. This means that, with zero additional coding, your microservice can make itself discoverable by, and available to, other applications and services. Contact us today to get set up with TinyStacks and give it a try!
