Tom Weiss for Aspecto

Posted on Aug 26, 2021 • Edited on Sep 29, 2021 • Originally published at aspecto.io

How to Deploy Jaeger on AWS: a Comprehensive Step-by-Step Guide


This article is part of the Aspecto Hello World series, where we tackle microservices-related topics for you. Our team searches the web for common issues, then we solve them ourselves and bring you complete how-to guides. Aspecto is an OpenTelemetry-based distributed tracing platform for developers and teams of distributed applications.

Introduction

In this tutorial, I will show you how to host Jaeger on AWS ECS. We will go step by step: first we set up an AWS Elasticsearch Service domain to use as the storage backend, then we run the Jaeger all-in-one image inside our own ECS cluster and service.

(Note: for a production use case you’d probably want to run the Jaeger components as separate images rather than the all-in-one image. We use all-in-one here to keep the blog post simple.)

What is Jaeger

If you’re here you probably already know this, but just in case: Jaeger is an open-source distributed tracing system, originally developed by Uber.

Essentially, it stores traces and spans in a storage backend and hosts a UI that gives us visibility into them (if you’re not familiar with the OpenTelemetry jargon, you can get more info here: https://opentelemetry.io/).

A small note about security

Before we dive in: it is important to know that this blog post assumes you are running Elasticsearch and Jaeger in private subnets inside a secured VPC. For additional security measures, check out the official recommendations from Jaeger at https://www.jaegertracing.io/docs/1.25/security/.

Setting up AWS Elasticsearch instance

Jaeger needs a storage backend to persist the traces it collects. You could use in-memory storage, but it is not suitable for production use cases because the traces are lost whenever the process restarts.

So we will begin by creating a new Elasticsearch domain to serve as the storage backend for Jaeger.

1. Go to the AWS ES console.
2. Click on “Create a new domain”.
3. Select custom and the latest version of ES, then hit Next.
4. Let’s set the following settings (but you can of course change them as needed):

Name: as you wish
No auto-tuning (you may want this in production)
1 AZ, 1 Node (note that in production you would probably want at least 2 AZ & nodes for redundancy)
Instance type: t2.small
No master node (you may want this in production)

After everything is set and done – hit Next.

5. Configuring access & security:

Since you will most likely send sensitive data to Jaeger, you want to make sure it is stored securely. Therefore, it is recommended to allow access only from within your VPC. AWS recommends a dedicated subnet for each Elasticsearch domain. For this tutorial’s purposes, I am choosing an existing private subnet.

As for security groups, I created a security group that allows open access inside the VPC. For this blog post, I’m OK with that since the Elasticsearch domain is only accessible from private subnets. In production, you may want to be stricter: enable fine-grained access control and restrict the allowed IP ranges and ports.

For the domain access policy, I chose to allow open access to the domain, both to simplify the installation and because we’re inside the VPC. Here too, for production you may want to be stricter and allow access only from a specific IP address or IAM ARN (the Jaeger instance’s IP address, for example).

As for encryption – I’m leaving the defaults, but feel free to modify it to fit your needs.

6. At this point, we’re done. Let’s hit Next as needed and let AWS create our ES domain.
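If you later decide to tighten the open domain access policy from step 5, a policy scoped to a single IAM role could be built along the lines of the sketch below. The region, account ID, domain name, and role name are placeholders chosen for illustration, not values from this tutorial's setup.

```javascript
// Sketch: build a resource-based access policy for an Elasticsearch domain
// that only allows one IAM role to issue HTTP requests against it.
// All identifiers passed in below are hypothetical placeholders.
function buildDomainAccessPolicy({ region, accountId, domainName, roleName }) {
  return {
    Version: '2012-10-17',
    Statement: [
      {
        Effect: 'Allow',
        Principal: { AWS: `arn:aws:iam::${accountId}:role/${roleName}` },
        Action: 'es:ESHttp*',
        Resource: `arn:aws:es:${region}:${accountId}:domain/${domainName}/*`,
      },
    ],
  };
}

const policy = buildDomainAccessPolicy({
  region: 'us-east-1',              // assumption: use your own region
  accountId: '123456789012',        // assumption: use your own account ID
  domainName: 'jaeger-storage',     // assumption: the name you gave the domain
  roleName: 'ecsTaskExecutionRole', // assumption: the role your Jaeger task runs with
});

// Paste the printed JSON into the domain's access policy editor.
console.log(JSON.stringify(policy, null, 2));
```

Keep in mind that a policy scoped to an IAM principal expects signed requests, so verify that your Jaeger setup can work with it before replacing the open policy.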

Deploying Jaeger

In order to deploy Jaeger, we will be using AWS ECS. Jaeger provides us with an all-in-one docker image that contains everything that Jaeger needs to work.

Let’s create an ECS service running that image.

Creating our new ECS cluster

1. Go to ECS Clusters -> Create cluster.
2. Select EC2 Linux + Networking.
3. Configure the cluster:
   As for subnets: since I have an AWS VPN client enabled, I put Jaeger on two private subnets, so that no one from outside the VPC can access its data.
   For the security group: let’s create a new one and add inbound rules for the following ports from within the VPC. Replace 10.100.0.0/16 with your own VPC address range. You can learn more about the Jaeger ports here: https://www.jaegertracing.io/docs/1.25/getting-started/.
4. Change anything else if you need to and hit Create.
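One way to write down the inbound rules is to derive them from the ports that the task definition later in this guide maps. The sketch below does that and prints them in the shape the EC2 API uses for security group permissions. The purpose descriptions are short summaries and should be checked against the Jaeger docs, and applying the output (for example with the AWS CLI `--ip-permissions` option) is an assumption about your workflow, not part of the original setup.

```javascript
// Ports mapped by the task definition in this guide. The purposes are
// short summaries; verify them against the Jaeger documentation.
const JAEGER_PORTS = [
  { port: 5775, protocol: 'udp', purpose: 'agent zipkin.thrift compact (legacy)' },
  { port: 6831, protocol: 'udp', purpose: 'agent jaeger.thrift compact' },
  { port: 6832, protocol: 'udp', purpose: 'agent jaeger.thrift binary' },
  { port: 5778, protocol: 'tcp', purpose: 'agent serve configs' },
  { port: 16686, protocol: 'tcp', purpose: 'query UI' },
  { port: 16685, protocol: 'tcp', purpose: 'query gRPC' },
  { port: 14268, protocol: 'tcp', purpose: 'collector jaeger.thrift over HTTP' },
  { port: 14250, protocol: 'tcp', purpose: 'collector gRPC' },
  { port: 14269, protocol: 'tcp', purpose: 'admin health check and metrics' },
  { port: 9411, protocol: 'tcp', purpose: 'collector Zipkin compatible endpoint' },
];

// Build one inbound rule per port, open only to the given VPC CIDR range.
function buildIngressRules(vpcCidr) {
  return JAEGER_PORTS.map(({ port, protocol, purpose }) => ({
    IpProtocol: protocol,
    FromPort: port,
    ToPort: port,
    IpRanges: [{ CidrIp: vpcCidr, Description: purpose }],
  }));
}

// Replace the CIDR with your own VPC address range.
const rules = buildIngressRules('10.100.0.0/16');
console.log(JSON.stringify(rules, null, 2));
```

If you only use some of the protocols (for instance, only the UDP binary port and the UI), you can drop the rest of the entries to keep the security group tighter.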

Creating a Task Definition

For us to create an ECS Service we need to define a Task Definition. Head over to Task Definition and hit Create.

Since we chose EC2 for the cluster, we will choose EC2 as the launch type compatibility.
Select “Configure from JSON” and paste the following JSON in the relevant field:

{
 "requiresCompatibilities": [
   "EC2"
 ],
 "inferenceAccelerators": [],
 "containerDefinitions": [    {
   "dnsSearchDomains": [],
   "environmentFiles": null,
   "logConfiguration": null,
   "entryPoint": [],
   "portMappings": [
     {
       "hostPort": 14269,
       "protocol": "tcp",
       "containerPort": 14269
     },
     {
       "hostPort": 14268,
       "protocol": "tcp",
       "containerPort": 14268
     },
     {
       "hostPort": 6832,
       "protocol": "udp",
       "containerPort": 6832
     },
     {
       "hostPort": 6831,
       "protocol": "udp",
       "containerPort": 6831
     },
     {
       "hostPort": 5775,
       "protocol": "udp",
       "containerPort": 5775
     },
     {
       "hostPort": 14250,
       "protocol": "tcp",
       "containerPort": 14250
     },
     {
       "hostPort": 16685,
       "protocol": "tcp",
       "containerPort": 16685
     },
     {
       "hostPort": 5778,
       "protocol": "tcp",
       "containerPort": 5778
     },
     {
       "hostPort": 16686,
       "protocol": "tcp",
       "containerPort": 16686
     },
     {
       "hostPort": 9411,
       "protocol": "tcp",
       "containerPort": 9411
     }
   ],
   "command": [
     "--collector.zipkin.host-port",
     "9411"
   ],
   "linuxParameters": null,
   "cpu": 1024,
   "environment": [
     {
       "name": "ES_SERVER_URLS",
       "value": "https://some-es-name.some-es-region.es.amazonaws.com"
     },
     {
       "name": "SPAN_STORAGE_TYPE",
       "value": "elasticsearch"
     }
   ],
   "resourceRequirements": null,
   "ulimits": null,
   "dnsServers": [],
   "mountPoints": [],
   "workingDirectory": null,
   "secrets": null,
   "dockerSecurityOptions": [],
   "memory": 1024,
   "memoryReservation": null,
   "volumesFrom": [],
   "stopTimeout": null,
   "image": "jaegertracing/all-in-one:1.25.0",
   "startTimeout": null,
   "firelensConfiguration": null,
   "dependsOn": null,
   "disableNetworking": null,
   "interactive": null,
   "healthCheck": null,
   "essential": true,
   "links": [],
   "hostname": null,
   "extraHosts": null,
   "pseudoTerminal": null,
   "user": null,
   "readonlyRootFilesystem": null,
   "dockerLabels": null,
   "systemControls": [],
   "privileged": null,
   "name": "tom-jaeger-new"
 }],
 "volumes": [],

 "networkMode": null,
 "memory": "2048",
 "cpu": "2048",
 "placementConstraints": [],
 "family": "tom-jaeger-new",
 "taskRoleArn": "arn:aws:iam::YOUR_AWS_ACCOUNT_ID:role/ecsTaskExecutionRole",
 "executionRoleArn": "arn:aws:iam::YOUR_AWS_ACCOUNT_ID:role/ecsTaskExecutionRole",
 "tags": []
}

Make sure to make the relevant modifications: replace YOUR_AWS_ACCOUNT_ID with your own account ID, and copy the correct Elasticsearch URL from the AWS ES console into the value of the ES_SERVER_URLS environment variable.
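If you prefer not to edit the JSON by hand, the two substitutions can be scripted. The following is a minimal sketch that works on a trimmed-down copy of the task definition above. The helper names, the account ID, and the Elasticsearch endpoint are made up for illustration.

```javascript
// Sketch: fill in the account ID and the Elasticsearch URL, then check
// that no placeholder from the template is left behind.
function customizeTaskDefinition(taskDef, { accountId, esUrl }) {
  const copy = JSON.parse(JSON.stringify(taskDef)); // deep copy, keep the template intact
  for (const key of ['taskRoleArn', 'executionRoleArn']) {
    if (copy[key]) {
      copy[key] = copy[key].replace('YOUR_AWS_ACCOUNT_ID', accountId);
    }
  }
  for (const container of copy.containerDefinitions) {
    for (const variable of container.environment || []) {
      if (variable.name === 'ES_SERVER_URLS') {
        variable.value = esUrl;
      }
    }
  }
  return copy;
}

// Returns the template placeholders that still appear in the definition.
function findLeftoverPlaceholders(taskDef) {
  const text = JSON.stringify(taskDef);
  return ['YOUR_AWS_ACCOUNT_ID', 'some-es-name'].filter((p) => text.includes(p));
}

// Trimmed-down version of the task definition from this guide.
const template = {
  family: 'tom-jaeger-new',
  taskRoleArn: 'arn:aws:iam::YOUR_AWS_ACCOUNT_ID:role/ecsTaskExecutionRole',
  executionRoleArn: 'arn:aws:iam::YOUR_AWS_ACCOUNT_ID:role/ecsTaskExecutionRole',
  containerDefinitions: [
    {
      name: 'tom-jaeger-new',
      environment: [
        { name: 'ES_SERVER_URLS', value: 'https://some-es-name.some-es-region.es.amazonaws.com' },
        { name: 'SPAN_STORAGE_TYPE', value: 'elasticsearch' },
      ],
    },
  ],
};

const ready = customizeTaskDefinition(template, {
  accountId: '123456789012', // hypothetical account ID
  esUrl: 'https://vpc-my-domain-abc123.us-east-1.es.amazonaws.com', // hypothetical endpoint
});

console.log(findLeftoverPlaceholders(template)); // both placeholders still present
console.log(findLeftoverPlaceholders(ready));    // nothing left to replace
```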

Creating the ECS Service

Now we’re ready to create the ECS service that runs Jaeger.

1. Go back to your Jaeger cluster -> Services -> Create.
2. This is the configuration you want (be sure to select the newly created task definition):

3. Unlike what the screenshot suggests, instead of a minimum healthy percent of 100 and a maximum percent of 200, we use 0 and 100 so that only one instance of the task runs at a time and there are no host port conflicts. Again, we do this to simplify the tutorial.

4. From here on you can hit Next until the service is created. For this tutorial I chose not to create a load balancer. If you feel you need one, feel free to create it.
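To see why the 0 and 100 percentages avoid port issues, it helps to work out how many tasks ECS may keep running during a deployment. The sketch below assumes the rounding described in the ECS documentation (the lower bound is rounded up, the upper bound is rounded down). The function name is made up for this example.

```javascript
// Sketch: number of tasks that may run during a deployment, given the
// desired count and the two deployment percentages.
// Assumption: lower bound rounds up, upper bound rounds down.
function deploymentBounds(desiredCount, minimumHealthyPercent, maximumPercent) {
  return {
    minRunning: Math.ceil((desiredCount * minimumHealthyPercent) / 100),
    maxRunning: Math.floor((desiredCount * maximumPercent) / 100),
  };
}

// 100 / 200: the old task must stay up while a second one starts.
const withDefaults = deploymentBounds(1, 100, 200);

// 0 / 100: the old task may be stopped first, so only one task runs at a time.
const withTutorialValues = deploymentBounds(1, 0, 100);

console.log(withDefaults);       // { minRunning: 1, maxRunning: 2 }
console.log(withTutorialValues); // { minRunning: 0, maxRunning: 1 }
```

Because the task definition maps fixed host ports, a second copy of the task cannot start on the same EC2 instance while the first one still holds those ports. With 0 and 100 the old task is stopped before the new one starts, at the cost of a short gap during deployments.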

Accessing Jaeger UI

Now Jaeger should be up and running. Let’s go to the ECS cluster -> ECS instances -> click on the instance id inside the ECS instances column. That should lead you to the corresponding EC2 instance.

Clicking on its ID should give you info about the instance. What you are looking for is the instance IP.

I’m assuming that you are inside the VPC. If you’re not, you may have to find a different way of obtaining network access to your Jaeger, such as placing it in a public subnet and allowing access only from your IP (this is not recommended from a security standpoint).

Copy the private IP address of the instance and open it in the browser on port 16686. For example: 10.100.30.224:16686
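If the page does not load, it can help to check whether the port is reachable at all before digging into Jaeger itself. Here is a small sketch using only Node's built-in net module. The helper names are made up, and the IP is the example address from above.

```javascript
const net = require('net');

// Build the UI address from the instance's private IP.
function jaegerUiUrl(privateIp, port = 16686) {
  return `http://${privateIp}:${port}`;
}

// Resolve to true if a TCP connection can be opened within the timeout.
function isPortOpen(host, port, timeoutMs = 2000) {
  return new Promise((resolve) => {
    const socket = net.connect({ host, port });
    const finish = (open) => {
      socket.destroy();
      resolve(open);
    };
    socket.setTimeout(timeoutMs);
    socket.once('connect', () => finish(true));
    socket.once('timeout', () => finish(false));
    socket.once('error', () => finish(false));
  });
}

// Example usage, run from a machine that has network access to the VPC:
// isPortOpen('10.100.30.224', 16686).then((open) => {
//   console.log(open ? `UI at ${jaegerUiUrl('10.100.30.224')}` : 'port 16686 is not reachable');
// });
```

A closed port usually points at the security group rules or at the task not running, rather than at Jaeger's configuration.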

You now have access to the Jaeger UI:

Sending traces to Jaeger

At this point you have the Jaeger UI running. Now we need to start sending traces to it.

If you take a look at Kibana, you can already see that Jaeger created its own index, jaeger-span-DATE-FORMAT. Currently, it only contains internal Jaeger spans, but let’s send our own.

Note: for the simplicity of this tutorial we did not implement any index rollover, but you may want to do so to optimize resources allocated to indices. You can read more about this here: https://www.jaegertracing.io/docs/1.25/deployment/#elasticsearch-rollover
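Without rollover, Jaeger writes to a new pair of indices every day, which is why the index name ends with a date. The sketch below computes the names you would look for in Kibana, assuming Jaeger's default daily naming (date in UTC, no index prefix configured). The function name is made up for this example.

```javascript
// Sketch: names of the daily indices for a given date.
// Assumption: default naming, UTC dates, no index prefix.
function jaegerIndexNames(date) {
  const day = date.toISOString().slice(0, 10); // YYYY-MM-DD in UTC
  return {
    spans: `jaeger-span-${day}`,
    services: `jaeger-service-${day}`,
  };
}

const names = jaegerIndexNames(new Date('2021-08-26T12:00:00Z'));
console.log(names.spans);    // jaeger-span-2021-08-26
console.log(names.services); // jaeger-service-2021-08-26
```

Since the number of indices grows by two per day, this is also a quick way to estimate how many indices you will accumulate before you set up rollover or a cleanup job.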

Sending traces to Jaeger with the OpenTelemetry SDK (in Node.js)

Step 1: Create a new express app using

npx express-generator

Step 2: Perform npm install

npm install

Step 3: Install OpenTelemetry libraries:

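npm install --save @opentelemetry/api @opentelemetry/node @opentelemetry/tracing \
  @opentelemetry/resources @opentelemetry/semantic-conventions \
  @opentelemetry/instrumentation @opentelemetry/instrumentation-http \
  @opentelemetry/instrumentation-express @opentelemetry/exporter-jaeger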

Step 4: Create a tracing.js file (replace the IP with the private IP of your EC2 instance)

'use strict';

const opentelemetry = require('@opentelemetry/api');

// Not functionally required, but gives some insight into what happens behind the scenes
const { diag, DiagConsoleLogger, DiagLogLevel } = opentelemetry;
diag.setLogger(new DiagConsoleLogger(), DiagLogLevel.DEBUG);

const { registerInstrumentations } = require('@opentelemetry/instrumentation');
const { NodeTracerProvider } = require('@opentelemetry/node');
const { SimpleSpanProcessor } = require('@opentelemetry/tracing');
const { Resource } = require('@opentelemetry/resources');
const { SemanticResourceAttributes } = require('@opentelemetry/semantic-conventions');
const { JaegerExporter } = require('@opentelemetry/exporter-jaeger');
const { ExpressInstrumentation } = require('@opentelemetry/instrumentation-express');
const { HttpInstrumentation } = require('@opentelemetry/instrumentation-http');

module.exports = () => {
  // The name under which the service shows up in the Jaeger UI
  const serviceName = 'serviceName';

  const provider = new NodeTracerProvider({
    resource: new Resource({
      [SemanticResourceAttributes.SERVICE_NAME]: serviceName,
    }),
  });

  registerInstrumentations({
    tracerProvider: provider,
    instrumentations: [
      // Express instrumentation expects the HTTP layer to be instrumented.
      // Pass instances rather than the classes themselves.
      new HttpInstrumentation(),
      new ExpressInstrumentation(),
    ],
  });

  // Send spans over UDP to the Jaeger agent port (6832, binary thrift).
  // Replace the host with the private IP of your EC2 instance.
  const exporter = new JaegerExporter({
    host: '10.100.40.132',
    port: 6832,
  });

  // SimpleSpanProcessor exports every span as soon as it ends
  provider.addSpanProcessor(new SimpleSpanProcessor(exporter));

  // Initialize the OpenTelemetry APIs to use the NodeTracerProvider bindings
  provider.register();

  return opentelemetry.trace.getTracer('express-example');
};

Step 5: At the top of the app.js file, add this line to enable tracing:

require('./tracing')();

Step 6: Run npm start & go to http://localhost:3000/ in the browser

Step 7: Back in the Jaeger UI – you can see this trace now:

Bonus (faster route) – Sending traces to Jaeger with the Aspecto SDK:

Aspecto provides a free, easy-to-use SDK that can be configured with a single line of code to export traces to Jaeger (and to Aspecto, which adds capabilities that Jaeger does not have).

Step 1: Create a new express app using

npx express-generator

Step 2: Install the dependencies

npm install
npm install @aspecto/opentelemetry

Step 3: Register for free at www.aspecto.io, and obtain your API key.

Step 4: At the top of your app.js file, add this (before any import):

require('@aspecto/opentelemetry')({
  local: true,
  aspectoAuth: '*your-aspecto-api-key-goes-here*',
  customZipkinEndpoint: 'http://10.100.40.132:9411/api/v2/spans',
  otCollectorEndpoint: 'http://10.100.40.132:9411/v1/trace',
});

Do not forget to change the IP to your own EC2 IP address.

Step 5: Let’s change the route in routes/index.js to /hello and open the browser at localhost:3000/hello.

Step 6: And voilà! Jaeger is showing the span we just sent: