Distributed tracing
Tracing in distributed systems is nothing new. There are many solutions on the market that offer full frontend tracing and monitoring analysis, and they do a good job.
What all these solutions have in common is that they are not globally standardized, so you cannot simply switch to or integrate with another solution. In most cases, they do not integrate with your backend either.
This is changing now that OpenTelemetry has emerged as a new standard, and in 2021 OpenTelemetry reached an important milestone: version 1.0 of the OpenTelemetry Tracing Specification.
What is OpenTelemetry
OpenTelemetry is a collection of tools, APIs, and SDKs used to collect telemetry data from distributed systems in order to troubleshoot, debug, and understand software performance and behavior.
Many modern applications are based on microservices. These are essentially an interconnected network of services, so understanding system performance from multiple sources is a major challenge. A single call in an application can trigger dozens of events.
How can developers and engineers isolate a problem when something goes wrong or a request runs slowly?
OpenTelemetry standardizes this and also offers SDKs that allow you to collect data from different systems and in different programming languages to debug your stack at a high level.
All relevant information on the OpenTelemetry specification can be found in its official documentation.
OpenTelemetry components
- APIs and SDKs per programming language for generating and emitting traces (SDKs for Java, .NET, C++, Go, Python, JavaScript, PHP, Ruby, etc.)
- Collectors, which provide a vendor-independent implementation for receiving, processing, and exporting telemetry data.
- The OTLP protocol specification, which describes the encoding, transport, and transmission mechanism for telemetry data. You can read more in the specification itself.
Nowadays, some languages natively support passing trace contexts (trace context propagation), such as .NET Core, and many cloud providers allow importing or exporting traces from or to the cloud via the OTLP protocol.
And that's a good thing, because you can easily reuse an analytics platform and integrate your applications there, or take all the metrics and pass them to your platform.
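Context propagation itself is standardized by the W3C Trace Context specification: the caller sends a `traceparent` HTTP header, and the next service parses it to continue the trace. As a sketch of what that header carries, here is a hypothetical parsing helper (`parseTraceparent` is for illustration only; the SDK handles this for you):

```typescript
// W3C Trace Context header format: "version-traceId-spanId-traceFlags",
// e.g. "00-d6d75718930b3558e4fe0808877f8e80-3b7f9b452a7b5ddf-01"
interface TraceContext {
  version: string;      // currently "00"
  traceId: string;      // 16 bytes, hex encoded
  parentSpanId: string; // 8 bytes, hex encoded
  traceFlags: string;   // "01" = sampled
}

function parseTraceparent(header: string): TraceContext | null {
  const match = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/.exec(header);
  if (!match) return null;
  const [, version, traceId, parentSpanId, traceFlags] = match;
  return { version, traceId, parentSpanId, traceFlags };
}
```

The fetch instrumentation used later in this article injects exactly this header into outgoing requests, which is how the backend attaches its spans to the same trace ID.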
This is an example of distributed tracing from frontend to backend.
You can see all the operations over time, every detail, and the logs for each record (span): the entire request flow between Frontend > Backend > Post-Request Async processing.
This article will not show you how to integrate a full-stack tracing solution. I have a free open source workshop for that, including a fully working application for handling WebHooks.
This article is exclusively about exporting request traces from your React frontend to the backend OpenTelemetry Collector.
Frontend instrumentation
For frontend JavaScript clients, OpenTelemetry provides the main SDK opentelemetry-js. There are also several additional packages needed for instrumentation and trace export.
NOTE: If the language does not natively support tracing, use the OpenTelemetry SDK. If it does, prefer the native support!
Packages
In most cases, you do not need the full SDK; tracing requests requires the following entries in package.json
:
"dependencies": {
  "@opentelemetry/api": "1.0.4",
  "@opentelemetry/context-zone": "1.0.1",
  "@opentelemetry/exporter-trace-otlp-http": "0.27.0",
  "@opentelemetry/instrumentation": "0.27.0",
  "@opentelemetry/instrumentation-document-load": "0.27.0",
  "@opentelemetry/instrumentation-fetch": "0.27.0",
  "@opentelemetry/sdk-trace-base": "1.0.1",
  "@opentelemetry/sdk-trace-web": "1.0.1",
  "@opentelemetry/resources": "1.0.1"
}
There are other tools you can use to measure document load time or navigation between pages, but that is not the use case for full request tracing in this article! That has more to do with metrics and performance analysis.
Front-end transactions are often thought of as "loading the entire page, navigation, adding items to cart", etc. This article is about requests and treats transactions as isolated backend commands like CreateUser
or SubmitForm
that have a single responsibility.
Frontend integration
On the frontend, I mostly use the pattern of provider components
. This is a set of components that wrap around each other at the root to provide specific functionality, such as UserProvider
or EnviromentProvider
or, in our case, TraceProvider
.
**Please check the latest sources for the current integration. The API of opentelemetry-js changes over time, since it is still under active development.**
// Providers.tsx
<EnviromentContext.Provider value={providerInit}>
<EnviromentContext.Consumer>
{(state) =>
state && (
<RelayEnvironmentProvider environment={state?.env}>
<Suspense fallback={fallback ? fallback : null}>
<TraceProvider>
<UserProvider>
<ToastProvider>{children}</ToastProvider>
</UserProvider>
</TraceProvider>
</Suspense>
</RelayEnvironmentProvider>
)
}
</EnviromentContext.Consumer>
</EnviromentContext.Provider>
where <TraceProvider>
is implemented as this:
import React from "react";
import { WebTracerProvider } from '@opentelemetry/sdk-trace-web';
import { SimpleSpanProcessor } from '@opentelemetry/sdk-trace-base';
import { ZoneContextManager } from '@opentelemetry/context-zone';
import { FetchInstrumentation } from '@opentelemetry/instrumentation-fetch';
import { registerInstrumentations } from '@opentelemetry/instrumentation';
import { Resource } from '@opentelemetry/resources';
import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-http"
import {TRACES_ENDPOINT} from "../constants"
const collectorOptions = {
url: TRACES_ENDPOINT,
headers: {
"Content-Type": "application/json",
'Access-Control-Allow-Headers': '*',
'X-CSRF': '1',
},
concurrencyLimit: 10,
};
// Trace provider (main application trace)
const provider = new WebTracerProvider({
resource: new Resource({
"service.name": "Frontend",
}
)});
// Exporter (OpenTelemetry collector hidden behind a BFF proxy)
const exporter = new OTLPTraceExporter(collectorOptions);
// Instrumentation configuration for the frontend fetch API
const fetchInstrumentation = new FetchInstrumentation({
  ignoreUrls: ["https://some-ignored-url.com"],
});
fetchInstrumentation.setTracerProvider(provider);
provider.addSpanProcessor(new SimpleSpanProcessor(exporter));
provider.register({
  contextManager: new ZoneContextManager(),
});
// Registering instrumentations (reuse the configured instance,
// otherwise the ignoreUrls setting above would be lost)
registerInstrumentations({
  instrumentations: [fetchInstrumentation],
});
export type TraceProviderProps = {
children?: React.ReactNode;
};
export default function TraceProvider({ children }: TraceProviderProps) {
return (
<>
{children}
</>
);
}
There are a few important points to remember:
- Setting the application name "service.name": "Frontend" is important for monitoring the user interface.
- When configuring the exporter, send the correct headers, such as 'X-CSRF': '1', based on the backend configuration.
- Specify ignoreUrls: you do not want to trace additional system requests or third-party requests that you are not interested in.
- Specify the URL of the export endpoint, for example: https://localhost:5015/traces
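The fetch instrumentation options also accept regular expressions, and when the collector lives on a different origin you typically need propagateTraceHeaderCorsUrls as well, so the traceparent header is attached to cross-origin calls. A sketch of such a configuration (the URLs are placeholders, not part of the setup above):

```typescript
// Options object as passed to new FetchInstrumentation(...).
// All URLs below are placeholder assumptions; substitute your own.
const fetchInstrumentationConfig = {
  // Never trace the trace-export calls themselves, or analytics beacons
  ignoreUrls: [/\/traces$/, "https://some-ignored-url.com"],
  // Origins that should receive the traceparent header on CORS requests
  propagateTraceHeaderCorsUrls: [/^https:\/\/api\.example\.com/],
};
```

Ignoring the export endpoint itself matters: otherwise every exported batch of spans produces a new fetch span, which in turn gets exported again.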
Trace results
This is an example of a trace sent from the frontend to the collector:
{
"resourceSpans": [
{
"resource": {
"attributes": [
{
"key": "service.name",
"value": {
"stringValue": "Frontend"
}
},
{
"key": "telemetry.sdk.language",
"value": {
"stringValue": "webjs"
}
},
{
"key": "telemetry.sdk.name",
"value": {
"stringValue": "opentelemetry"
}
},
{
"key": "telemetry.sdk.version",
"value": {
"stringValue": "1.0.1"
}
}
],
"droppedAttributesCount": 0
},
"instrumentationLibrarySpans": [
{
"spans": [
{
"traceId": "d6d75718930b3558e4fe0808877f8e80",
"spanId": "3b7f9b452a7b5ddf",
"name": "HTTP POST",
"kind": 3,
"startTimeUnixNano": 1644389713311600000,
"endTimeUnixNano": 1644389713673100000,
"attributes": [
{
"key": "component",
"value": {
"stringValue": "fetch"
}
},
{
"key": "http.method",
"value": {
"stringValue": "POST"
}
},
{
"key": "http.url",
"value": {
"stringValue": "/graphql"
}
},
{
"key": "http.status_code",
"value": {
"intValue": 200
}
},
{
"key": "http.status_text",
"value": {
"stringValue": ""
}
},
{
"key": "http.host",
"value": {
"stringValue": "localhost:5015"
}
},
{
"key": "http.scheme",
"value": {
"stringValue": "https"
}
},
{
"key": "http.user_agent",
"value": {
"stringValue": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.82 Safari/537.36"
}
},
{
"key": "http.response_content_length",
"value": {
"intValue": 168
}
}
],
"droppedAttributesCount": 0,
"events": [
{
"timeUnixNano": 1644389713312300000,
"name": "fetchStart",
"attributes": [],
"droppedAttributesCount": 0
},
{
"timeUnixNano": 1644389713312300000,
"name": "domainLookupStart",
"attributes": [],
"droppedAttributesCount": 0
},
{
"timeUnixNano": 1644389713312300000,
"name": "domainLookupEnd",
"attributes": [],
"droppedAttributesCount": 0
},
{
"timeUnixNano": 1644389713312300000,
"name": "connectStart",
"attributes": [],
"droppedAttributesCount": 0
},
{
"timeUnixNano": 1644389713312300000,
"name": "secureConnectionStart",
"attributes": [],
"droppedAttributesCount": 0
},
{
"timeUnixNano": 1644389713312300000,
"name": "connectEnd",
"attributes": [],
"droppedAttributesCount": 0
},
{
"timeUnixNano": 1644389713314500000,
"name": "requestStart",
"attributes": [],
"droppedAttributesCount": 0
},
{
"timeUnixNano": 1644389713670100000,
"name": "responseStart",
"attributes": [],
"droppedAttributesCount": 0
},
{
"timeUnixNano": 1644389713670800100,
"name": "responseEnd",
"attributes": [],
"droppedAttributesCount": 0
}
],
"droppedEventsCount": 0,
"status": {
"code": 0
},
"links": [],
"droppedLinksCount": 0
}
],
"instrumentationLibrary": {
"name": "@opentelemetry/instrumentation-fetch",
"version": "0.27.0"
}
}
]
}
]
}
OpenTelemetry collector
To run the collector on the backend, you can use the attached docker-compose.yml
file to set up and configure a simple collector. Take this as an example; you still need to export the data from the collector to a trace analytics service. I can recommend:
- Elastic Stack - a high-performance self-hosted and cloud-hosted solution
- Jaeger tracing - self-hosted, easy to start
This article does not explain how to set up a full collector with an analytics service and storage. If you want to see a real example, you can read and try my free open source workshop on GitHub. It also covers advanced concepts such as BFF patterns and hiding the Collector and API behind a proxy.
This sample collector receives data from the source via gRPC or HTTP and exports it to a storage or analysis service via gRPC using the OTLP protocol.
Collector compose file:
version: '3'
services:
opentelemetry-collector:
container_name: opentelemetry-collector
hostname: opentelemetry-collector
image: otel/opentelemetry-collector:0.43.0
command: [ "--config=/etc/otel-collector-config.yml" ]
volumes:
- ./otel-collector-config.yml:/etc/otel-collector-config.yml
ports:
- "14250:14250"
- "55680:55680"
- "55690:55690"
networks:
- tracing
networks:
tracing:
driver: bridge
Collector config file:
receivers:
otlp:
protocols:
grpc:
endpoint: 0.0.0.0:55680
http:
endpoint: "0.0.0.0:55690"
processors:
batch:
exporters:
otlp/2:
endpoint: apm-server:8200
tls:
insecure: true
logging:
service:
pipelines:
traces:
receivers: [otlp]
processors: [batch]
exporters: [logging, otlp/2]
metrics:
receivers: [otlp]
processors: [batch]
exporters: [logging, otlp/2]
Jaeger docker-compose.yaml
version: '3'
services:
jaeger:
image: jaegertracing/all-in-one:latest
ports:
- "16686:16686"
- "14268"
- "14250"
networks:
- jaeger-example
networks:
jaeger-example: