KubeCon North America 2023 in Chicago last week was a barrage of technology and innovation all aimed at running distributed applications at scale with Kubernetes. In particular, the observability topic featured prominently in technical sessions, vendor booths, and in hallway conversations.
Observability plays a key role in gaining insights about a system's behavior in order to ensure the reliability and performance of deployed applications. Because of the complex nature of distributed applications leveraging microservices, an important aspect of observability is tracing. In this blog, we'll explore how you can harness observability tracing to create effective test scenarios easily with Skyramp.
The Anatomy of Tracing
Observability tracing involves capturing and analyzing the flow of requests as they traverse through the different components of a distributed system. Tools like Jaeger, OpenTelemetry, and Pixie can be used in Kubernetes environments to trace the execution of requests across microservices.
Two Key Components of Tracing
Spans: Spans represent individual units of work or operations within a distributed system. Spans will include metadata such as the operation name, start and end timestamps, and contextual information like tags and logs.
Traces: Traces consist of a collection of spans, forming a sequence that traces the journey of a request. Often, spans are collected and assembled into traces based on their shared trace ID.
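To make these two concepts concrete, here is a minimal sketch in Python (independent of any particular tracing library; the names are illustrative) that models spans and assembles them into traces by their shared trace ID:

```python
from collections import defaultdict
from dataclasses import dataclass, field

@dataclass
class Span:
    """One unit of work, with the metadata a tracer typically records."""
    trace_id: str
    operation: str
    start_ts: float
    end_ts: float
    tags: dict = field(default_factory=dict)

def assemble_traces(spans):
    """Group spans into traces by shared trace ID, ordered by start time."""
    traces = defaultdict(list)
    for span in spans:
        traces[span.trace_id].append(span)
    for trace in traces.values():
        trace.sort(key=lambda s: s.start_ts)
    return dict(traces)

spans = [
    Span("t1", "frontend: GET /cart", 0.0, 0.9),
    Span("t1", "cart-service: lookup", 0.1, 0.5),
    Span("t2", "frontend: GET /product", 1.0, 1.4),
]
traces = assemble_traces(spans)
print(len(traces["t1"]))  # two spans share trace ID "t1"
```

Real tracers propagate the trace ID across service boundaries in request headers, which is what lets a collector stitch spans from different microservices into one end-to-end trace.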
Traces Versus Logs
Traditional logging is still valuable from an observability standpoint. Logs offer detailed information about individual events within an application based on time, aiding in debugging and troubleshooting. However, if you've combed through the logs produced from a distributed application, you know it can be difficult to gain end-to-end visibility of a request. Observability tracing focuses on providing a higher-level view of the flow of requests through a distributed system, aiding in performance analysis and system optimization. Think of traces like enhanced logs. Both can and should be used in tandem for your observability strategy in Kubernetes.
A Sample Trace
Below is a short excerpt from a sample trace file, which can be found in Skyramp's public repo, sample-microservices.
{
  "Source": "px-operator/93bc2253481b44a32e04a523286b130a01e7a6e2168db1be5c02993f1978q75",
  "Destination": "product-catalog-service",
  "RequestBody": null,
  "ResponseBody": {
    "categories": [
      "accessories"
    ],
    "description": "Add a modern touch to your outfits with these sleek aviator sunglasses.",
    "id": "OLJCESPC7Z",
    "name": "Sunglasses",
    "picture": "/static/img/products/sunglasses.jpg",
    "priceUsd": {
      "currencyCode": "USD",
      "nanos": 990000000,
      "units": 19
    }
  },
  "Headers": {
    "Accept-Encoding": "gzip",
    "Host": "product-catalog-service:60000",
    "User-Agent": "Go-http-client/1.1"
  },
  "Method": "GET",
  "Path": "/get-product?product_id=OLJCESPC7Z",
  "Status": 200,
  "ResponseMsg": "",
  "Port": 0
},
The excerpt above represents one request in a series of requests in the trace.json file. You can see quite a bit of extended information, in the form of metadata about this request operation, as made available from a span.
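As a quick illustration, a few lines of Python can summarize entries shaped like the excerpt above into a readable request line (the summarize helper is hypothetical, not part of Skyramp):

```python
import json

# A trimmed trace entry shaped like the excerpt above.
entry_json = """
{
  "Destination": "product-catalog-service",
  "Method": "GET",
  "Path": "/get-product?product_id=OLJCESPC7Z",
  "Status": 200
}
"""

def summarize(entry: dict) -> str:
    """Render one trace entry as 'METHOD path -> destination (status)'."""
    return f'{entry["Method"]} {entry["Path"]} -> {entry["Destination"]} ({entry["Status"]})'

entry = json.loads(entry_json)
print(summarize(entry))
# GET /get-product?product_id=OLJCESPC7Z -> product-catalog-service (200)
```

Scanning a whole trace file this way gives you the end-to-end request flow at a glance, which is exactly the raw material a test generator works from.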
Generating Tests from Traces
Observability traces can give us valuable insights into the behavior of our distributed applications for troubleshooting and optimization. So, can we also use traces for generating test scenarios? With Skyramp, you can! Creating tests based on observability traces provides several advantages:
Realistic Scenarios: Traces capture real interactions between microservices, allowing you to recreate realistic scenarios for testing.
Performance Analysis: Identify performance bottlenecks and latency issues by analyzing trace data during test executions.
Error Detection: Leverage traces to detect errors and exceptions, facilitating the creation of robust test cases to handle various failure scenarios.
Regression Testing: Use trace data to establish a baseline for application behavior, enabling effective regression testing as your application evolves.
Early Detection: Catching issues early, before deploying to production, saves time and money. Using traces as a basis for test scenarios helps avoid "testing in prod."
Furthermore, these kinds of functional integration tests, known as service tests or component tests, are often overlooked and yet form an important layer in the test pyramid. This is also the area of testing where Skyramp shines.
The above image of the test pyramid is from an article on testing at martinfowler.com.
Skyramp on the Scene
Skyramp provides an array of features to help developers and platform engineers create, manage, and run test scenarios for microservices in Kubernetes clusters.
Previous blog posts have shown how to generate tests based on API schema definitions with OpenAPI or Protocol Buffers. For example, this blog covers test generation with OpenAPI: Test Generation for Distributed Apps Made Easy with Skyramp. However, there are many options available for generating tests, as you can see from the flag options in the Skyramp CLI:
skyramp tester generate <flags>
--address string destination address of tests
--alias string Kubernetes service / Docker alias name
--api-schema string path to API schema file, or URL (URL support for OpenAPI 3.x only)
--cluster-id string cluster id from telemetry provider
-n, --namespace string Kubernetes namespace where Skyramp worker resides
--openai (experimental) use OpenAI to generate test values (the 'OPENAI_API_KEY' environment variable must be set with an OpenAI API token)
--openai-model string (experimental) Optional, GPT model to use for OpenAI (one of [gpt-3.5-turbo gpt-4]). Note that some models may not be accessible based on the API token (default "gpt-3.5-turbo")
--openapi-tag string tag to filter on (for openapi protocol)
--output-prefix string prefix for generated files
--port int port number for the service
--proto-service string proto service to utilize (for protobuf protocol)
--protocol string protocol to use for the test configuration (one of [grpc rest])
--start-time string start time to retrieve traces from
--telemetry-provider string telemetry provider, currently only pixie is supported
--trace-file string trace file path
For this blog, we will focus on generating tests from observability traces. You can accomplish this based on trace data captured directly from a telemetry provider or using an exported trace file as the input.
One Step Generation
You will see in the skyramp/rest-demo folder of the sample-microservices repo referenced above that there is a trace folder containing the example trace file discussed earlier. This particular trace file was created by calling the running system in the cluster with curl commands and exporting the resulting trace.
Furthermore, the tests/test-trace-bOl4.yaml demo test description, as well as the scenarios/scenario-trace-bOl4.yaml file, were automatically generated from the example trace file using Skyramp. To illustrate, this is how to generate the same test scenario with the Skyramp CLI:
skyramp tester generate \
--trace-file trace/trace.json \
--protocol rest
Output:
Successfully created/updated the test description configuration files:
scenarios/scenario-trace-XI0m.yaml
tests/test-trace-XI0m.yaml
It really is that easy!
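Conceptually, trace-based generation maps each recorded request to a test step plus an assertion on the recorded status code. The Python sketch below is purely illustrative (the real generator emits Skyramp scenario YAML with its own schema), but it captures the idea:

```python
def trace_to_steps(trace_entries):
    """Turn recorded requests into request steps paired with status assertions.

    Illustrative only: the real generator emits Skyramp scenario YAML.
    """
    steps = []
    for i, entry in enumerate(trace_entries):
        name = f'{entry["Method"]}_{i}'
        # One step replays the recorded request...
        steps.append({"name": name, "method": entry["Method"], "path": entry["Path"]})
        # ...and the next asserts the response matches the recorded status.
        steps.append({"assert": f'requests.{name}.code == {entry["Status"]}'})
    return steps

recorded = [
    {"Method": "GET", "Path": "/get-product?product_id=OLJCESPC7Z", "Status": 200},
]
for step in trace_to_steps(recorded):
    print(step)
```

Because the expected values come straight from recorded production-like behavior, the generated assertions reflect how the system actually responds rather than how someone guessed it would.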
Another way to capture traces for generating test scenarios is by passing a telemetry provider to skyramp tester generate. In the following example, we use Pixie as the provider:
skyramp tester generate \
--telemetry-provider pixie \
--cluster-id cid1 \
--namespace ns1 \
--start-time "-5m"
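Here, --start-time takes a relative offset such as "-5m", meaning traces captured over the last five minutes. As a rough sketch of that convention (a hypothetical helper; the CLI resolves this internally), in Python:

```python
from datetime import datetime, timedelta, timezone

UNITS = {"s": "seconds", "m": "minutes", "h": "hours"}

def resolve_start(offset: str, now=None):
    """Turn a relative offset like '-5m' into an absolute UTC timestamp.

    Hypothetical helper for illustration; the CLI handles this internally.
    """
    now = now or datetime.now(timezone.utc)
    amount, unit = int(offset[:-1]), offset[-1]  # e.g. -5 and 'm'
    return now + timedelta(**{UNITS[unit]: amount})

now = datetime(2023, 11, 15, 15, 46, tzinfo=timezone.utc)
print(resolve_start("-5m", now))  # 2023-11-15 15:41:00+00:00
```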
Whichever way you choose, generating tests with Skyramp is just one step.
Executing the Tests
Once the test scenarios are generated from observability traces, we can easily execute them with Skyramp to play back the sequence captured in the trace. Here, we run the test that we generated above against our system-under-test:
skyramp tester start test-trace-bOl4
And we get our results of the test run:
Starting tests
Tester finished
Test trace-test------
[Status: finished] [Started at: 2023-11-15 15:46:00 PST] [End: 2023-11-15 15:46:00 PST] [Duration: 0s]
- pattern0.scenario_x2Dk
[Status: finished] [Started at: 2023-11-15 15:46:00 PST] [Duration: 0s]
- pattern0.scenario_x2Dk.0.POST_q9FJ
[Status: finished] [Started at: 2023-11-15 15:46:00 PST] [Duration: 0s]
Executed: {"success":"200 OK"}
- pattern0.scenario_x2Dk.1.assert
[Status: finished] [Started at: N/A]
Assert: requests.POST_q9FJ.code == 200
Passed: true
- pattern0.scenario_x2Dk.2.GET_D0WG
[Status: finished] [Started at: 2023-11-15 15:46:00 PST] [Duration: 0s]
Executed: {"user_id":"eeeee","items":[{"product_id":"OLJCESPC7Z","quantity":2}]}
- pattern0.scenario_x2Dk.3.assert
[Status: finished] [Started at: N/A]
Assert: requests.GET_D0WG.code == 200
Passed: true
- pattern0.scenario_x2Dk.4.POST_qwWE
[Status: finished] [Started at: 2023-11-15 15:46:00 PST] [Duration: 0s]
Executed: {"order_id":"149bbe85-c41c-443a-b273-53551c1b52f3","shipping_tracking_id":"00be572e-02d4-41a5-b1ac-f50c87c2cc8a","shipping_cost":{"currency_code":"USD","units":10,"nanos":100},"shipping_address":{"street_address":"1600 Amp street","city":"Mountain View","state":"CA","country":"USA"},"items":[{"item":{"product_id":"OLJCESPC7Z","quantity":2}}]}
- pattern0.scenario_x2Dk.5.assert
[Status: finished] [Started at: N/A]
Assert: requests.POST_qwWE.code == 200
Passed: true
All the tests passed, so this was a successful test run. Of course, you can modify the test scenarios manually if you choose: introduce new steps or inject alternative data. Such modifications can be valuable for troubleshooting the behavior of the microservices in the system-under-test.
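For context, each assert step in the output above evaluates a simple expression against the recorded request results. A toy evaluator in Python (hypothetical; not Skyramp's actual expression engine) captures the idea:

```python
# Results recorded during the test run, keyed by request step name.
results = {"POST_q9FJ": {"code": 200}, "GET_D0WG": {"code": 200}}

def check(assertion: str) -> bool:
    """Evaluate an expression of the form 'requests.NAME.code == VALUE'."""
    lhs, expected = assertion.split(" == ")
    _, name, fieldname = lhs.split(".")  # e.g. 'requests', 'POST_q9FJ', 'code'
    return results[name][fieldname] == int(expected)

print(check("requests.POST_q9FJ.code == 200"))  # True
```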
All of the examples above are available for you to try for yourself. Simply follow the links provided or visit the Skyramp Docs for additional guidance.
Leaving with a Trace
Observability tracing is a powerful mechanism for understanding the intricacies of distributed systems. By leveraging traces to create test scenarios in Kubernetes clusters, you can enhance the robustness and reliability of your applications. As we demonstrated, generating tests with Skyramp based on traces is easy. This proactive approach to testing ensures that your system not only meets current requirements but is also well-prepared for future challenges.
Embrace observability tracing as a fundamental aspect of your testing strategy with Skyramp, and watch as your Kubernetes applications become more resilient and performant in the face of evolving demands. Keep exploring and remember the sky's the limit. Happy tracing!
P.S. Please join the Skyramp Community Discord for any questions, comments, or feedback.