<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Souvik Dey</title>
    <description>The latest articles on DEV Community by Souvik Dey (@sadjunky).</description>
    <link>https://dev.to/sadjunky</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F337442%2Fa6a6d851-b2d8-4f36-b491-494cc9d5d38d.jpeg</url>
      <title>DEV Community: Souvik Dey</title>
      <link>https://dev.to/sadjunky</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/sadjunky"/>
    <language>en</language>
    <item>
      <title>What is Distributed Tracing?</title>
      <dc:creator>Souvik Dey</dc:creator>
      <pubDate>Mon, 20 Apr 2020 11:29:45 +0000</pubDate>
      <link>https://dev.to/deepsource/what-is-distributed-tracing-14a7</link>
      <guid>https://dev.to/deepsource/what-is-distributed-tracing-14a7</guid>
      <description>&lt;p&gt;Current and ongoing reaction on distributed tracing has been varied and divergent. There's still that strong sense that distributed tracing is a massive investment with potentially limited returns for large organizations. This notion will fade out as we're progressing forward. For engineers debugging an issue where more than a few services are involved, distributed tracing becomes an inevitably invaluable tool.&lt;/p&gt;

&lt;p&gt;So why do we care about &lt;strong&gt;distributed tracing&lt;/strong&gt;? Companies have evolved their software architecture from monoliths to microservices, and this evolution has given rise to large-scale distributed systems.&lt;/p&gt;

&lt;p&gt;We tend to face two operational challenges while dealing with distributed systems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Networking&lt;/li&gt;
&lt;li&gt;Observability&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Networking&lt;/h2&gt;

&lt;p&gt;Managing networks in a monolithic application is a fairly simple task. The path between the client and the server is finite, which allows connectivity, performance, and security to be managed across a single flow or a limited set of flows. When dealing with distributed systems, the complexity of the network increases many-fold. The network now lets us route transactions to the right place, scale up and down dynamically, and control access and authorization to disparate services. In the world of distributed systems, the path between client and application has become far more tortuous and difficult to reason about. This challenge is why &lt;strong&gt;Envoy&lt;/strong&gt;, &lt;strong&gt;Istio&lt;/strong&gt;, and &lt;strong&gt;Consul&lt;/strong&gt; are gaining traction as tools for managing distributed infrastructure connectivity.&lt;/p&gt;

&lt;h2&gt;Monolithic observability&lt;/h2&gt;

&lt;p&gt;Observability is all about understanding how transactions flow through the network and infrastructure. In a monolithic app, such as a Java application, it’s feasible to reason about the state and performance of transactions. A client makes a web request, perhaps through a load balancer, to a web or application server; a DB transaction is usually created and a record queried or updated; and a response is generated back to the client.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Hd8Wx9rQ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://assets.deepsource.io/4d07ad0/images/blog/distributed-tracing/monolith.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Hd8Wx9rQ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://assets.deepsource.io/4d07ad0/images/blog/distributed-tracing/monolith.png" alt="monolith"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Although there are hops in the entire process, transactions generally take a linear path from the client to the server.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Z8AxXdwR--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://assets.deepsource.io/4d07ad0/images/blog/distributed-tracing/monolith-trace.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Z8AxXdwR--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://assets.deepsource.io/4d07ad0/images/blog/distributed-tracing/monolith-trace.png" alt="monolith-trace"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Client initiates &lt;code&gt;a1&lt;/code&gt; request.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;a1&lt;/code&gt; request hits network.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;a1&lt;/code&gt; request terminates at load balancer.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;a2&lt;/code&gt; request originates from load balancer.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;a2&lt;/code&gt; request terminates at application server.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;a3&lt;/code&gt; request originates from application server.&lt;/li&gt;
&lt;li&gt;..&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;a3&lt;/code&gt; response terminates at client.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;It's quite possible to instrument each of the hops and visualize the state of the transaction and its performance through the monolith. &lt;/p&gt;

&lt;p&gt;More importantly, even through those hops, we can generally map a request-and-response transaction to an identifier through its life cycle in the application. From the life of that transaction, we can see what happened to it, how it performed, where the bottlenecks were, and so on.&lt;/p&gt;
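&lt;p&gt;As a toy sketch of this idea (hop names and fields are illustrative, not taken from any particular tracing tool), the hops can be stitched together under a single request identifier:&lt;/p&gt;

```python
import time
import uuid

def trace_request(hops):
    """Follow one request through a fixed sequence of hops, recording
    a timestamped event for each hop under a single request ID."""
    request_id = uuid.uuid4().hex  # one ID for the whole life cycle
    events = []
    for hop in hops:
        events.append({"request_id": request_id, "hop": hop, "ts": time.time()})
    return events

# The hops mirror the numbered steps above: client, load balancer,
# application server, and back to the client.
events = trace_request(["client", "load-balancer", "app-server", "client"])
```

&lt;p&gt;Because every event carries the same &lt;code&gt;request_id&lt;/code&gt;, the transaction can be reconstructed end to end, which is exactly what becomes hard in the distributed case.&lt;/p&gt;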

&lt;h2&gt;Distributed systems observability&lt;/h2&gt;

&lt;p&gt;Distributed systems comprise several microservices that together represent a complex entity. A transaction passes through multiple services and can trigger multiple DB transactions, spawn other transactions, or move back and forth among services.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--sih36FvX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://assets.deepsource.io/4d07ad0/images/blog/distributed-tracing/distributed.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--sih36FvX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://assets.deepsource.io/4d07ad0/images/blog/distributed-tracing/distributed.png" alt="distributed"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here the path of the transaction is quite different. Earlier we assumed that our system was located in a single virtual entity. A distributed system, on the contrary, usually exists in multiple virtual locations, consisting of services at the edge and services distributed across regions.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--tOXwJ0_J--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://assets.deepsource.io/4d07ad0/images/blog/distributed-tracing/distributed-trace.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--tOXwJ0_J--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://assets.deepsource.io/4d07ad0/images/blog/distributed-tracing/distributed-trace.png" alt="distributed-trace"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Client initiates &lt;code&gt;a0&lt;/code&gt; request.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;a0&lt;/code&gt; sends a message to Service b.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;c0&lt;/code&gt; executes a local event.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;a1&lt;/code&gt; receives a message from Service b.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;c1&lt;/code&gt; executes a local event.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;a2&lt;/code&gt; executes a local event.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;b1&lt;/code&gt; sends a message to Service c.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;c3&lt;/code&gt; sends a message to Service b.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;b3&lt;/code&gt; sends a message to Service a.&lt;/li&gt;
&lt;li&gt;etc…&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This lack of a clear linear path makes it challenging and tedious to track transactions. It's often hard to map a client's request to a transaction, since no single service handles it end to end. Relying on conventional tools for monitoring performance and spotting bottlenecks might not provide clear insight into what exactly is going on, and where.&lt;/p&gt;

&lt;h2&gt;So how does distributed tracing help us?&lt;/h2&gt;

&lt;p&gt;Tracing tracks actions or events inside our applications, recording their timing and collecting other information about the nature of the action or event. To use distributed tracing effectively, we need to instrument our code to generate traces for the actions we want to monitor, for instance an HTTP request. The &lt;strong&gt;trace&lt;/strong&gt; wraps the request and records the start and end time of the request-response cycle.&lt;/p&gt;

&lt;p&gt;A &lt;strong&gt;span&lt;/strong&gt; is the primary component of a trace. A span represents an individual unit of work done in a distributed system. Spans usually have a start and end time.&lt;/p&gt;

&lt;p&gt;A trace typically consists of more than one span. In our example above, the &lt;code&gt;a2&lt;/code&gt;, &lt;code&gt;b3&lt;/code&gt;, etc. requests are spans in a trace. The spans are linked together via a &lt;strong&gt;trace ID&lt;/strong&gt;, which makes it possible to build a view of the complete life cycle of a request as it propagates through the system.&lt;/p&gt;

&lt;p&gt;Spans can also carry user-defined annotations in the form of tags: metadata that helps us understand where a trace came from and the context in which it was generated.&lt;/p&gt;

&lt;p&gt;Finally, spans can also carry logs in the form of key:value pairs, useful for informational output from the application that sets some context or documents some specific event.&lt;/p&gt;
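&lt;p&gt;A minimal sketch of these concepts follows; it is a simplified data structure for illustration, not the API of any particular tracing library:&lt;/p&gt;

```python
import time
import uuid
from dataclasses import dataclass, field

@dataclass
class Span:
    """Minimal illustration of a span: a named unit of work carrying
    its trace ID, timing, tags, and logs."""
    operation: str
    trace_id: str  # links all spans of one request together
    span_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    start_time: float = field(default_factory=time.time)
    end_time: float = 0.0
    tags: dict = field(default_factory=dict)   # user-defined metadata
    logs: list = field(default_factory=list)   # key:value event records

    def finish(self):
        self.end_time = time.time()

# Two spans in the same trace, as in the a2/b3 example above.
trace_id = uuid.uuid4().hex
a2 = Span("a2", trace_id, tags={"service": "a"})
b3 = Span("b3", trace_id, tags={"service": "b"})
b3.logs.append({"event": "message-sent", "peer": "service-a"})
a2.finish()
b3.finish()
```

&lt;p&gt;The shared &lt;code&gt;trace_id&lt;/code&gt; is what lets a back end reassemble the spans into one request's life cycle.&lt;/p&gt;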

&lt;p&gt;The OpenTracing documentation depicts an example of a typical span that illustrates the concept.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Nwml-6-0--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://assets.deepsource.io/4d07ad0/images/blog/distributed-tracing/opentracing.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Nwml-6-0--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://assets.deepsource.io/4d07ad0/images/blog/distributed-tracing/opentracing.png" alt="opentacing"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This trace data, with its spans and span context, is then supplied to a back end, where it is indexed and stored. It is then available for querying or to be displayed in a visualization tool such as Grafana.&lt;/p&gt;

&lt;h2&gt;How does distributed tracing fit into the infrastructure and monitoring strata?&lt;/h2&gt;

&lt;p&gt;Distributed tracers are monitoring tools and frameworks that instrument distributed systems. The landscape is relatively convoluted: several companies have developed and released tools to address these issues, although the tools remain largely nascent at this stage.&lt;br&gt;
Let's look at the two principal tracing frameworks.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;OpenCensus&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;OpenTracing&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;OpenCensus and OpenTracing are both tools and frameworks. They both attempt to produce a “standard”, although not a formal one, for distributed tracing.&lt;/p&gt;

&lt;h3&gt;OpenCensus&lt;/h3&gt;

&lt;p&gt;OpenCensus is a set of APIs, language support, and a spec, based on a Google tool called Census, for collecting metrics and traces from applications and exporting them to various back ends. OpenCensus provides a common context propagation format and a consistent way to instrument applications across multiple languages.&lt;/p&gt;
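&lt;p&gt;At its core, context propagation means carrying trace identifiers across service boundaries. A minimal sketch of the idea (the header names below are made up for illustration; OpenCensus defines its own propagation formats):&lt;/p&gt;

```python
import uuid

def inject_context(headers, trace_id, span_id):
    """Copy the trace context into outgoing request headers so the
    next service can attach its spans to the same trace."""
    headers = dict(headers)  # leave the caller's headers untouched
    headers["x-trace-id"] = trace_id  # illustrative header names
    headers["x-span-id"] = span_id
    return headers

def extract_context(headers):
    """Recover the trace context on the receiving service."""
    return headers.get("x-trace-id"), headers.get("x-span-id")

trace_id, span_id = uuid.uuid4().hex, uuid.uuid4().hex
outgoing = inject_context({"content-type": "application/json"}, trace_id, span_id)
```

&lt;p&gt;Every service repeats this inject/extract step, which is why a common propagation format across languages matters.&lt;/p&gt;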

&lt;h3&gt;OpenTracing&lt;/h3&gt;

&lt;p&gt;An alternative to OpenCensus is OpenTracing, which provides a similar framework, API, and libraries for tracing. It emerged out of Zipkin to provide a vendor-agnostic, cross-platform solution for tracing. Unlike OpenCensus, it doesn’t have any support for metrics. Many of the tools mentioned here, like &lt;strong&gt;Zipkin&lt;/strong&gt;, &lt;strong&gt;Jaeger&lt;/strong&gt;, and &lt;strong&gt;Appdash&lt;/strong&gt;, have adopted OpenTracing’s specification. It’s also supported by commercial organizations like Datadog and is embraced by the Cloud Native Computing Foundation.&lt;/p&gt;

&lt;h2&gt;Tools&lt;/h2&gt;

&lt;p&gt;Let's look at a few tools that are more focused on monitoring:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Zipkin&lt;/li&gt;
&lt;li&gt;Jaeger&lt;/li&gt;
&lt;li&gt;Appdash&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;Zipkin&lt;/h3&gt;

&lt;p&gt;Zipkin, developed by &lt;strong&gt;Twitter&lt;/strong&gt;, is open source and written in Java. It supports Cassandra and Elasticsearch as back ends for storing trace data, and uses &lt;strong&gt;Thrift&lt;/strong&gt; as its communication protocol. Thrift is an RPC and communications framework developed by Facebook and now hosted by the Apache Foundation.&lt;/p&gt;

&lt;p&gt;Zipkin has a client-server architecture. It calls its clients “&lt;em&gt;reporters&lt;/em&gt;”; these are the components that instrument our applications. Reporters send data to collectors, which index the traces and pass them into storage.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--yRw1fbJT--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://assets.deepsource.io/4d07ad0/images/blog/distributed-tracing/zipkin.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--yRw1fbJT--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://assets.deepsource.io/4d07ad0/images/blog/distributed-tracing/zipkin.png" alt="zipkin"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Zipkin’s slightly different from a classic client-server app, though. To keep tracing from blocking requests, Zipkin only passes a trace ID around in-band to indicate that a trace is happening; the actual data collected by the reporter is sent to the collector asynchronously, much like many monitoring systems send metrics out-of-band. Zipkin also ships with a query interface/API and a web UI that we can use to query and explore traces.&lt;/p&gt;
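&lt;p&gt;The out-of-band reporting idea can be sketched as follows; this is a toy illustration, not Zipkin's actual reporter code:&lt;/p&gt;

```python
import queue
import threading

class Reporter:
    """Toy out-of-band reporter: the request path only enqueues the
    finished span; a background thread ships it to the collector, so
    tracing never blocks the request itself."""

    def __init__(self, send):
        self.spans = queue.Queue()
        self.send = send  # e.g. an HTTP POST to the collector
        threading.Thread(target=self._drain, daemon=True).start()

    def report(self, span):
        self.spans.put(span)  # cheap and non-blocking for the caller

    def _drain(self):
        while True:
            self.send(self.spans.get())
            self.spans.task_done()
```

&lt;p&gt;In-band, only the trace ID travels with the request; everything else takes this asynchronous path to the collector.&lt;/p&gt;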

&lt;h3&gt;Jaeger&lt;/h3&gt;

&lt;p&gt;Jaeger is the product of work at &lt;strong&gt;Uber&lt;/strong&gt;, and is incubated by the &lt;a href="https://www.cncf.io/"&gt;CNCF&lt;/a&gt;. It’s written in Go and, like Zipkin, uses &lt;strong&gt;Thrift&lt;/strong&gt; to communicate, supports Cassandra and Elasticsearch as back ends, and is fully compatible with the OpenTracing project.&lt;/p&gt;

&lt;p&gt;Jaeger works similarly to Zipkin but relies on sampling trace data to avoid being buried in information. It samples about 0.1% of instrumented requests, or 1 in 1000, using a probabilistic sampling algorithm. You can tune this rate to collect more or less data as required.&lt;/p&gt;
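&lt;p&gt;A probabilistic sampler of this kind reduces, at its core, to a weighted coin flip per trace, as in this sketch (not Jaeger's actual implementation):&lt;/p&gt;

```python
import random

def should_sample(rate=0.001):
    """Keep roughly rate * 100 percent of traces; the 0.001 default
    mirrors the 1-in-1000 collection rate described above."""
    return random.choices([True, False], weights=[rate, 1.0 - rate])[0]

# Raising the rate collects more traces; lowering it collects fewer.
sampled = sum(1 for _ in range(100000) if should_sample(rate=0.1))
```

&lt;p&gt;Sampling trades completeness for overhead: with a representative fraction of traces, the back end stays manageable while performance trends remain visible.&lt;/p&gt;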

&lt;p&gt;Like Zipkin, Jaeger has clients that instrument our code. Jaeger, though, has a local agent running on each host that receives data from the clients and forwards it in batches to the collectors. A query API and a web UI provide an interface to the trace data.&lt;/p&gt;

&lt;h3&gt;Appdash&lt;/h3&gt;

&lt;p&gt;Like Jaeger, Appdash is open source and Go-based, but was created by the team at &lt;strong&gt;Sourcegraph&lt;/strong&gt;. It also supports the OpenTracing format. It isn't as mature as the other players: it requires a bit more fiddling to get started with and lacks some of the documentation.&lt;/p&gt;

&lt;p&gt;Appdash’s architecture is reminiscent of Jaeger, with clients instrumenting your code, a local agent collecting the traces, and a central server indexing and storing the trace data.&lt;/p&gt;

&lt;p&gt;The idea behind this post is to give you a basic understanding of distributed tracing and why it is a necessity when dealing with multiple microservices. I encourage you to explore the tools mentioned above, list the major workflows in your applications, and instrument them. This will soon serve as an immensely powerful window into the end-to-end behavior and performance of your application.&lt;/p&gt;

&lt;p&gt;Head to the &lt;a href="https://deepsource.io/blog/distributed-tracing/"&gt;link&lt;/a&gt; for the original post.&lt;/p&gt;

</description>
      <category>distributedsystems</category>
      <category>devops</category>
      <category>microservices</category>
    </item>
    <item>
      <title>Redis diskless replication: What, how, why and the caveats</title>
      <dc:creator>Souvik Dey</dc:creator>
      <pubDate>Wed, 11 Mar 2020 17:20:33 +0000</pubDate>
      <link>https://dev.to/deepsource/redis-diskless-replication-what-how-why-and-the-caveats-3gfi</link>
      <guid>https://dev.to/deepsource/redis-diskless-replication-what-how-why-and-the-caveats-3gfi</guid>
      <description>&lt;p&gt;At DeepSource, we strive to run all internal infrastructure and services in High Availability mode. This ensues fault-tolerance, reliability, and resilience in deployments. Our Redis service runs as a three-node HA cluster, with one master, two slaves and a Redis Sentinel process which runs as an auxiliary process to initiate failover. Lately, we have rolled out &lt;strong&gt;diskless replication&lt;/strong&gt; as a feature into production in our Redis cluster deployment, eliminating the need for persistence in Redis master node for replication. Before diving any further, let's get into some briefing.&lt;/p&gt;

&lt;h2&gt;Why Diskless Replication?&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdeepsource.io%2Fimages%2Fblog%2Fredis-diskless-replication%2Fdiskless.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdeepsource.io%2Fimages%2Fblog%2Fredis-diskless-replication%2Fdiskless.png" alt="redis"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Diskless replication is a feature introduced in Redis version &lt;strong&gt;2.8.18&lt;/strong&gt;. Few teams have adopted it, which can be attributed to an inherent fear of breakage in production deployments. &lt;/p&gt;

&lt;p&gt;Usually, when a slave breaks down or there is a network fault between the master and the slave, the master attempts to perform a &lt;strong&gt;partial resynchronization&lt;/strong&gt; of the data to the slave. Essentially, the slave reconnects with the master and the replication proceeds incrementally, pulling the differences accumulated so far. &lt;/p&gt;

&lt;p&gt;However, when the slave is disconnected for an extended period, or is restarted, or is an entirely new slave, the master needs to perform a &lt;strong&gt;full resynchronization&lt;/strong&gt;. Conceptually this is simple: transfer the entire master data set to the slave. The slave flushes its old data set and syncs the new data from scratch. After a successful synchronization, successive changes are streamed incrementally as normal Redis commands, as the master data set itself gets modified by write commands from clients.&lt;/p&gt;

&lt;p&gt;The problem arises when bulk transfers need to be made during full resynchronizations. The master creates a child process to generate a &lt;strong&gt;Redis Database Backup (RDB)&lt;/strong&gt; file (analogous to an SQL dump file). After the child process completes the RDB file generation, the file is transferred to the slaves using non-blocking I/O from the parent process. Finally, when the transfer is complete, the slaves reload the RDB file and go online, receiving the incremental stream of new writes.&lt;/p&gt;

&lt;p&gt;However, to perform a full resynchronization, the master is required to &lt;strong&gt;1) write the RDB data to disk, and 2) read the RDB back from disk to send it to the slaves&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;With an improper setup, especially with non-local disks or imperfect kernel parameter tuning, this disk pressure can lead to latency spikes that are hard to deal with; slaves end up restarting frequently, making full resynchronizations impossible to avoid. Enter diskless replication.&lt;/p&gt;

&lt;h2&gt;What is Diskless Replication?&lt;/h2&gt;

&lt;p&gt;So what is diskless replication? It is the process of transferring the replication stream directly to socket descriptors, rather than writing it to disk and serving it from disk to the slave instances.&lt;/p&gt;

&lt;h3&gt;Serving multiple slaves concurrently&lt;/h3&gt;

&lt;p&gt;Initially, serving multiple slaves was tricky: once an RDB transfer started, newly arriving slaves had to wait for the child process to finish serving the current slave before it could move on to them.&lt;/p&gt;

&lt;p&gt;To address this problem, the &lt;code&gt;redis.conf&lt;/code&gt; file contains a parameter named &lt;code&gt;repl-diskless-sync-delay&lt;/code&gt;, specified in seconds. It sets a delay before the master's child process starts a &lt;strong&gt;mass resynchronization&lt;/strong&gt;, permitting multiple incoming slaves to join the same sync. This is important because once the transfer starts, newly arriving replicas cannot be served and are queued for the next RDB transfer, so the server waits for the delay to let more replicas arrive. The default is &lt;strong&gt;5&lt;/strong&gt; seconds.&lt;/p&gt;
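&lt;p&gt;For reference, the relevant &lt;code&gt;redis.conf&lt;/code&gt; directives look like this (in the Redis versions of this era, diskless sync is disabled by default and must be switched on explicitly):&lt;/p&gt;

```conf
# Stream the RDB straight to the slaves' sockets instead of
# writing it to disk first.
repl-diskless-sync yes

# Wait this many seconds before starting the transfer, so that more
# slaves can arrive and share the same child process (default: 5).
repl-diskless-sync-delay 5
```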

&lt;p&gt;To facilitate this, the I/O code was redesigned to serve a multitude of file descriptors concurrently; antirez devised the algorithm that resolves the problem. Moreover, to parallelize the data transfer even when blocking I/O is used, the code writes a small amount of data to each socket descriptor in a loop, so that the kernel sends packets to multiple slaves concurrently:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;while(len) {
        size_t count = len &amp;lt; 1024 ? len : 1024;
        int broken = 0;
        for (j = 0; j &amp;lt; r-&amp;gt;io.fdset.numfds; j++) {
            … error checking removed …

            /* Make sure to write 'count' bytes to the socket regardless
             * of short writes. */
            size_t nwritten = 0;
            while(nwritten != count) {
                retval = write(r-&amp;gt;io.fdset.fds[j],p+nwritten,count-nwritten);
                if (retval &amp;lt;= 0) {
                     … error checking removed …
                }
                nwritten += retval;
            }
        }
        p += count;
        len -= count;
        r-&amp;gt;io.fdset.pos += count;
        … more error checking removed …
    }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;Handling partial failures&lt;/h3&gt;

&lt;p&gt;Writing to file descriptors isn't the only dimension of this problem. A big chunk of it lies in handling a bunch of slaves without blocking the process for other incoming slaves.&lt;/p&gt;

&lt;p&gt;When the RDB transfer ends, the child needs to report back which slaves have received the RDB and can continue with the replication streaming process. The child process returns an array of slave IDs and their associated error states, enabling the parent process to log the error states of the slaves.&lt;/p&gt;

&lt;h2&gt;Caveats&lt;/h2&gt;

&lt;p&gt;The apparent problem with diskless replication is that writing to disks differs from writing to sockets.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The API is different, since the Redis Database Backup code conventionally writes to C file pointers, while our situation demands writing directly to socket descriptors.&lt;/li&gt;
&lt;li&gt;Disk writes rarely fail, short of hard I/O errors (a full disk, and so on). Sockets are a different ball game altogether: writes can be delayed because the receiver is slow and the local kernel buffer is full.&lt;/li&gt;
&lt;li&gt;Timeouts are an ever-present concern with sockets: the receiving end may stop receiving packets because of a breakdown, or the TCP connection may simply be dead.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;According to &lt;strong&gt;Salvatore Sanfilippo&lt;/strong&gt; (aka &lt;strong&gt;antirez&lt;/strong&gt;), the author of Redis, there were two options in front of him to mitigate the issue.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Generate the RDB file in memory and then perform the transfer.&lt;/li&gt;
&lt;li&gt;Write to the sockets directly and incrementally, as the RDB is being generated.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The first option was riskier, as it carried the overhead of excessive memory consumption. The feature had to target environments with slow disks but fast, high-bandwidth networks, without consuming too much memory. Hence the second option was selected.&lt;/p&gt;

&lt;h2&gt;Morphing the Redis replication landscape&lt;/h2&gt;

&lt;p&gt;The idea of replication without persistence can seem intimidating, but Redis pulled it off. Supporting replication without touching the disk removes an undesirable storage moving part; as we all know, disk I/O is slow and sluggish. Implementing this in our Kubernetes ecosystem has brought significant improvements in our I/O and caching metrics and has made our Redis deployments leaner and meaner.&lt;/p&gt;

&lt;p&gt;Here is the &lt;a href="https://deepsource.io/blog/redis-diskless-replication/" rel="noopener noreferrer"&gt;link&lt;/a&gt; to the actual post.&lt;/p&gt;

</description>
      <category>redis</category>
      <category>devops</category>
      <category>systems</category>
      <category>infrastructure</category>
    </item>
    <item>
      <title>How to setup Vault with Kubernetes</title>
      <dc:creator>Souvik Dey</dc:creator>
      <pubDate>Tue, 18 Feb 2020 06:26:32 +0000</pubDate>
      <link>https://dev.to/deepsource/how-to-setup-vault-with-kubernetes-ig9</link>
      <guid>https://dev.to/deepsource/how-to-setup-vault-with-kubernetes-ig9</guid>
      <description>&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--qOiwNoGZ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://assets.deepsource.io/3299902/images/blog/secrets-vault/hero.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--qOiwNoGZ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://assets.deepsource.io/3299902/images/blog/secrets-vault/hero.png" alt="hero"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In our not-so-ideal world, we tend to leave our application secrets (passwords, API tokens) exposed in our source code. Storing secrets in plain sight isn't such a good idea, is it? At DeepSource, we have embraced the issue by incorporating a robust secrets management system into our infrastructure from day one. This post explains how to set up secrets management in Kubernetes with HashiCorp Vault.&lt;/p&gt;

&lt;h2&gt;What is Vault?&lt;/h2&gt;

&lt;p&gt;Vault acts as a centrally managed service that handles encryption and storage of your entire infrastructure's secrets. Vault manages all secrets in secrets engines. It has a suite of secrets engines at its disposal, but for the sake of brevity, we will stick to the kv (key-value) secrets engine.&lt;/p&gt;

&lt;h2&gt;Overview&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--3viFQhAY--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://deepsource.io/images/blog/secrets-vault/vault-consul-cluster.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--3viFQhAY--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://deepsource.io/images/blog/secrets-vault/vault-consul-cluster.png" alt="Vault-Consul-Cluster"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The above design depicts a three-node Vault cluster with one active node, two standby nodes, and a Consul agent sidecar that talks on behalf of each Vault node to the five-node Consul server cluster. The architecture can also be extended to multiple availability zones, rendering your cluster highly fault-tolerant.&lt;/p&gt;

&lt;p&gt;You might be wondering why we are using a Consul server when the architecture is already complex enough to wrap your head around. Vault requires a backend to store all encrypted data at rest; this can be a filesystem backend, a cloud provider, a database, or a Consul cluster.&lt;/p&gt;

&lt;p&gt;The strength of Consul is that it is fault-tolerant and highly scalable. By using Consul as a backend to Vault, you get the best of both. Consul is used for durable storage of encrypted data at rest and provides coordination so that Vault can be highly available and fault-tolerant. Vault provides higher-level policy management, secret leasing, audit logging, and automatic revocation.&lt;/p&gt;

&lt;p&gt;The client talks to the Vault server over HTTPS; the Vault server processes the requests and forwards them to the Consul agent on a loopback address. The Consul client agents serve as an interface to the Consul servers, are very lightweight, and maintain very little state of their own. The Consul servers store the secrets encrypted at rest.&lt;/p&gt;

&lt;p&gt;The Consul server cluster is deliberately odd-numbered: the consensus protocol requires a majority of servers to maintain consistency and fault tolerance. The consensus protocol is based on &lt;a href="https://raft.github.io/raft.pdf"&gt;Raft: In Search of an Understandable Consensus Algorithm&lt;/a&gt;. For a visual explanation of Raft, you can refer to &lt;a href="http://thesecretlivesofdata.com/raft/"&gt;The Secret Lives of Data&lt;/a&gt;.&lt;/p&gt;
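&lt;p&gt;The arithmetic behind the odd-numbered cluster is simple, as this sketch shows:&lt;/p&gt;

```python
def quorum(cluster_size):
    """Majority of servers Raft needs in order to make progress."""
    return cluster_size // 2 + 1

def fault_tolerance(cluster_size):
    """Servers that can fail while a majority still survives."""
    return (cluster_size - 1) // 2

# A five-node Consul cluster needs three servers for quorum and
# tolerates two failures. A four-node cluster tolerates only one
# failure, no more than a three-node cluster does, which is why
# even cluster sizes buy nothing extra.
```

&lt;p&gt;This is why three- and five-node Consul server clusters are the common recommendation.&lt;/p&gt;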

&lt;h2&gt;Vault on Kubernetes - easing your way out of operational complexities&lt;/h2&gt;

&lt;p&gt;Almost all of DeepSource's infrastructure runs on Kubernetes. From analysis runs to VPN infrastructure, everything runs in a highly distributed environment, and Kubernetes helps us achieve that. For setting up Vault on Kubernetes, HashiCorp highly recommends using the Helm charts for Vault and Consul rather than hand-written manifests.&lt;/p&gt;

&lt;h3&gt;Prerequisites&lt;/h3&gt;

&lt;p&gt;For this setup, we'll require &lt;a href="https://kubernetes.io/docs/tasks/tools/install-kubectl/"&gt;kubectl&lt;/a&gt; and &lt;a href="https://helm.sh/docs/intro/install/"&gt;helm&lt;/a&gt; installed, along with a local &lt;a href="https://kubernetes.io/docs/tasks/tools/install-minikube/"&gt;minikube&lt;/a&gt; setup to deploy into.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;kubectl version
Client Version: version.Info&lt;span class="o"&gt;{&lt;/span&gt;Major:&lt;span class="s2"&gt;"1"&lt;/span&gt;, Minor:&lt;span class="s2"&gt;"16"&lt;/span&gt;, GitVersion:&lt;span class="s2"&gt;"v1.16.3"&lt;/span&gt;, GitCommit:&lt;span class="s2"&gt;"b3cbbae08ec52a7fc73d334838e18d17e8512749"&lt;/span&gt;, GitTreeState:&lt;span class="s2"&gt;"clean"&lt;/span&gt;, BuildDate:&lt;span class="s2"&gt;"2019-11-14T04:24:29Z"&lt;/span&gt;, GoVersion:&lt;span class="s2"&gt;"go1.12.13"&lt;/span&gt;, Compiler:&lt;span class="s2"&gt;"gc"&lt;/span&gt;, Platform:&lt;span class="s2"&gt;"darwin/amd64"&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
Server Version: version.Info&lt;span class="o"&gt;{&lt;/span&gt;Major:&lt;span class="s2"&gt;"1"&lt;/span&gt;, Minor:&lt;span class="s2"&gt;"14+"&lt;/span&gt;, GitVersion:&lt;span class="s2"&gt;"v1.14.8-gke.33"&lt;/span&gt;, GitCommit:&lt;span class="s2"&gt;"2c6d0ee462cee7609113bf9e175c107599d5213f"&lt;/span&gt;, GitTreeState:&lt;span class="s2"&gt;"clean"&lt;/span&gt;, BuildDate:&lt;span class="s2"&gt;"2020-01-15T17:47:46Z"&lt;/span&gt;, GoVersion:&lt;span class="s2"&gt;"go1.12.11b4"&lt;/span&gt;, Compiler:&lt;span class="s2"&gt;"gc"&lt;/span&gt;, Platform:&lt;span class="s2"&gt;"linux/amd64"&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;





&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;helm version
version.BuildInfo&lt;span class="o"&gt;{&lt;/span&gt;Version:&lt;span class="s2"&gt;"v3.0.1"&lt;/span&gt;, GitCommit:&lt;span class="s2"&gt;"7c22ef9ce89e0ebeb7125ba2ebf7d421f3e82ffa"&lt;/span&gt;, GitTreeState:&lt;span class="s2"&gt;"clean"&lt;/span&gt;, GoVersion:&lt;span class="s2"&gt;"go1.13.4"&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;





&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;minikube version
minikube version: v1.5.2
commit: 792dbf92a1de583fcee76f8791cff12e0c9440ad
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h3&gt;
  
  
  The setup
&lt;/h3&gt;

&lt;p&gt;Let's get minikube up and running.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;minikube start &lt;span class="nt"&gt;--memory&lt;/span&gt; 4096
😄  minikube v1.5.2 on Darwin 10.15.2
✨  Automatically selected the &lt;span class="s1"&gt;'hyperkit'&lt;/span&gt; driver &lt;span class="o"&gt;(&lt;/span&gt;alternates: &lt;span class="o"&gt;[&lt;/span&gt;virtualbox]&lt;span class="o"&gt;)&lt;/span&gt;
🔥  Creating hyperkit VM &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;CPUs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;2, &lt;span class="nv"&gt;Memory&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;4096MB, &lt;span class="nv"&gt;Disk&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;20000MB&lt;span class="o"&gt;)&lt;/span&gt; ...
🐳  Preparing Kubernetes v1.16.2 on Docker &lt;span class="s1"&gt;'18.09.9'&lt;/span&gt; ...
🚜  Pulling images ...
🚀  Launching Kubernetes ...
⌛  Waiting &lt;span class="k"&gt;for&lt;/span&gt;: apiserver
🏄  Done! kubectl is now configured to use &lt;span class="s2"&gt;"minikube"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;--memory&lt;/code&gt; flag is set to 4096 MB to ensure there is enough memory for all the resources to be deployed. The initialization process takes several minutes, as it retrieves the necessary dependencies and downloads multiple container images.&lt;/p&gt;

&lt;p&gt;Verify the status of your Minikube cluster,&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;minikube status
host: Running
kubelet: Running
apiserver: Running
kubeconfig: Configured
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;host&lt;/code&gt;, &lt;code&gt;kubelet&lt;/code&gt;, and &lt;code&gt;apiserver&lt;/code&gt; should all report &lt;code&gt;Running&lt;/code&gt;. On success, &lt;code&gt;kubectl&lt;/code&gt; is automatically configured to communicate with the newly started cluster.&lt;/p&gt;
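&lt;p&gt;To double-check that &lt;code&gt;kubectl&lt;/code&gt; is pointing at the right cluster, you can print the active context (a quick sanity check; the output below assumes the minikube cluster started above):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;$ kubectl config current-context
minikube
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;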

&lt;p&gt;The recommended way to run Vault on Kubernetes is with the &lt;a href="https://github.com/hashicorp/vault-helm"&gt;Helm chart&lt;/a&gt;. This installs and configures all the necessary components to run Vault in several different modes. Let's install the Vault Helm chart (this post deploys version 0.3.0), with pods prefixed with the name &lt;code&gt;vault&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;helm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--name&lt;/span&gt; vault &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--set&lt;/span&gt; &lt;span class="s2"&gt;"server.dev.enabled=true"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    https://github.com/hashicorp/vault-helm/archive/v0.3.0.tar.gz
NAME:   vault
LAST DEPLOYED: Fri Feb 8 11:56:33 2020
NAMESPACE: default
STATUS: DEPLOYED

RESOURCES:
..

NOTES:
..

Your release is named vault. To learn more about the release, try:

  &lt;span class="nv"&gt;$ &lt;/span&gt;helm status vault
  &lt;span class="nv"&gt;$ &lt;/span&gt;helm get vault
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;To verify, get all the pods within the default namespace:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;kubectl get pods
NAME                                    READY   STATUS    RESTARTS   AGE
vault-0                                 1/1     Running   0          80s
vault-agent-injector-5945fb98b5-tpglz   1/1     Running   0          80s
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h4&gt;
  
  
  Creating a secret
&lt;/h4&gt;

&lt;p&gt;The applications deployed in later steps expect Vault to store a username and password at the path &lt;code&gt;internal/database/config&lt;/code&gt;. Creating this secret requires enabling a kv secrets engine and writing a username and password to that path.&lt;/p&gt;

&lt;p&gt;Start an interactive shell session on the &lt;code&gt;vault-0&lt;/code&gt; pod:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;kubectl &lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="nt"&gt;-it&lt;/span&gt; vault-0 /bin/sh
/ &lt;span class="err"&gt;$&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Your system prompt is replaced with a new prompt &lt;code&gt;/ $&lt;/code&gt;. Commands issued at this prompt are executed on the &lt;code&gt;vault-0&lt;/code&gt; container.&lt;/p&gt;

&lt;p&gt;Enable kv-v2 secrets at the path internal:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;/ &lt;span class="nv"&gt;$ &lt;/span&gt;vault secrets &lt;span class="nb"&gt;enable&lt;/span&gt; &lt;span class="nt"&gt;-path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;internal kv-v2
Success! Enabled the kv-v2 secrets engine at: internal/
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Add a username and password secret at the path &lt;code&gt;internal/database/config&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;vault kv put internal/database/config &lt;span class="nv"&gt;username&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"db-readonly-username"&lt;/span&gt; &lt;span class="nv"&gt;password&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"db-secret-password"&lt;/span&gt;
Key              Value
&lt;span class="nt"&gt;---&lt;/span&gt;              &lt;span class="nt"&gt;-----&lt;/span&gt;
created_time     2019-12-20T18:17:01.719862753Z
deletion_time    n/a
destroyed        &lt;span class="nb"&gt;false
&lt;/span&gt;version          1
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Verify that the secret is defined at the path &lt;code&gt;internal/database/config&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;vault kv get internal/database/config
&lt;span class="o"&gt;======&lt;/span&gt; Metadata &lt;span class="o"&gt;======&lt;/span&gt;
Key              Value
&lt;span class="nt"&gt;---&lt;/span&gt;              &lt;span class="nt"&gt;-----&lt;/span&gt;
created_time     2019-12-20T18:17:50.930264759Z
deletion_time    n/a
destroyed        &lt;span class="nb"&gt;false
&lt;/span&gt;version          1

&lt;span class="o"&gt;======&lt;/span&gt; Data &lt;span class="o"&gt;======&lt;/span&gt;
Key         Value
&lt;span class="nt"&gt;---&lt;/span&gt;         &lt;span class="nt"&gt;-----&lt;/span&gt;
password    db-secret-password
username    db-readonly-username
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h4&gt;
  
  
  Make Kubernetes familiar for Vault
&lt;/h4&gt;

&lt;p&gt;Vault provides a &lt;a href="https://www.vaultproject.io/docs/auth/kubernetes.html"&gt;Kubernetes authentication&lt;/a&gt; method that enables clients to authenticate with a Kubernetes Service Account Token.&lt;/p&gt;

&lt;p&gt;Enable the Kubernetes authentication method:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;/ &lt;span class="nv"&gt;$ &lt;/span&gt;vault auth &lt;span class="nb"&gt;enable &lt;/span&gt;kubernetes
Success! Enabled kubernetes auth method at: kubernetes/
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Vault accepts this service token from any client within the Kubernetes cluster. During authentication, Vault verifies that the service account token is valid by querying a configured Kubernetes endpoint.&lt;/p&gt;

&lt;p&gt;Configure the Kubernetes authentication method to use the service account token, the location of the Kubernetes host, and its certificate:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;/ &lt;span class="nv"&gt;$ &lt;/span&gt;vault write auth/kubernetes/config &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nv"&gt;token_reviewer_jwt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;cat&lt;/span&gt; /var/run/secrets/kubernetes.io/serviceaccount/token&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nv"&gt;kubernetes_host&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"https://&lt;/span&gt;&lt;span class="nv"&gt;$KUBERNETES_PORT_443_TCP_ADDR&lt;/span&gt;&lt;span class="s2"&gt;:443"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nv"&gt;kubernetes_ca_cert&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;@/var/run/secrets/kubernetes.io/serviceaccount/ca.crt
Success! Data written to: auth/kubernetes/config
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;token_reviewer_jwt&lt;/code&gt; and &lt;code&gt;kubernetes_ca_cert&lt;/code&gt; values reference files written to the container by Kubernetes. The environment variable &lt;code&gt;KUBERNETES_PORT_443_TCP_ADDR&lt;/code&gt; references the internal network address of the Kubernetes host. For a client to read the secret data written earlier to &lt;code&gt;internal/database/config&lt;/code&gt;, it must be granted the read capability on the path &lt;code&gt;internal/data/database/config&lt;/code&gt; (the kv-v2 secrets engine inserts &lt;code&gt;data/&lt;/code&gt; into the API path).&lt;/p&gt;

&lt;p&gt;Write a policy named &lt;code&gt;internal-app&lt;/code&gt; that grants the read capability for secrets at the path &lt;code&gt;internal/data/database/config&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;/ &lt;span class="nv"&gt;$ &lt;/span&gt;vault policy write internal-app - &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="no"&gt;EOH&lt;/span&gt;&lt;span class="sh"&gt;
path "internal/data/database/config" {
  capabilities = ["read"]
}
&lt;/span&gt;&lt;span class="no"&gt;EOH
&lt;/span&gt;Success! Uploaded policy: internal-app
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Now, create a Kubernetes authentication role named &lt;code&gt;internal-app&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;/ &lt;span class="nv"&gt;$ &lt;/span&gt;vault write auth/kubernetes/role/internal-app &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nv"&gt;bound_service_account_names&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;internal-app &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nv"&gt;bound_service_account_namespaces&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;default &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nv"&gt;policies&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;internal-app &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nv"&gt;ttl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;24h
Success! Data written to: auth/kubernetes/role/internal-app
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The role connects the Kubernetes service account, &lt;code&gt;internal-app&lt;/code&gt;, and namespace, &lt;code&gt;default&lt;/code&gt;, with the Vault policy, &lt;code&gt;internal-app&lt;/code&gt;. The tokens returned after authentication are valid for &lt;strong&gt;24&lt;/strong&gt; hours.&lt;/p&gt;
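&lt;p&gt;To sanity-check the role before wiring up a workload, you can authenticate manually against the login endpoint. This is a hedged sketch: it must be run from a pod whose service account is &lt;code&gt;internal-app&lt;/code&gt; (not from &lt;code&gt;vault-0&lt;/code&gt;, which runs under the &lt;code&gt;vault&lt;/code&gt; service account), and a successful response returns a client token carrying the &lt;code&gt;internal-app&lt;/code&gt; policy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;# Run inside a pod that uses the internal-app service account
$ vault write auth/kubernetes/login \
        role=internal-app \
        jwt="$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)"
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;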

&lt;p&gt;Lastly, exit the vault-0 pod:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;/ &lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;exit&lt;/span&gt;
&lt;span class="err"&gt;$&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h3&gt;
  
  
  Create a Kubernetes service account
&lt;/h3&gt;

&lt;p&gt;The Vault Kubernetes authentication role defined a Kubernetes service account named &lt;code&gt;internal-app&lt;/code&gt;. This service account does not yet exist.&lt;/p&gt;

&lt;p&gt;List the service accounts currently present in the cluster; &lt;code&gt;internal-app&lt;/code&gt; is not among them yet:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;kubectl get serviceaccounts
NAME                   SECRETS   AGE
default                1         43m
vault                  1         34m
vault-agent-injector   1         34m
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Apply the service account definition to create it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;kubectl apply &lt;span class="nt"&gt;--filename&lt;/span&gt; service-account-internal-app.yml
serviceaccount/internal-app created
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Verify that the service account has been created. The name of the service account aligns with the value assigned to the &lt;code&gt;bound_service_account_names&lt;/code&gt; field when the &lt;code&gt;internal-app&lt;/code&gt; role was created during the Kubernetes authentication configuration.&lt;/p&gt;
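&lt;p&gt;To verify, list the service accounts again; the newly created &lt;code&gt;internal-app&lt;/code&gt; account should now appear (illustrative output; your &lt;code&gt;AGE&lt;/code&gt; values will differ):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;$ kubectl get serviceaccounts
NAME                   SECRETS   AGE
default                1         52m
internal-app           1         9s
vault                  1         43m
vault-agent-injector   1         43m
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;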
&lt;h4&gt;
  
  
  Secret Injection from sidecar to application
&lt;/h4&gt;

&lt;p&gt;View the deployment for the orgchart application:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;cat &lt;/span&gt;deployment-01-orgchart.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: orgchart
  labels:
    app: vault-agent-injector-demo
spec:
  selector:
    matchLabels:
      app: vault-agent-injector-demo
  replicas: 1
  template:
    metadata:
      annotations:
      labels:
        app: vault-agent-injector-demo
    spec:
      serviceAccountName: internal-app
      containers:
        - name: orgchart
          image: jweissig/app:0.0.1
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The name of the new deployment is &lt;code&gt;orgchart&lt;/code&gt;. The &lt;code&gt;spec.template.spec.serviceAccountName&lt;/code&gt; defines the service account &lt;code&gt;internal-app&lt;/code&gt; to run this container under.&lt;/p&gt;

&lt;p&gt;Apply the deployment defined in &lt;code&gt;deployment-01-orgchart.yml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;kubectl apply &lt;span class="nt"&gt;--filename&lt;/span&gt; deployment-01-orgchart.yml
deployment.apps/orgchart created
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The application runs as a pod within the &lt;code&gt;default&lt;/code&gt; namespace.&lt;/p&gt;

&lt;p&gt;Get all the pods within the &lt;code&gt;default&lt;/code&gt; namespace:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;kubectl get pods
NAME                                    READY   STATUS    RESTARTS   AGE
orgchart-69697d9598-l878s               1/1     Running   0          18s
vault-0                                 1/1     Running   0          58m
vault-agent-injector-5945fb98b5-tpglz   1/1     Running   0          58m
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The Vault-Agent injector looks for deployments that define specific annotations. None of these annotations exist within the current deployment. This means that no secrets are present on the orgchart container within the &lt;code&gt;orgchart-69697d9598-l878s&lt;/code&gt; pod.&lt;/p&gt;

&lt;p&gt;Verify that no secrets are written to the &lt;code&gt;orgchart&lt;/code&gt; container in the &lt;code&gt;orgchart-69697d9598-l878s&lt;/code&gt; pod:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;kubectl &lt;span class="nb"&gt;exec &lt;/span&gt;orgchart-69697d9598-l878s &lt;span class="nt"&gt;--container&lt;/span&gt; orgchart &lt;span class="nt"&gt;--&lt;/span&gt; &lt;span class="nb"&gt;ls&lt;/span&gt; /vault/secrets
&lt;span class="nb"&gt;ls&lt;/span&gt;: /vault/secrets: No such file or directory
&lt;span class="nb"&gt;command &lt;/span&gt;terminated with &lt;span class="nb"&gt;exit &lt;/span&gt;code 1
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The deployment is running the pod with the internal-app Kubernetes service account in the default namespace. The Vault Agent injector only modifies a deployment if it contains a very specific set of annotations. An existing deployment may have its definition patched to include the necessary annotations.&lt;/p&gt;

&lt;p&gt;View the deployment patch &lt;code&gt;deployment-02-inject-secrets.yml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;cat &lt;/span&gt;deployment-02-inject-secrets.yml
spec:
  template:
    metadata:
      annotations:
        vault.hashicorp.com/agent-inject: &lt;span class="s2"&gt;"true"&lt;/span&gt;
        vault.hashicorp.com/role: &lt;span class="s2"&gt;"internal-app"&lt;/span&gt;
        vault.hashicorp.com/agent-inject-secret-database-config.txt: &lt;span class="s2"&gt;"internal/data/database/config"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;These &lt;a href="https://www.vaultproject.io/docs/platform/k8s/injector/index.html#annotations"&gt;annotations&lt;/a&gt; define a partial structure of the deployment schema and are prefixed with &lt;code&gt;vault.hashicorp.com&lt;/code&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;agent-inject&lt;/code&gt; enables the Vault Agent injector service&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;role&lt;/code&gt; is the Vault Kubernetes authentication role, which maps back to the Kubernetes service account&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;agent-inject-secret-FILEPATH&lt;/code&gt; prefixes the name of the file, &lt;code&gt;database-config.txt&lt;/code&gt;, written to &lt;code&gt;/vault/secrets&lt;/code&gt;. The value is the path to the secret defined in Vault.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Patch the orgchart deployment defined in &lt;code&gt;deployment-02-inject-secrets.yml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;kubectl patch deployment orgchart &lt;span class="nt"&gt;--patch&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;cat &lt;/span&gt;deployment-02-inject-secrets.yml&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
deployment.apps/orgchart patched
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The new pod launches two containers: the application container, named &lt;code&gt;orgchart&lt;/code&gt;, and the Vault Agent sidecar container, named &lt;code&gt;vault-agent&lt;/code&gt;.&lt;/p&gt;
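&lt;p&gt;You can confirm this by listing the pods again; the patched deployment rolls out a new pod that reports two ready containers (illustrative output; the pod name suffix and ages will differ):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;$ kubectl get pods
NAME                                    READY   STATUS    RESTARTS   AGE
orgchart-599cb74d9c-s8hhm               2/2     Running   0          23s
vault-0                                 1/1     Running   0          78m
vault-agent-injector-5945fb98b5-tpglz   1/1     Running   0          78m
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;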

&lt;p&gt;View the logs of the vault-agent container in the &lt;code&gt;orgchart-599cb74d9c-s8hhm&lt;/code&gt; pod:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;kubectl logs orgchart-599cb74d9c-s8hhm &lt;span class="nt"&gt;--container&lt;/span&gt; vault-agent
&lt;span class="o"&gt;==&amp;gt;&lt;/span&gt; Vault server started! Log data will stream &lt;span class="k"&gt;in &lt;/span&gt;below:

&lt;span class="o"&gt;==&amp;gt;&lt;/span&gt; Vault agent configuration:

                     Cgo: disabled
               Log Level: info
                 Version: Vault v1.3.1

2019-12-20T19:52:36.658Z &lt;span class="o"&gt;[&lt;/span&gt;INFO]  sink.file: creating file sink
2019-12-20T19:52:36.659Z &lt;span class="o"&gt;[&lt;/span&gt;INFO]  sink.file: file sink configured: &lt;span class="nv"&gt;path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/home/vault/.token &lt;span class="nv"&gt;mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nt"&gt;-rw-r-----&lt;/span&gt;
2019-12-20T19:52:36.659Z &lt;span class="o"&gt;[&lt;/span&gt;INFO]  template.server: starting template server
2019/12/20 19:52:36.659812 &lt;span class="o"&gt;[&lt;/span&gt;INFO] &lt;span class="o"&gt;(&lt;/span&gt;runner&lt;span class="o"&gt;)&lt;/span&gt; creating new runner &lt;span class="o"&gt;(&lt;/span&gt;dry: &lt;span class="nb"&gt;false&lt;/span&gt;, once: &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
2019/12/20 19:52:36.660237 &lt;span class="o"&gt;[&lt;/span&gt;INFO] &lt;span class="o"&gt;(&lt;/span&gt;runner&lt;span class="o"&gt;)&lt;/span&gt; creating watcher
2019-12-20T19:52:36.660Z &lt;span class="o"&gt;[&lt;/span&gt;INFO]  auth.handler: starting auth handler
2019-12-20T19:52:36.660Z &lt;span class="o"&gt;[&lt;/span&gt;INFO]  auth.handler: authenticating
2019-12-20T19:52:36.660Z &lt;span class="o"&gt;[&lt;/span&gt;INFO]  sink.server: starting sink server
2019-12-20T19:52:36.679Z &lt;span class="o"&gt;[&lt;/span&gt;INFO]  auth.handler: authentication successful, sending token to sinks
2019-12-20T19:52:36.680Z &lt;span class="o"&gt;[&lt;/span&gt;INFO]  auth.handler: starting renewal process
2019-12-20T19:52:36.681Z &lt;span class="o"&gt;[&lt;/span&gt;INFO]  sink.file: token written: &lt;span class="nv"&gt;path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/home/vault/.token
2019-12-20T19:52:36.681Z &lt;span class="o"&gt;[&lt;/span&gt;INFO]  template.server: template server received new token
2019/12/20 19:52:36.681133 &lt;span class="o"&gt;[&lt;/span&gt;INFO] &lt;span class="o"&gt;(&lt;/span&gt;runner&lt;span class="o"&gt;)&lt;/span&gt; stopping
2019/12/20 19:52:36.681160 &lt;span class="o"&gt;[&lt;/span&gt;INFO] &lt;span class="o"&gt;(&lt;/span&gt;runner&lt;span class="o"&gt;)&lt;/span&gt; creating new runner &lt;span class="o"&gt;(&lt;/span&gt;dry: &lt;span class="nb"&gt;false&lt;/span&gt;, once: &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
2019/12/20 19:52:36.681285 &lt;span class="o"&gt;[&lt;/span&gt;INFO] &lt;span class="o"&gt;(&lt;/span&gt;runner&lt;span class="o"&gt;)&lt;/span&gt; creating watcher
2019/12/20 19:52:36.681342 &lt;span class="o"&gt;[&lt;/span&gt;INFO] &lt;span class="o"&gt;(&lt;/span&gt;runner&lt;span class="o"&gt;)&lt;/span&gt; starting
2019-12-20T19:52:36.692Z &lt;span class="o"&gt;[&lt;/span&gt;INFO]  auth.handler: renewed auth token
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Vault Agent manages the token lifecycle and the secret retrieval. The secret is rendered in the orgchart container at the path &lt;code&gt;/vault/secrets/database-config.txt&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Finally, view the secret written to the orgchart container:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;kubectl &lt;span class="nb"&gt;exec &lt;/span&gt;orgchart-599cb74d9c-s8hhm &lt;span class="nt"&gt;--container&lt;/span&gt; orgchart &lt;span class="nt"&gt;--&lt;/span&gt; &lt;span class="nb"&gt;cat&lt;/span&gt; /vault/secrets/database-config.txt
data: map[password:db-secret-password username:db-readonly-user]
metadata: map[created_time:2019-12-20T18:17:50.930264759Z deletion_time: destroyed:false version:2]
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The secret is now present in the container. Injected secrets can further be templatized to suit the application's needs.&lt;/p&gt;
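&lt;p&gt;As a sketch of that templating, the injector also supports an &lt;code&gt;agent-inject-template-FILEPATH&lt;/code&gt; annotation, which accepts a Consul Template snippet controlling how the rendered file looks. The patch below is illustrative (the &lt;code&gt;postgres:5432/wizard&lt;/code&gt; connection target is a hypothetical example) and would render the same secret as a connection string:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight yaml"&gt;&lt;code&gt;spec:
  template:
    metadata:
      annotations:
        vault.hashicorp.com/agent-inject: "true"
        vault.hashicorp.com/role: "internal-app"
        vault.hashicorp.com/agent-inject-secret-database-config.txt: "internal/data/database/config"
        vault.hashicorp.com/agent-inject-template-database-config.txt: |
          {{- with secret "internal/data/database/config" -}}
          postgresql://{{ .Data.data.username }}:{{ .Data.data.password }}@postgres:5432/wizard
          {{- end -}}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;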

&lt;p&gt;By now, you should have a clear sense of Vault's significance in a highly dynamic, cloud-native infrastructure: it removes the operational overhead of managing application and service secrets, and it lets your infrastructure scale gracefully.&lt;/p&gt;

&lt;p&gt;Here is the &lt;a href="https://deepsource.io/blog/setup-vault-kubernetes/"&gt;link&lt;/a&gt; to the original post.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>security</category>
      <category>architecture</category>
      <category>devops</category>
    </item>
  </channel>
</rss>
