<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Souvik Dey</title>
    <description>The latest articles on DEV Community by Souvik Dey (@sadjunky).</description>
    <link>https://dev.to/sadjunky</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F337442%2Fa6a6d851-b2d8-4f36-b491-494cc9d5d38d.jpeg</url>
      <title>DEV Community: Souvik Dey</title>
      <link>https://dev.to/sadjunky</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/sadjunky"/>
    <language>en</language>
    <item>
      <title>What is Distributed Tracing?</title>
      <dc:creator>Souvik Dey</dc:creator>
      <pubDate>Mon, 20 Apr 2020 11:29:45 +0000</pubDate>
      <link>https://dev.to/deepsource/what-is-distributed-tracing-14a7</link>
      <guid>https://dev.to/deepsource/what-is-distributed-tracing-14a7</guid>
      <description>&lt;p&gt;Current and ongoing reaction on distributed tracing has been varied and divergent. There's still that strong sense that distributed tracing is a massive investment with potentially limited returns for large organizations. This notion will fade out as we're progressing forward. For engineers debugging an issue where more than a few services are involved, distributed tracing becomes an inevitably invaluable tool.&lt;/p&gt;

&lt;p&gt;So why do we care about &lt;strong&gt;distributed tracing&lt;/strong&gt;? Companies have evolved their software architecture from monoliths to microservices, and this evolution has given rise to large-scale distributed systems.&lt;/p&gt;

&lt;p&gt;We tend to face two operational challenges while dealing with distributed systems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Networking&lt;/li&gt;
&lt;li&gt;Observability&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Networking&lt;/h2&gt;

&lt;p&gt;Managing networks in a monolithic application is a fairly simple task. The path between the client and the server is finite, which allows connectivity, performance, and security to be managed across a single flow or a limited set of flows. When dealing with distributed systems, the complexity of the network increases many-fold. The network now lets us route transactions to the right place, scale up and down dynamically, and control access and authorization to disparate services. In the world of distributed systems, the path between client and application has become far more tortuous and difficult to reason about. This challenge is why &lt;strong&gt;Envoy&lt;/strong&gt;, &lt;strong&gt;Istio&lt;/strong&gt;, and &lt;strong&gt;Consul&lt;/strong&gt; are gaining traction as tools for managing distributed infrastructure connectivity.&lt;/p&gt;

&lt;h2&gt;Monolithic observability&lt;/h2&gt;

&lt;p&gt;Observability is all about understanding how transactions flow through the network and infrastructure. In a monolithic app, such as a Java application, it’s feasible to reason about the state and performance of transactions. A client makes a web request, perhaps through a load balancer, to a web or application server; a DB transaction is usually created and a record queried or updated; and a response is generated back to the client.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Hd8Wx9rQ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://assets.deepsource.io/4d07ad0/images/blog/distributed-tracing/monolith.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Hd8Wx9rQ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://assets.deepsource.io/4d07ad0/images/blog/distributed-tracing/monolith.png" alt="monolith"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Although there are hops in the entire process, transactions generally take a linear path from the client to the server.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Z8AxXdwR--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://assets.deepsource.io/4d07ad0/images/blog/distributed-tracing/monolith-trace.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Z8AxXdwR--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://assets.deepsource.io/4d07ad0/images/blog/distributed-tracing/monolith-trace.png" alt="monolith-trace"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Client initiates &lt;code&gt;a1&lt;/code&gt; request.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;a1&lt;/code&gt; request hits network.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;a1&lt;/code&gt; request terminates at load balancer.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;a2&lt;/code&gt; request originates from load balancer.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;a2&lt;/code&gt; request terminates at application server.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;a3&lt;/code&gt; request originates from application server.&lt;/li&gt;
&lt;li&gt;..&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;a3&lt;/code&gt; response terminates at client.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;It's quite possible to instrument each of the hops and visualize the state of the transaction and its performance through the monolith. &lt;/p&gt;

&lt;p&gt;More importantly, even through those hops, we can generally map a request-and-response transaction to an identifier through its life cycle in the application. From the life of that transaction, we can see what happened to it, how it performed, where the bottlenecks were, and so on.&lt;/p&gt;
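&lt;p&gt;As a toy sketch of this idea (hop names and fields are illustrative, not taken from any particular tracing tool), the hops can be stitched together under a single request identifier:&lt;/p&gt;

```python
import time
import uuid

def trace_request(hops):
    """Follow one request through a fixed sequence of hops, recording
    a timestamped event for each hop under a single request ID."""
    request_id = uuid.uuid4().hex  # one ID for the whole life cycle
    events = []
    for hop in hops:
        events.append({"request_id": request_id, "hop": hop, "ts": time.time()})
    return events

# The hops mirror the numbered steps above: client, load balancer,
# application server, and back to the client.
events = trace_request(["client", "load-balancer", "app-server", "client"])
```

&lt;p&gt;Because every event carries the same &lt;code&gt;request_id&lt;/code&gt;, the transaction can be reconstructed end to end, which is exactly what becomes hard in the distributed case.&lt;/p&gt;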

&lt;h2&gt;Distributed systems observability&lt;/h2&gt;

&lt;p&gt;Distributed systems comprise several microservices that together represent a complex entity. A transaction passes through multiple services and can trigger multiple DB transactions, spawn other transactions, or move back and forth among services.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--sih36FvX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://assets.deepsource.io/4d07ad0/images/blog/distributed-tracing/distributed.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--sih36FvX--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://assets.deepsource.io/4d07ad0/images/blog/distributed-tracing/distributed.png" alt="distributed"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here the path of the transaction is quite different. Earlier we assumed that our system was located in a single virtual entity. A distributed system, on the contrary, usually exists in multiple virtual locations, consisting of services at the edge and services distributed across regions.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--tOXwJ0_J--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://assets.deepsource.io/4d07ad0/images/blog/distributed-tracing/distributed-trace.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--tOXwJ0_J--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://assets.deepsource.io/4d07ad0/images/blog/distributed-tracing/distributed-trace.png" alt="distributed-trace"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Client initiates &lt;code&gt;a0&lt;/code&gt; request.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;a0&lt;/code&gt; sends a message to Service b.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;c0&lt;/code&gt; executes a local event.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;a1&lt;/code&gt; receives a message from Service b.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;c1&lt;/code&gt; executes a local event.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;a2&lt;/code&gt; executes a local event.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;b1&lt;/code&gt; sends a message to Service c.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;c3&lt;/code&gt; sends a message to Service b.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;b3&lt;/code&gt; sends a message to Service a.&lt;/li&gt;
&lt;li&gt;etc…&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This lack of a clear linear path makes it challenging and tedious to track transactions. It's often hard to map a client's request to a transaction, since no single service handles it end to end. Relying on conventional tools for monitoring performance and spotting bottlenecks might not provide clear insight into what exactly is going on, and where.&lt;/p&gt;

&lt;h2&gt;So how does distributed tracing help us?&lt;/h2&gt;

&lt;p&gt;Tracing tracks actions or events inside our applications, recording their timing and collecting other information about the nature of the action or event. To use distributed tracing effectively, we need to instrument our code to generate traces for the actions we want to monitor, for instance an HTTP request. The &lt;strong&gt;trace&lt;/strong&gt; wraps the request and records the start and end time of the request-response cycle.&lt;/p&gt;

&lt;p&gt;A &lt;strong&gt;span&lt;/strong&gt; is the primary component of a trace. A span represents an individual unit of work done in a distributed system. Spans usually have a start and end time.&lt;/p&gt;

&lt;p&gt;A trace typically consists of more than one span. In our example above, the &lt;code&gt;a2&lt;/code&gt;, &lt;code&gt;b3&lt;/code&gt;, etc. requests are spans in a trace. The spans are linked together via a &lt;strong&gt;trace ID&lt;/strong&gt;, which makes it possible to build a view of the complete life cycle of a request as it propagates through the system.&lt;/p&gt;

&lt;p&gt;Spans can also carry user-defined annotations in the form of tags: metadata that helps us understand where a trace came from and the context in which it was generated.&lt;/p&gt;

&lt;p&gt;Finally, spans can also carry logs in the form of key:value pairs, useful for informational output from the application that sets some context or documents some specific event.&lt;/p&gt;
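&lt;p&gt;A minimal sketch of these concepts follows; it is a simplified data structure for illustration, not the API of any particular tracing library:&lt;/p&gt;

```python
import time
import uuid
from dataclasses import dataclass, field

@dataclass
class Span:
    """Minimal illustration of a span: a named unit of work carrying
    its trace ID, timing, tags, and logs."""
    operation: str
    trace_id: str  # links all spans of one request together
    span_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    start_time: float = field(default_factory=time.time)
    end_time: float = 0.0
    tags: dict = field(default_factory=dict)   # user-defined metadata
    logs: list = field(default_factory=list)   # key:value event records

    def finish(self):
        self.end_time = time.time()

# Two spans in the same trace, as in the a2/b3 example above.
trace_id = uuid.uuid4().hex
a2 = Span("a2", trace_id, tags={"service": "a"})
b3 = Span("b3", trace_id, tags={"service": "b"})
b3.logs.append({"event": "message-sent", "peer": "service-a"})
a2.finish()
b3.finish()
```

&lt;p&gt;The shared &lt;code&gt;trace_id&lt;/code&gt; is what lets a back end reassemble the spans into one request's life cycle.&lt;/p&gt;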

&lt;p&gt;The OpenTracing documentation depicts an example of a typical span that illustrates the concept.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Nwml-6-0--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://assets.deepsource.io/4d07ad0/images/blog/distributed-tracing/opentracing.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Nwml-6-0--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://assets.deepsource.io/4d07ad0/images/blog/distributed-tracing/opentracing.png" alt="opentacing"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This trace data, with its spans and span context, is then supplied to a back end, where it is indexed and stored. It is then available for querying or to be displayed in a visualization tool such as Grafana.&lt;/p&gt;

&lt;h2&gt;How does distributed tracing fit into the infrastructure and monitoring strata?&lt;/h2&gt;

&lt;p&gt;Distributed tracers are monitoring tools and frameworks that instrument distributed systems. The landscape is relatively convoluted: several companies have developed and released tools to address these issues, although the tools remain largely nascent at this stage.&lt;br&gt;
Let's look at the two principal tracing frameworks.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;OpenCensus&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;OpenTracing&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;OpenCensus and OpenTracing are both tools and frameworks. They both attempt to produce a “standard”, although not a formal one, for distributed tracing.&lt;/p&gt;

&lt;h3&gt;OpenCensus&lt;/h3&gt;

&lt;p&gt;OpenCensus is a set of APIs, language support, and a spec, based on a Google tool called Census, for collecting metrics and traces from applications and exporting them to various back ends. OpenCensus provides a common context propagation format and a consistent way to instrument applications across multiple languages.&lt;/p&gt;
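&lt;p&gt;At its core, context propagation means carrying trace identifiers across service boundaries. A minimal sketch of the idea (the header names below are made up for illustration; OpenCensus defines its own propagation formats):&lt;/p&gt;

```python
import uuid

def inject_context(headers, trace_id, span_id):
    """Copy the trace context into outgoing request headers so the
    next service can attach its spans to the same trace."""
    headers = dict(headers)  # leave the caller's headers untouched
    headers["x-trace-id"] = trace_id  # illustrative header names
    headers["x-span-id"] = span_id
    return headers

def extract_context(headers):
    """Recover the trace context on the receiving service."""
    return headers.get("x-trace-id"), headers.get("x-span-id")

trace_id, span_id = uuid.uuid4().hex, uuid.uuid4().hex
outgoing = inject_context({"content-type": "application/json"}, trace_id, span_id)
```

&lt;p&gt;Every service repeats this inject/extract step, which is why a common propagation format across languages matters.&lt;/p&gt;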

&lt;h3&gt;OpenTracing&lt;/h3&gt;

&lt;p&gt;An alternative to OpenCensus is OpenTracing, which provides a similar framework, API, and libraries for tracing. It emerged out of Zipkin to provide a vendor-agnostic, cross-platform solution for tracing. Unlike OpenCensus, it doesn’t have any support for metrics. Many of the tools mentioned here, like &lt;strong&gt;Zipkin&lt;/strong&gt;, &lt;strong&gt;Jaeger&lt;/strong&gt;, and &lt;strong&gt;Appdash&lt;/strong&gt;, have adopted OpenTracing’s specification. It’s also supported by commercial organizations like Datadog and is embraced by the Cloud Native Computing Foundation.&lt;/p&gt;

&lt;h2&gt;Tools&lt;/h2&gt;

&lt;p&gt;Let's look at a few tools that are more focused on monitoring:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Zipkin&lt;/li&gt;
&lt;li&gt;Jaeger&lt;/li&gt;
&lt;li&gt;Appdash&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;Zipkin&lt;/h3&gt;

&lt;p&gt;Zipkin, developed by &lt;strong&gt;Twitter&lt;/strong&gt;, is open source and written in Java. It supports Cassandra and Elasticsearch as back ends for storing trace data, and uses &lt;strong&gt;Thrift&lt;/strong&gt; as its communication protocol. Thrift is an RPC and communications framework developed by Facebook and now hosted by the Apache Foundation.&lt;/p&gt;

&lt;p&gt;Zipkin has a client-server architecture. It calls its clients “&lt;em&gt;reporters&lt;/em&gt;”; these are the components that instrument our applications. Reporters send data to collectors, which index the traces and pass them into storage.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--yRw1fbJT--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://assets.deepsource.io/4d07ad0/images/blog/distributed-tracing/zipkin.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--yRw1fbJT--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://assets.deepsource.io/4d07ad0/images/blog/distributed-tracing/zipkin.png" alt="zipkin"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Zipkin’s slightly different from a classic client-server app, though. To keep tracing from blocking requests, Zipkin only passes a trace ID around in-band to indicate that a trace is happening; the actual data collected by the reporter is sent to the collector asynchronously, much like many monitoring systems send metrics out-of-band. Zipkin also ships with a query interface/API and a web UI that we can use to query and explore traces.&lt;/p&gt;
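&lt;p&gt;The out-of-band reporting idea can be sketched as follows; this is a toy illustration, not Zipkin's actual reporter code:&lt;/p&gt;

```python
import queue
import threading

class Reporter:
    """Toy out-of-band reporter: the request path only enqueues the
    finished span; a background thread ships it to the collector, so
    tracing never blocks the request itself."""

    def __init__(self, send):
        self.spans = queue.Queue()
        self.send = send  # e.g. an HTTP POST to the collector
        threading.Thread(target=self._drain, daemon=True).start()

    def report(self, span):
        self.spans.put(span)  # cheap and non-blocking for the caller

    def _drain(self):
        while True:
            self.send(self.spans.get())
            self.spans.task_done()
```

&lt;p&gt;In-band, only the trace ID travels with the request; everything else takes this asynchronous path to the collector.&lt;/p&gt;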

&lt;h3&gt;Jaeger&lt;/h3&gt;

&lt;p&gt;Jaeger is the product of work at &lt;strong&gt;Uber&lt;/strong&gt;, and is incubated by the &lt;a href="https://www.cncf.io/"&gt;CNCF&lt;/a&gt;. It’s written in Go and, like Zipkin, uses &lt;strong&gt;Thrift&lt;/strong&gt; to communicate, supports Cassandra and Elasticsearch as back ends, and is fully compatible with the OpenTracing project.&lt;/p&gt;

&lt;p&gt;Jaeger works similarly to Zipkin but relies on sampling trace data to avoid being buried in information. It samples about 0.1% of instrumented requests, or 1 in 1000, using a probabilistic sampling algorithm. You can tune this rate to collect more or less data as required.&lt;/p&gt;
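&lt;p&gt;A probabilistic sampler of this kind reduces, at its core, to a weighted coin flip per trace, as in this sketch (not Jaeger's actual implementation):&lt;/p&gt;

```python
import random

def should_sample(rate=0.001):
    """Keep roughly rate * 100 percent of traces; the 0.001 default
    mirrors the 1-in-1000 collection rate described above."""
    return random.choices([True, False], weights=[rate, 1.0 - rate])[0]

# Raising the rate collects more traces; lowering it collects fewer.
sampled = sum(1 for _ in range(100000) if should_sample(rate=0.1))
```

&lt;p&gt;Sampling trades completeness for overhead: with a representative fraction of traces, the back end stays manageable while performance trends remain visible.&lt;/p&gt;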

&lt;p&gt;Like Zipkin, Jaeger has clients that instrument our code. Jaeger, though, has a local agent running on each host that receives data from the clients and forwards it in batches to the collectors. A query API and a web UI provide an interface to the trace data.&lt;/p&gt;

&lt;h3&gt;Appdash&lt;/h3&gt;

&lt;p&gt;Like Jaeger, Appdash is open source and Go-based, but was created by the team at &lt;strong&gt;Sourcegraph&lt;/strong&gt;. It also supports the OpenTracing format. It isn't as mature as the other players: it requires a bit more fiddling to get started with and lacks some of the documentation.&lt;/p&gt;

&lt;p&gt;Appdash’s architecture is reminiscent of Jaeger, with clients instrumenting your code, a local agent collecting the traces, and a central server indexing and storing the trace data.&lt;/p&gt;

&lt;p&gt;The idea behind this post is to give you a basic understanding of distributed tracing and why it is a necessity when dealing with multiple microservices. I encourage you to explore the tools mentioned above, list the major workflows in your applications, and instrument them. This will soon serve as an immensely powerful window into the end-to-end behavior and performance of your application.&lt;/p&gt;

&lt;p&gt;Head to the &lt;a href="https://deepsource.io/blog/distributed-tracing/"&gt;link&lt;/a&gt; for the original post.&lt;/p&gt;

</description>
      <category>distributedsystems</category>
      <category>devops</category>
      <category>microservices</category>
    </item>
    <item>
      <title>Redis diskless replication: What, how, why and the caveats</title>
      <dc:creator>Souvik Dey</dc:creator>
      <pubDate>Wed, 11 Mar 2020 17:20:33 +0000</pubDate>
      <link>https://dev.to/deepsource/redis-diskless-replication-what-how-why-and-the-caveats-3gfi</link>
      <guid>https://dev.to/deepsource/redis-diskless-replication-what-how-why-and-the-caveats-3gfi</guid>
      <description>&lt;p&gt;At DeepSource, we strive to run all internal infrastructure and services in High Availability mode. This ensues fault-tolerance, reliability, and resilience in deployments. Our Redis service runs as a three-node HA cluster, with one master, two slaves and a Redis Sentinel process which runs as an auxiliary process to initiate failover. Lately, we have rolled out &lt;strong&gt;diskless replication&lt;/strong&gt; as a feature into production in our Redis cluster deployment, eliminating the need for persistence in Redis master node for replication. Before diving any further, let's get into some briefing.&lt;/p&gt;

&lt;h2&gt;Why Diskless Replication?&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdeepsource.io%2Fimages%2Fblog%2Fredis-diskless-replication%2Fdiskless.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdeepsource.io%2Fimages%2Fblog%2Fredis-diskless-replication%2Fdiskless.png" alt="redis"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Diskless replication is a feature introduced in Redis version &lt;strong&gt;2.8.18&lt;/strong&gt;. Few teams have adopted it, which can be attributed to an inherent fear of breakage in production deployments. &lt;/p&gt;

&lt;p&gt;Usually, when a slave breaks down or there is a network fault between the master and the slave, the master attempts to perform a &lt;strong&gt;partial resynchronization&lt;/strong&gt; of the data to the slave. Essentially, the slave reconnects with the master and the replication proceeds incrementally, pulling the differences accumulated so far. &lt;/p&gt;

&lt;p&gt;However, when the slave is disconnected for an extended period, or is restarted, or is an entirely new slave, the master needs to perform a &lt;strong&gt;full resynchronization&lt;/strong&gt;. Conceptually this is simple: transfer the entire master data set to the slave. The slave flushes its old data set and syncs the new data from scratch. After a successful synchronization, successive changes are streamed incrementally as normal Redis commands, as the master data set itself gets modified by write commands from clients.&lt;/p&gt;

&lt;p&gt;The problem arises when bulk transfers need to be made during full resynchronizations. The master creates a child process to generate a &lt;strong&gt;Redis Database Backup (RDB)&lt;/strong&gt; file (analogous to an SQL dump file). After the child process completes the RDB file generation, the file is transferred to the slaves using non-blocking I/O from the parent process. Finally, when the transfer is complete, the slaves reload the RDB file and go online, receiving the incremental stream of new writes.&lt;/p&gt;

&lt;p&gt;However, to perform a full resynchronization, the master is required to &lt;strong&gt;1) write the RDB data to disk, and 2) read the RDB back from disk to send it to the slaves&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;With an improper setup, especially with non-local disks or imperfect kernel parameter tuning, this disk pressure can lead to latency spikes that are hard to deal with; slaves end up restarting frequently, making full resynchronizations impossible to avoid. Enter diskless replication.&lt;/p&gt;

&lt;h2&gt;What is Diskless Replication?&lt;/h2&gt;

&lt;p&gt;So what is diskless replication? It is the process of transferring the replication stream directly to socket descriptors, rather than writing it to disk and serving it from disk to the slave instances.&lt;/p&gt;

&lt;h3&gt;Serving multiple slaves concurrently&lt;/h3&gt;

&lt;p&gt;Initially, serving multiple slaves was tricky: once an RDB transfer started, newly arriving slaves had to wait for the child process to finish serving the current slave before it could move on to them.&lt;/p&gt;

&lt;p&gt;To address this problem, the &lt;code&gt;redis.conf&lt;/code&gt; file contains a parameter named &lt;code&gt;repl-diskless-sync-delay&lt;/code&gt;, specified in seconds. It sets a delay before the master's child process starts a &lt;strong&gt;mass resynchronization&lt;/strong&gt;, permitting multiple incoming slaves to join the same sync. This is important because once the transfer starts, newly arriving replicas cannot be served and are queued for the next RDB transfer, so the server waits for the delay to let more replicas arrive. The default is &lt;strong&gt;5&lt;/strong&gt; seconds.&lt;/p&gt;
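&lt;p&gt;For reference, the relevant &lt;code&gt;redis.conf&lt;/code&gt; directives look like this (in the Redis versions of this era, diskless sync is disabled by default and must be switched on explicitly):&lt;/p&gt;

```conf
# Stream the RDB straight to the slaves' sockets instead of
# writing it to disk first.
repl-diskless-sync yes

# Wait this many seconds before starting the transfer, so that more
# slaves can arrive and share the same child process (default: 5).
repl-diskless-sync-delay 5
```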

&lt;p&gt;To facilitate this, the I/O code was redesigned to serve a multitude of file descriptors concurrently; antirez devised the algorithm that resolves the problem. Moreover, to parallelize the data transfer even when blocking I/O is used, the code writes a small amount of data to each socket descriptor in a loop, so that the kernel sends packets to multiple slaves concurrently:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;while(len) {
        size_t count = len &amp;lt; 1024 ? len : 1024;
        int broken = 0;
        for (j = 0; j &amp;lt; r-&amp;gt;io.fdset.numfds; j++) {
            … error checking removed …

            /* Make sure to write 'count' bytes to the socket regardless
             * of short writes. */
            size_t nwritten = 0;
            while(nwritten != count) {
                retval = write(r-&amp;gt;io.fdset.fds[j],p+nwritten,count-nwritten);
                if (retval &amp;lt;= 0) {
                     … error checking removed …
                }
                nwritten += retval;
            }
        }
        p += count;
        len -= count;
        r-&amp;gt;io.fdset.pos += count;
        … more error checking removed …
    }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;Handling partial failures&lt;/h3&gt;

&lt;p&gt;Writing to file descriptors isn't the only dimension of this problem. A big chunk of it lies in handling a bunch of slaves without blocking the process for other incoming slaves.&lt;/p&gt;

&lt;p&gt;When the RDB transfer ends, the child needs to report back which slaves have received the RDB and can continue with the replication streaming process. The child process returns an array of slave IDs and their associated error states, enabling the parent process to log the error states of the slaves.&lt;/p&gt;

&lt;h2&gt;Caveats&lt;/h2&gt;

&lt;p&gt;The apparent problem with diskless replication is that writing to disks differs from writing to sockets.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The API is different, since the Redis Database Backup code conventionally writes to C file pointers, while our situation demands writing directly to socket descriptors.&lt;/li&gt;
&lt;li&gt;Disk writes rarely fail, short of hard I/O errors (a full disk, and so on). Sockets are a different ball game altogether: writes can be delayed because the receiver is slow and the local kernel buffer is full.&lt;/li&gt;
&lt;li&gt;Timeouts are an ever-present concern with sockets: the receiving end may stop receiving packets because of a breakdown, or the TCP connection may simply be dead.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;According to &lt;strong&gt;Salvatore Sanfilippo&lt;/strong&gt; (aka &lt;strong&gt;antirez&lt;/strong&gt;), the author of Redis, there were two options in front of him to mitigate the issue.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Generate the RDB file in memory and then perform the transfer.&lt;/li&gt;
&lt;li&gt;Write to the sockets directly and incrementally, as the RDB is being generated.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The first option was riskier, as it carried the overhead of excessive memory consumption. The feature had to target environments with slow disks but fast, high-bandwidth networks, without consuming too much memory. Hence the second option was selected.&lt;/p&gt;

&lt;h2&gt;Morphing the Redis replication landscape&lt;/h2&gt;

&lt;p&gt;The idea of replication without persistence can seem intimidating, but Redis pulled it off. Supporting replication without touching the disk removes an undesirable storage moving part; as we all know, disk I/O is slow and sluggish. Implementing this in our Kubernetes ecosystem has brought significant improvements in our I/O and caching metrics and has made our Redis deployments leaner and meaner.&lt;/p&gt;

&lt;p&gt;Here is the &lt;a href="https://deepsource.io/blog/redis-diskless-replication/" rel="noopener noreferrer"&gt;link&lt;/a&gt; to the actual post.&lt;/p&gt;

</description>
      <category>redis</category>
      <category>devops</category>
      <category>systems</category>
      <category>infrastructure</category>
    </item>
    <item>
      <title>How to setup Vault with Kubernetes</title>
      <dc:creator>Souvik Dey</dc:creator>
      <pubDate>Tue, 18 Feb 2020 06:26:32 +0000</pubDate>
      <link>https://dev.to/deepsource/how-to-setup-vault-with-kubernetes-ig9</link>
      <guid>https://dev.to/deepsource/how-to-setup-vault-with-kubernetes-ig9</guid>
      <description>&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--qOiwNoGZ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://assets.deepsource.io/3299902/images/blog/secrets-vault/hero.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--qOiwNoGZ--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://assets.deepsource.io/3299902/images/blog/secrets-vault/hero.png" alt="hero"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In our not-so-ideal world, we tend to leave our application secrets (passwords, API tokens) exposed in our source code. Storing secrets in plain sight isn't such a good idea, is it? At DeepSource, we have embraced the issue by incorporating a robust secrets management system into our infrastructure from day one. This post explains how to set up secrets management in Kubernetes with HashiCorp Vault.&lt;/p&gt;

&lt;h2&gt;What is Vault?&lt;/h2&gt;

&lt;p&gt;Vault acts as a centrally managed service that handles encryption and storage of your entire infrastructure's secrets. Vault manages all secrets in secrets engines. It has a suite of secrets engines at its disposal, but for the sake of brevity, we will stick to the kv (key-value) secrets engine.&lt;/p&gt;

&lt;h2&gt;Overview&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--3viFQhAY--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://deepsource.io/images/blog/secrets-vault/vault-consul-cluster.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--3viFQhAY--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://deepsource.io/images/blog/secrets-vault/vault-consul-cluster.png" alt="Vault-Consul-Cluster"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The above design depicts a three-node Vault cluster with one active node, two standby nodes, and a Consul agent sidecar that talks on behalf of each Vault node to the five-node Consul server cluster. The architecture can also be extended to multiple availability zones, rendering your cluster highly fault-tolerant.&lt;/p&gt;

&lt;p&gt;You might be wondering why we are using a Consul server when the architecture is already complex enough to wrap your head around. Vault requires a backend to store all encrypted data at rest; this can be a filesystem backend, a cloud provider, a database, or a Consul cluster.&lt;/p&gt;

&lt;p&gt;The strength of Consul is that it is fault-tolerant and highly scalable. By using Consul as a backend to Vault, you get the best of both. Consul is used for durable storage of encrypted data at rest and provides coordination so that Vault can be highly available and fault-tolerant. Vault provides higher-level policy management, secret leasing, audit logging, and automatic revocation.&lt;/p&gt;

&lt;p&gt;The client talks to the Vault server over HTTPS; the Vault server processes the requests and forwards them to the Consul agent on a loopback address. The Consul client agents serve as an interface to the Consul servers, are very lightweight, and maintain very little state of their own. The Consul servers store the secrets encrypted at rest.&lt;/p&gt;

&lt;p&gt;The Consul server cluster is deliberately odd-numbered: the consensus protocol requires a majority of servers to maintain consistency and fault tolerance. The consensus protocol is based on &lt;a href="https://raft.github.io/raft.pdf"&gt;Raft: In Search of an Understandable Consensus Algorithm&lt;/a&gt;. For a visual explanation of Raft, you can refer to &lt;a href="http://thesecretlivesofdata.com/raft/"&gt;The Secret Lives of Data&lt;/a&gt;.&lt;/p&gt;
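&lt;p&gt;The arithmetic behind the odd-numbered cluster is simple, as this sketch shows:&lt;/p&gt;

```python
def quorum(cluster_size):
    """Majority of servers Raft needs in order to make progress."""
    return cluster_size // 2 + 1

def fault_tolerance(cluster_size):
    """Servers that can fail while a majority still survives."""
    return (cluster_size - 1) // 2

# A five-node Consul cluster needs three servers for quorum and
# tolerates two failures. A four-node cluster tolerates only one
# failure, no more than a three-node cluster does, which is why
# even cluster sizes buy nothing extra.
```

&lt;p&gt;This is why three- and five-node Consul server clusters are the common recommendation.&lt;/p&gt;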

&lt;h2&gt;Vault on Kubernetes - easing your way out of operational complexities&lt;/h2&gt;

&lt;p&gt;Almost all of DeepSource's infrastructure runs on Kubernetes. From analysis runs to VPN infrastructure, everything runs in a highly distributed environment, and Kubernetes helps us achieve that. For setting up Vault on Kubernetes, HashiCorp highly recommends using the Helm charts for Vault and Consul rather than hand-written manifests.&lt;/p&gt;

&lt;h3&gt;Prerequisites&lt;/h3&gt;

&lt;p&gt;For this setup, we'll require &lt;a href="https://kubernetes.io/docs/tasks/tools/install-kubectl/"&gt;kubectl&lt;/a&gt; and &lt;a href="https://helm.sh/docs/intro/install/"&gt;helm&lt;/a&gt; installed, along with a local &lt;a href="https://kubernetes.io/docs/tasks/tools/install-minikube/"&gt;minikube&lt;/a&gt; setup to deploy into.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;kubectl version
Client Version: version.Info&lt;span class="o"&gt;{&lt;/span&gt;Major:&lt;span class="s2"&gt;"1"&lt;/span&gt;, Minor:&lt;span class="s2"&gt;"16"&lt;/span&gt;, GitVersion:&lt;span class="s2"&gt;"v1.16.3"&lt;/span&gt;, GitCommit:&lt;span class="s2"&gt;"b3cbbae08ec52a7fc73d334838e18d17e8512749"&lt;/span&gt;, GitTreeState:&lt;span class="s2"&gt;"clean"&lt;/span&gt;, BuildDate:&lt;span class="s2"&gt;"2019-11-14T04:24:29Z"&lt;/span&gt;, GoVersion:&lt;span class="s2"&gt;"go1.12.13"&lt;/span&gt;, Compiler:&lt;span class="s2"&gt;"gc"&lt;/span&gt;, Platform:&lt;span class="s2"&gt;"darwin/amd64"&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
Server Version: version.Info&lt;span class="o"&gt;{&lt;/span&gt;Major:&lt;span class="s2"&gt;"1"&lt;/span&gt;, Minor:&lt;span class="s2"&gt;"14+"&lt;/span&gt;, GitVersion:&lt;span class="s2"&gt;"v1.14.8-gke.33"&lt;/span&gt;, GitCommit:&lt;span class="s2"&gt;"2c6d0ee462cee7609113bf9e175c107599d5213f"&lt;/span&gt;, GitTreeState:&lt;span class="s2"&gt;"clean"&lt;/span&gt;, BuildDate:&lt;span class="s2"&gt;"2020-01-15T17:47:46Z"&lt;/span&gt;, GoVersion:&lt;span class="s2"&gt;"go1.12.11b4"&lt;/span&gt;, Compiler:&lt;span class="s2"&gt;"gc"&lt;/span&gt;, Platform:&lt;span class="s2"&gt;"linux/amd64"&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;





&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;helm version
version.BuildInfo&lt;span class="o"&gt;{&lt;/span&gt;Version:&lt;span class="s2"&gt;"v3.0.1"&lt;/span&gt;, GitCommit:&lt;span class="s2"&gt;"7c22ef9ce89e0ebeb7125ba2ebf7d421f3e82ffa"&lt;/span&gt;, GitTreeState:&lt;span class="s2"&gt;"clean"&lt;/span&gt;, GoVersion:&lt;span class="s2"&gt;"go1.13.4"&lt;/span&gt;&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;





&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;minikube version
minikube version: v1.5.2
commit: 792dbf92a1de583fcee76f8791cff12e0c9440ad
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h3&gt;
  
  
  The setup
&lt;/h3&gt;

&lt;p&gt;Let's get minikube up and running.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;minikube start &lt;span class="nt"&gt;--memory&lt;/span&gt; 4096
😄  minikube v1.5.2 on Darwin 10.15.2
✨  Automatically selected the &lt;span class="s1"&gt;'hyperkit'&lt;/span&gt; driver &lt;span class="o"&gt;(&lt;/span&gt;alternates: &lt;span class="o"&gt;[&lt;/span&gt;virtualbox]&lt;span class="o"&gt;)&lt;/span&gt;
🔥  Creating hyperkit VM &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;CPUs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;2, &lt;span class="nv"&gt;Memory&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;4096MB, &lt;span class="nv"&gt;Disk&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;20000MB&lt;span class="o"&gt;)&lt;/span&gt; ...
🐳  Preparing Kubernetes v1.16.2 on Docker &lt;span class="s1"&gt;'18.09.9'&lt;/span&gt; ...
🚜  Pulling images ...
🚀  Launching Kubernetes ...
⌛  Waiting &lt;span class="k"&gt;for&lt;/span&gt;: apiserver
🏄  Done! kubectl is now configured to use &lt;span class="s2"&gt;"minikube"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;--memory&lt;/code&gt; flag is set to 4096 MB to ensure there is enough memory for all the resources to be deployed. The initialization process takes several minutes, as it retrieves the necessary dependencies and downloads multiple container images.&lt;/p&gt;

&lt;p&gt;Verify the status of your Minikube cluster,&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;minikube status
host: Running
kubelet: Running
apiserver: Running
kubeconfig: Configured
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;host&lt;/code&gt;, &lt;code&gt;kubelet&lt;/code&gt;, and &lt;code&gt;apiserver&lt;/code&gt; should all report &lt;code&gt;Running&lt;/code&gt;. On success, &lt;code&gt;kubectl&lt;/code&gt; is automatically configured to communicate with the newly started cluster.&lt;/p&gt;
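&lt;p&gt;To double-check that &lt;code&gt;kubectl&lt;/code&gt; is pointing at the right cluster, you can print the active context (a quick sanity check; the output below assumes the minikube cluster started above):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;$ kubectl config current-context
minikube
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;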

&lt;p&gt;The recommended way to run Vault on Kubernetes is with the &lt;a href="https://github.com/hashicorp/vault-helm"&gt;Helm chart&lt;/a&gt;. This installs and configures all the necessary components to run Vault in several different modes. Let's install the Vault Helm chart (this post deploys version 0.3.0), with pods prefixed with the name &lt;code&gt;vault&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;helm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--name&lt;/span&gt; vault &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--set&lt;/span&gt; &lt;span class="s2"&gt;"server.dev.enabled=true"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    https://github.com/hashicorp/vault-helm/archive/v0.3.0.tar.gz
NAME:   vault
LAST DEPLOYED: Fri Feb 8 11:56:33 2020
NAMESPACE: default
STATUS: DEPLOYED

RESOURCES:
..

NOTES:
..

Your release is named vault. To learn more about the release, try:

  &lt;span class="nv"&gt;$ &lt;/span&gt;helm status vault
  &lt;span class="nv"&gt;$ &lt;/span&gt;helm get vault
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;To verify, get all the pods within the default namespace:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;kubectl get pods
NAME                                    READY   STATUS    RESTARTS   AGE
vault-0                                 1/1     Running   0          80s
vault-agent-injector-5945fb98b5-tpglz   1/1     Running   0          80s
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h4&gt;
  
  
  Creating a secret
&lt;/h4&gt;

&lt;p&gt;The applications deployed in later steps expect Vault to store a username and password at the path &lt;code&gt;internal/database/config&lt;/code&gt;. Creating this secret requires enabling a kv secrets engine and writing a username and password to that path.&lt;/p&gt;

&lt;p&gt;Start an interactive shell session on the &lt;code&gt;vault-0&lt;/code&gt; pod:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;kubectl &lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="nt"&gt;-it&lt;/span&gt; vault-0 /bin/sh
/ &lt;span class="err"&gt;$&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Your system prompt is replaced with a new prompt &lt;code&gt;/ $&lt;/code&gt;. Commands issued at this prompt are executed on the &lt;code&gt;vault-0&lt;/code&gt; container.&lt;/p&gt;

&lt;p&gt;Enable kv-v2 secrets at the path internal:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;/ &lt;span class="nv"&gt;$ &lt;/span&gt;vault secrets &lt;span class="nb"&gt;enable&lt;/span&gt; &lt;span class="nt"&gt;-path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;internal kv-v2
Success! Enabled the kv-v2 secrets engine at: internal/
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Add a username and password secret at the path &lt;code&gt;internal/database/config&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;vault kv put internal/database/config &lt;span class="nv"&gt;username&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"db-readonly-username"&lt;/span&gt; &lt;span class="nv"&gt;password&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"db-secret-password"&lt;/span&gt;
Key              Value
&lt;span class="nt"&gt;---&lt;/span&gt;              &lt;span class="nt"&gt;-----&lt;/span&gt;
created_time     2019-12-20T18:17:01.719862753Z
deletion_time    n/a
destroyed        &lt;span class="nb"&gt;false
&lt;/span&gt;version          1
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Verify that the secret is defined at the path &lt;code&gt;internal/database/config&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;vault kv get internal/database/config
&lt;span class="o"&gt;======&lt;/span&gt; Metadata &lt;span class="o"&gt;======&lt;/span&gt;
Key              Value
&lt;span class="nt"&gt;---&lt;/span&gt;              &lt;span class="nt"&gt;-----&lt;/span&gt;
created_time     2019-12-20T18:17:50.930264759Z
deletion_time    n/a
destroyed        &lt;span class="nb"&gt;false
&lt;/span&gt;version          1

&lt;span class="o"&gt;======&lt;/span&gt; Data &lt;span class="o"&gt;======&lt;/span&gt;
Key         Value
&lt;span class="nt"&gt;---&lt;/span&gt;         &lt;span class="nt"&gt;-----&lt;/span&gt;
password    db-secret-password
username    db-readonly-username
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h4&gt;
  
  
  Make Kubernetes familiar for Vault
&lt;/h4&gt;

&lt;p&gt;Vault provides a &lt;a href="https://www.vaultproject.io/docs/auth/kubernetes.html"&gt;Kubernetes authentication&lt;/a&gt; method that enables clients to authenticate with a Kubernetes Service Account Token.&lt;/p&gt;

&lt;p&gt;Enable the Kubernetes authentication method:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;/ &lt;span class="nv"&gt;$ &lt;/span&gt;vault auth &lt;span class="nb"&gt;enable &lt;/span&gt;kubernetes
Success! Enabled kubernetes auth method at: kubernetes/
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Vault accepts this service token from any client within the Kubernetes cluster. During authentication, Vault verifies that the service account token is valid by querying a configured Kubernetes endpoint.&lt;/p&gt;

&lt;p&gt;Configure the Kubernetes authentication method to use the service account token, the location of the Kubernetes host, and its certificate:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;/ &lt;span class="nv"&gt;$ &lt;/span&gt;vault write auth/kubernetes/config &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nv"&gt;token_reviewer_jwt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;cat&lt;/span&gt; /var/run/secrets/kubernetes.io/serviceaccount/token&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nv"&gt;kubernetes_host&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"https://&lt;/span&gt;&lt;span class="nv"&gt;$KUBERNETES_PORT_443_TCP_ADDR&lt;/span&gt;&lt;span class="s2"&gt;:443"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nv"&gt;kubernetes_ca_cert&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;@/var/run/secrets/kubernetes.io/serviceaccount/ca.crt
Success! Data written to: auth/kubernetes/config
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;token_reviewer_jwt&lt;/code&gt; and &lt;code&gt;kubernetes_ca_cert&lt;/code&gt; values reference files written to the container by Kubernetes. The environment variable &lt;code&gt;KUBERNETES_PORT_443_TCP_ADDR&lt;/code&gt; references the internal network address of the Kubernetes host. For a client to read the secret data written earlier to &lt;code&gt;internal/database/config&lt;/code&gt;, it must be granted the read capability on the path &lt;code&gt;internal/data/database/config&lt;/code&gt; (the kv-v2 secrets engine inserts &lt;code&gt;data/&lt;/code&gt; into the API path).&lt;/p&gt;

&lt;p&gt;Write a policy named &lt;code&gt;internal-app&lt;/code&gt; that grants the read capability for secrets at the path &lt;code&gt;internal/data/database/config&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;/ &lt;span class="nv"&gt;$ &lt;/span&gt;vault policy write internal-app - &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="no"&gt;EOH&lt;/span&gt;&lt;span class="sh"&gt;
path "internal/data/database/config" {
  capabilities = ["read"]
}
&lt;/span&gt;&lt;span class="no"&gt;EOH
&lt;/span&gt;Success! Uploaded policy: internal-app
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Now, create a Kubernetes authentication role named &lt;code&gt;internal-app&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;/ &lt;span class="nv"&gt;$ &lt;/span&gt;vault write auth/kubernetes/role/internal-app &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nv"&gt;bound_service_account_names&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;internal-app &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nv"&gt;bound_service_account_namespaces&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;default &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nv"&gt;policies&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;internal-app &lt;span class="se"&gt;\&lt;/span&gt;
        &lt;span class="nv"&gt;ttl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;24h
Success! Data written to: auth/kubernetes/role/internal-app
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The role connects the Kubernetes service account, &lt;code&gt;internal-app&lt;/code&gt;, and namespace, &lt;code&gt;default&lt;/code&gt;, with the Vault policy, &lt;code&gt;internal-app&lt;/code&gt;. The tokens returned after authentication are valid for &lt;strong&gt;24&lt;/strong&gt; hours.&lt;/p&gt;
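&lt;p&gt;To sanity-check the role before wiring up a workload, you can authenticate manually against the login endpoint. This is a hedged sketch: it must be run from a pod whose service account is &lt;code&gt;internal-app&lt;/code&gt; (not from &lt;code&gt;vault-0&lt;/code&gt;, which runs under the &lt;code&gt;vault&lt;/code&gt; service account), and a successful response returns a client token carrying the &lt;code&gt;internal-app&lt;/code&gt; policy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;# Run inside a pod that uses the internal-app service account
$ vault write auth/kubernetes/login \
        role=internal-app \
        jwt="$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)"
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;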

&lt;p&gt;Lastly, exit the vault-0 pod:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;/ &lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;exit&lt;/span&gt;
&lt;span class="err"&gt;$&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;h3&gt;
  
  
  Create a Kubernetes service account
&lt;/h3&gt;

&lt;p&gt;The Vault Kubernetes authentication role defined a Kubernetes service account named &lt;code&gt;internal-app&lt;/code&gt;. This service account does not yet exist.&lt;/p&gt;

&lt;p&gt;List the service accounts currently present in the cluster; &lt;code&gt;internal-app&lt;/code&gt; is not among them yet:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;kubectl get serviceaccounts
NAME                   SECRETS   AGE
default                1         43m
vault                  1         34m
vault-agent-injector   1         34m
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Apply the service account definition to create it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;kubectl apply &lt;span class="nt"&gt;--filename&lt;/span&gt; service-account-internal-app.yml
serviceaccount/internal-app created
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Verify that the service account has been created. The name of the service account aligns with the value assigned to the &lt;code&gt;bound_service_account_names&lt;/code&gt; field when the &lt;code&gt;internal-app&lt;/code&gt; role was created during the Kubernetes authentication configuration.&lt;/p&gt;
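&lt;p&gt;To verify, list the service accounts again; the newly created &lt;code&gt;internal-app&lt;/code&gt; account should now appear (illustrative output; your &lt;code&gt;AGE&lt;/code&gt; values will differ):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;$ kubectl get serviceaccounts
NAME                   SECRETS   AGE
default                1         52m
internal-app           1         9s
vault                  1         43m
vault-agent-injector   1         43m
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;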
&lt;h4&gt;
  
  
  Secret Injection from sidecar to application
&lt;/h4&gt;

&lt;p&gt;View the deployment for the orgchart application:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;cat &lt;/span&gt;deployment-01-orgchart.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: orgchart
  labels:
    app: vault-agent-injector-demo
spec:
  selector:
    matchLabels:
      app: vault-agent-injector-demo
  replicas: 1
  template:
    metadata:
      annotations:
      labels:
        app: vault-agent-injector-demo
    spec:
      serviceAccountName: internal-app
      containers:
        - name: orgchart
          image: jweissig/app:0.0.1
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The name of the new deployment is &lt;code&gt;orgchart&lt;/code&gt;. The &lt;code&gt;spec.template.spec.serviceAccountName&lt;/code&gt; defines the service account &lt;code&gt;internal-app&lt;/code&gt; to run this container under.&lt;/p&gt;

&lt;p&gt;Apply the deployment defined in &lt;code&gt;deployment-01-orgchart.yml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;kubectl apply &lt;span class="nt"&gt;--filename&lt;/span&gt; deployment-01-orgchart.yml
deployment.apps/orgchart created
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The application runs as a pod within the &lt;code&gt;default&lt;/code&gt; namespace.&lt;/p&gt;

&lt;p&gt;Get all the pods within the &lt;code&gt;default&lt;/code&gt; namespace:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;kubectl get pods
NAME                                    READY   STATUS    RESTARTS   AGE
orgchart-69697d9598-l878s               1/1     Running   0          18s
vault-0                                 1/1     Running   0          58m
vault-agent-injector-5945fb98b5-tpglz   1/1     Running   0          58m
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The Vault-Agent injector looks for deployments that define specific annotations. None of these annotations exist within the current deployment. This means that no secrets are present on the orgchart container within the &lt;code&gt;orgchart-69697d9598-l878s&lt;/code&gt; pod.&lt;/p&gt;

&lt;p&gt;Verify that no secrets are written to the &lt;code&gt;orgchart&lt;/code&gt; container in the &lt;code&gt;orgchart-69697d9598-l878s&lt;/code&gt; pod:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;kubectl &lt;span class="nb"&gt;exec &lt;/span&gt;orgchart-69697d9598-l878s &lt;span class="nt"&gt;--container&lt;/span&gt; orgchart &lt;span class="nt"&gt;--&lt;/span&gt; &lt;span class="nb"&gt;ls&lt;/span&gt; /vault/secrets
&lt;span class="nb"&gt;ls&lt;/span&gt;: /vault/secrets: No such file or directory
&lt;span class="nb"&gt;command &lt;/span&gt;terminated with &lt;span class="nb"&gt;exit &lt;/span&gt;code 1
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The deployment is running the pod with the internal-app Kubernetes service account in the default namespace. The Vault Agent injector only modifies a deployment if it contains a very specific set of annotations. An existing deployment may have its definition patched to include the necessary annotations.&lt;/p&gt;

&lt;p&gt;View the deployment patch &lt;code&gt;deployment-02-inject-secrets.yml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;&lt;span class="nb"&gt;cat &lt;/span&gt;deployment-02-inject-secrets.yml
spec:
  template:
    metadata:
      annotations:
        vault.hashicorp.com/agent-inject: &lt;span class="s2"&gt;"true"&lt;/span&gt;
        vault.hashicorp.com/role: &lt;span class="s2"&gt;"internal-app"&lt;/span&gt;
        vault.hashicorp.com/agent-inject-secret-database-config.txt: &lt;span class="s2"&gt;"internal/data/database/config"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;These &lt;a href="https://www.vaultproject.io/docs/platform/k8s/injector/index.html#annotations"&gt;annotations&lt;/a&gt; define a partial structure of the deployment schema and are prefixed with &lt;code&gt;vault.hashicorp.com&lt;/code&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;agent-inject&lt;/code&gt; enables the Vault Agent injector service&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;role&lt;/code&gt; is the Vault Kubernetes authentication role, which maps back to the Kubernetes service account&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;agent-inject-secret-FILEPATH&lt;/code&gt; prefixes the name of the file, &lt;code&gt;database-config.txt&lt;/code&gt;, written to &lt;code&gt;/vault/secrets&lt;/code&gt;. The value is the path to the secret defined in Vault.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Patch the orgchart deployment defined in &lt;code&gt;deployment-02-inject-secrets.yml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;kubectl patch deployment orgchart &lt;span class="nt"&gt;--patch&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;cat &lt;/span&gt;deployment-02-inject-secrets.yml&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
deployment.apps/orgchart patched
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The new pod launches two containers: the application container, named &lt;code&gt;orgchart&lt;/code&gt;, and the Vault Agent sidecar container, named &lt;code&gt;vault-agent&lt;/code&gt;.&lt;/p&gt;
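&lt;p&gt;You can confirm this by listing the pods again; the patched deployment rolls out a new pod that reports two ready containers (illustrative output; the pod name suffix and ages will differ):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;$ kubectl get pods
NAME                                    READY   STATUS    RESTARTS   AGE
orgchart-599cb74d9c-s8hhm               2/2     Running   0          23s
vault-0                                 1/1     Running   0          78m
vault-agent-injector-5945fb98b5-tpglz   1/1     Running   0          78m
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;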

&lt;p&gt;View the logs of the vault-agent container in the &lt;code&gt;orgchart-599cb74d9c-s8hhm&lt;/code&gt; pod:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;kubectl logs orgchart-599cb74d9c-s8hhm &lt;span class="nt"&gt;--container&lt;/span&gt; vault-agent
&lt;span class="o"&gt;==&amp;gt;&lt;/span&gt; Vault server started! Log data will stream &lt;span class="k"&gt;in &lt;/span&gt;below:

&lt;span class="o"&gt;==&amp;gt;&lt;/span&gt; Vault agent configuration:

                     Cgo: disabled
               Log Level: info
                 Version: Vault v1.3.1

2019-12-20T19:52:36.658Z &lt;span class="o"&gt;[&lt;/span&gt;INFO]  sink.file: creating file sink
2019-12-20T19:52:36.659Z &lt;span class="o"&gt;[&lt;/span&gt;INFO]  sink.file: file sink configured: &lt;span class="nv"&gt;path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/home/vault/.token &lt;span class="nv"&gt;mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nt"&gt;-rw-r-----&lt;/span&gt;
2019-12-20T19:52:36.659Z &lt;span class="o"&gt;[&lt;/span&gt;INFO]  template.server: starting template server
2019/12/20 19:52:36.659812 &lt;span class="o"&gt;[&lt;/span&gt;INFO] &lt;span class="o"&gt;(&lt;/span&gt;runner&lt;span class="o"&gt;)&lt;/span&gt; creating new runner &lt;span class="o"&gt;(&lt;/span&gt;dry: &lt;span class="nb"&gt;false&lt;/span&gt;, once: &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
2019/12/20 19:52:36.660237 &lt;span class="o"&gt;[&lt;/span&gt;INFO] &lt;span class="o"&gt;(&lt;/span&gt;runner&lt;span class="o"&gt;)&lt;/span&gt; creating watcher
2019-12-20T19:52:36.660Z &lt;span class="o"&gt;[&lt;/span&gt;INFO]  auth.handler: starting auth handler
2019-12-20T19:52:36.660Z &lt;span class="o"&gt;[&lt;/span&gt;INFO]  auth.handler: authenticating
2019-12-20T19:52:36.660Z &lt;span class="o"&gt;[&lt;/span&gt;INFO]  sink.server: starting sink server
2019-12-20T19:52:36.679Z &lt;span class="o"&gt;[&lt;/span&gt;INFO]  auth.handler: authentication successful, sending token to sinks
2019-12-20T19:52:36.680Z &lt;span class="o"&gt;[&lt;/span&gt;INFO]  auth.handler: starting renewal process
2019-12-20T19:52:36.681Z &lt;span class="o"&gt;[&lt;/span&gt;INFO]  sink.file: token written: &lt;span class="nv"&gt;path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/home/vault/.token
2019-12-20T19:52:36.681Z &lt;span class="o"&gt;[&lt;/span&gt;INFO]  template.server: template server received new token
2019/12/20 19:52:36.681133 &lt;span class="o"&gt;[&lt;/span&gt;INFO] &lt;span class="o"&gt;(&lt;/span&gt;runner&lt;span class="o"&gt;)&lt;/span&gt; stopping
2019/12/20 19:52:36.681160 &lt;span class="o"&gt;[&lt;/span&gt;INFO] &lt;span class="o"&gt;(&lt;/span&gt;runner&lt;span class="o"&gt;)&lt;/span&gt; creating new runner &lt;span class="o"&gt;(&lt;/span&gt;dry: &lt;span class="nb"&gt;false&lt;/span&gt;, once: &lt;span class="nb"&gt;false&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
2019/12/20 19:52:36.681285 &lt;span class="o"&gt;[&lt;/span&gt;INFO] &lt;span class="o"&gt;(&lt;/span&gt;runner&lt;span class="o"&gt;)&lt;/span&gt; creating watcher
2019/12/20 19:52:36.681342 &lt;span class="o"&gt;[&lt;/span&gt;INFO] &lt;span class="o"&gt;(&lt;/span&gt;runner&lt;span class="o"&gt;)&lt;/span&gt; starting
2019-12-20T19:52:36.692Z &lt;span class="o"&gt;[&lt;/span&gt;INFO]  auth.handler: renewed auth token
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Vault Agent manages the token lifecycle and the secret retrieval. The secret is rendered in the orgchart container at the path &lt;code&gt;/vault/secrets/database-config.txt&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Finally, view the secret written to the orgchart container:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;kubectl &lt;span class="nb"&gt;exec &lt;/span&gt;orgchart-599cb74d9c-s8hhm &lt;span class="nt"&gt;--container&lt;/span&gt; orgchart &lt;span class="nt"&gt;--&lt;/span&gt; &lt;span class="nb"&gt;cat&lt;/span&gt; /vault/secrets/database-config.txt
data: map[password:db-secret-password username:db-readonly-user]
metadata: map[created_time:2019-12-20T18:17:50.930264759Z deletion_time: destroyed:false version:2]
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The secret is now present in the container. Injected secrets can further be templatized to suit the application's needs.&lt;/p&gt;
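&lt;p&gt;As a sketch of that templating, the injector also supports an &lt;code&gt;agent-inject-template-FILEPATH&lt;/code&gt; annotation, which accepts a Consul Template snippet controlling how the rendered file looks. The patch below is illustrative (the &lt;code&gt;postgres:5432/wizard&lt;/code&gt; connection target is a hypothetical example) and would render the same secret as a connection string:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight yaml"&gt;&lt;code&gt;spec:
  template:
    metadata:
      annotations:
        vault.hashicorp.com/agent-inject: "true"
        vault.hashicorp.com/role: "internal-app"
        vault.hashicorp.com/agent-inject-secret-database-config.txt: "internal/data/database/config"
        vault.hashicorp.com/agent-inject-template-database-config.txt: |
          {{- with secret "internal/data/database/config" -}}
          postgresql://{{ .Data.data.username }}:{{ .Data.data.password }}@postgres:5432/wizard
          {{- end -}}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;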

&lt;p&gt;By now, you should have a clear sense of Vault's significance in a highly dynamic, cloud-native infrastructure: it removes the operational overhead of managing application and service secrets, and it lets your infrastructure scale gracefully.&lt;/p&gt;

&lt;p&gt;Here is the &lt;a href="https://deepsource.io/blog/setup-vault-kubernetes/"&gt;link&lt;/a&gt; to the original post.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>security</category>
      <category>architecture</category>
      <category>devops</category>
    </item>
  </channel>
</rss>
