<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Joshua Varghese</title>
    <description>The latest articles on DEV Community by Joshua Varghese (@joshuabvarghese).</description>
    <link>https://dev.to/joshuabvarghese</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2391526%2Fb63c44d5-d212-44bd-af5a-7684e5e2f257.png</url>
      <title>DEV Community: Joshua Varghese</title>
      <link>https://dev.to/joshuabvarghese</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/joshuabvarghese"/>
    <language>en</language>
    <item>
      <title>HAProxy's Zero-Downtime Reload</title>
      <dc:creator>Joshua Varghese</dc:creator>
      <pubDate>Sat, 09 Nov 2024 12:01:54 +0000</pubDate>
      <link>https://dev.to/joshuabvarghese/haproxys-zero-downtime-reload-312j</link>
      <guid>https://dev.to/joshuabvarghese/haproxys-zero-downtime-reload-312j</guid>
      <description>&lt;h2&gt;
  
  
  HAProxy's Reload Architecture
&lt;/h2&gt;

&lt;p&gt;On reload, HAProxy hands its listening sockets from the old process to the new one instead of closing and rebinding them. New connections keep landing on an open socket during the switch, at the cost of some extra coordination between the two processes.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Master CLI Process
&lt;/h2&gt;

&lt;p&gt;In master-worker mode, HAProxy runs a master process that supervises the workers and orchestrates the reload:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Example of HAProxy's master-worker socket handling */
static int proc_self_pipe[2];

void master_register_worker(struct worker *w) {
    struct listener *listener;

    list_for_each_entry(listener, &amp;amp;w-&amp;gt;listeners, list) {
        /* Transfer listener sockets to the new worker */
        if (listener-&amp;gt;state == LI_READY) {
            listener_transfer_fd(listener, w);
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Socket Transfer Magic
&lt;/h2&gt;

&lt;p&gt;One of the most interesting aspects of HAProxy's implementation is how it handles socket transfers. The process involves several key steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Socket Preparation
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;static int listener_transfer_fd(struct listener *l, struct worker *w) {
    struct cmsg_fd_list fdlist;
    struct msghdr msg;
    struct iovec iov[1];
    int ret;

    /* Prepare socket data structure */
    memset(&amp;amp;msg, 0, sizeof(msg));
    msg.msg_iov = iov;
    msg.msg_iovlen = 1;

    /* Set up control message for FD passing */
    msg.msg_control = &amp;amp;fdlist;
    msg.msg_controllen = sizeof(fdlist);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
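
&lt;p&gt;The descriptor hand-off can be tried outside HAProxy. What follows is a minimal Python sketch, not HAProxy code. It assumes Python 3.9+ on a Unix-like system and relies on &lt;code&gt;socket.send_fds&lt;/code&gt; and &lt;code&gt;socket.recv_fds&lt;/code&gt;, which wrap &lt;code&gt;sendmsg&lt;/code&gt; and &lt;code&gt;recvmsg&lt;/code&gt; with an &lt;code&gt;SCM_RIGHTS&lt;/code&gt; control message. The helper names &lt;code&gt;pass_listener&lt;/code&gt;, &lt;code&gt;receive_listener&lt;/code&gt; and &lt;code&gt;demo&lt;/code&gt; are made up for illustration, and a socket pair inside one process stands in for the channel between the old and new worker.&lt;/p&gt;

```python
import socket


def pass_listener(listener, channel):
    """Send the listener's descriptor over a connected Unix-domain socket."""
    # One byte of ordinary data travels with the descriptor, as stream
    # sockets generally require for ancillary data.
    socket.send_fds(channel, [b"L"], [listener.fileno()])


def receive_listener(channel):
    """Receive one descriptor and wrap it in a socket object."""
    _data, fds, _flags, _addr = socket.recv_fds(channel, 1, 1)
    # With fileno=..., family and type are detected from the descriptor.
    return socket.socket(fileno=fds[0])


def demo():
    """Hand a listening socket across a socket pair and accept on the copy."""
    old_side, new_side = socket.socketpair()
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen()
    address = listener.getsockname()

    pass_listener(listener, old_side)
    inherited = receive_listener(new_side)
    listener.close()  # the "old process" drops its copy; the port stays open

    client = socket.create_connection(address)
    conn, _peer = inherited.accept()
    same_address = inherited.getsockname() == address

    for s in (client, conn, inherited, old_side, new_side):
        s.close()
    return same_address


if __name__ == "__main__":
    print(demo())
```

&lt;p&gt;The point of the sketch is that the received descriptor refers to the same kernel socket, so the sender can close its copy without the port ever going unbound.&lt;/p&gt;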



&lt;ol start="2"&gt;
&lt;li&gt;Graceful Connection Handover&lt;br&gt;
HAProxy ensures existing connections aren't disrupted during the reload:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;void perform_soft_reload(void) {
    /* 1. Listeners are handed over; the old process runs until its work is done */
    while (nb_running_tasks() &amp;gt; 0) {
        /* 2. Process existing connections */
        process_runnable_tasks();

        /* 3. Check if we can stop */
        if (should_exit()) {
            break;
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  State Management During Reload
&lt;/h2&gt;

&lt;p&gt;A reload also has to deal with runtime state. The snippets below are simplified sketches of the aspects involved:&lt;/p&gt;

&lt;h2&gt;
  
  
  Connection State Preservation
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;struct connection {
    unsigned int flags;     /* Status flags */
    enum obj_type *target;  /* What this connection is about */
    void *ctx;             /* Application specific context */
    /* ... other fields ... */
};

/* During transfer */
static void transfer_connection_state(struct connection *conn) {
    /* Save essential connection data */
    struct conn_state state = {
        .flags = conn-&amp;gt;flags,
        .protocol_state = conn-&amp;gt;ctx,
        .ssl_state = conn-&amp;gt;ssl_ctx
    };

    /* Transfer to new process */
    send_state_to_new_process(&amp;amp;state);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  SSL Session Handling
&lt;/h2&gt;

&lt;p&gt;A particularly tricky part is managing SSL sessions during reload:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;static int ssl_session_transfer(SSL *ssl, struct worker *new_worker) {
    unsigned char *session_data;
    unsigned int session_len;

    /* Serialize SSL session */
    session_data = ssl_serialize_session(ssl, &amp;amp;session_len);
    if (!session_data)
        return -1;

    /* Transfer to new process */
    return send_session_to_worker(new_worker, session_data, session_len);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Configuration Validation
&lt;/h2&gt;

&lt;p&gt;Before any reload happens, HAProxy performs extensive configuration validation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;int check_config_validity(char *file) {
    struct proxy *curproxy = NULL;
    int cfgerr = 0;

    /* Parse and validate configuration */
    for (curproxy = proxy_list; curproxy; curproxy = curproxy-&amp;gt;next) {
        /* Check proxy settings */
        cfgerr += proxy_cfg_check(curproxy);

        /* Validate SSL configurations */
        cfgerr += ssl_cfg_check(curproxy);

        /* Check backend server configurations */
        cfgerr += check_backend_cfg(curproxy);
    }

    return cfgerr;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
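
&lt;p&gt;From the operator's side, the same idea is usually a gate in front of the reload. HAProxy's check mode (&lt;code&gt;haproxy -c -f file&lt;/code&gt;) exits with a non-zero status when the configuration is invalid. The Python sketch below builds on that. The function names and the &lt;code&gt;do_reload&lt;/code&gt; callback are hypothetical, and the checker command is a parameter so another command can be substituted.&lt;/p&gt;

```python
import subprocess


def config_is_valid(cfg_path, check_cmd=("haproxy", "-c", "-f")):
    """Return True when the checker exits with status 0 for cfg_path."""
    result = subprocess.run(
        [*check_cmd, cfg_path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,
    )
    return result.returncode == 0


def reload_if_valid(cfg_path, do_reload, check_cmd=("haproxy", "-c", "-f")):
    """Call do_reload() only when the configuration passes the check."""
    if not config_is_valid(cfg_path, check_cmd):
        return False
    do_reload()
    return True
```

&lt;p&gt;Keeping the check separate from the reload means a broken file never replaces a running configuration.&lt;/p&gt;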



&lt;h2&gt;
  
  
  Health Check Continuity
&lt;/h2&gt;

&lt;p&gt;One often-overlooked aspect is maintaining health check states during reload:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;struct check {
    unsigned int status;   /* health check status */
    unsigned int result;   /* test result */
    int code;             /* status code */
    int duration;         /* time it took to get the result */
};

void transfer_health_checks(void) {
    struct server *srv;
    struct check *check;

    /* Iterate through all servers */
    for (srv = servers; srv; srv = srv-&amp;gt;next) {
        check = &amp;amp;srv-&amp;gt;check;

        /* Save health check state */
        if (check-&amp;gt;status &amp;amp; CHK_ST_ENABLED) {
            save_check_state(check);
        }
    }
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
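
&lt;p&gt;On the configuration side, HAProxy offers a server state file for carrying server and check status across a reload. The fragment below is a sketch of that setup. Both paths are example values, not defaults.&lt;/p&gt;

```text
global
    # Runtime API socket; "show servers state" is issued against it
    stats socket /var/run/haproxy.sock mode 600 level admin
    # File the new process reads server states from at startup
    server-state-file /var/lib/haproxy/server-state

defaults
    # Apply the global state file to every backend
    load-server-state-from-file global
```

&lt;p&gt;With this in place, the output of the &lt;code&gt;show servers state&lt;/code&gt; runtime command is written to the state file just before the reload, and the new process starts from those states instead of treating every server as freshly up.&lt;/p&gt;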



&lt;h2&gt;
  
  
  Performance Considerations
&lt;/h2&gt;

&lt;p&gt;HAProxy's reload mechanism is designed with performance in mind. Here are some key optimizations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Minimal Memory Overhead: Only essential state is transferred&lt;/li&gt;
&lt;li&gt;Efficient FD Passing: Listening sockets are handed over through the kernel (SCM_RIGHTS over a Unix socket) rather than closed and rebound&lt;/li&gt;
&lt;li&gt;Progressive Transfer: Connections are handled gradually to avoid spikes&lt;/li&gt;
&lt;/ul&gt;

</description>
    </item>
    <item>
      <title>Understanding Envoy Proxy's Hot Restart Implementation</title>
      <dc:creator>Joshua Varghese</dc:creator>
      <pubDate>Sat, 09 Nov 2024 11:45:16 +0000</pubDate>
      <link>https://dev.to/joshuabvarghese/understanding-envoy-proxys-hot-restart-implementation-a-deep-dive-4a7o</link>
      <guid>https://dev.to/joshuabvarghese/understanding-envoy-proxys-hot-restart-implementation-a-deep-dive-4a7o</guid>
      <description>&lt;p&gt;As modern distributed systems grow in complexity, the ability to update proxy configurations without dropping active connections has become crucial. In this post, I'll break down how Envoy Proxy implements its hot restart mechanism, a feature that allows seamless configuration updates and binary upgrades without disrupting existing connections.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is Hot Restart?
&lt;/h2&gt;

&lt;p&gt;Hot restart (or hot reload) is a mechanism that allows a proxy server to reload its configuration or upgrade its binary while maintaining existing client connections. This is achieved by having the new process take over the listening sockets while the old process keeps serving its existing connections until they drain, so no connections are dropped during the transition.&lt;/p&gt;

&lt;h2&gt;
  
  
  Envoy's Approach
&lt;/h2&gt;

&lt;p&gt;Envoy implements hot restart through a parent-child process model, where the parent process manages the handover of socket descriptors to the new child process. Here's how it works:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Shared Memory Architecture&lt;br&gt;
Envoy uses shared memory to facilitate communication between the old and new processes. This is implemented in the HotRestartImpl class:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class HotRestartImpl {
private:
  static constexpr uint64_t MAX_STAT_SEGMENTS = 256;
  SharedMemory* shmem_;
  Stats::StatDataAllocator* stats_allocator_;
};
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="2"&gt;
&lt;li&gt;Socket Passing Process&lt;br&gt;
The hot restart process follows these key steps:&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Initialize Shared Memory: The parent process creates a shared memory segment that both processes can access.&lt;br&gt;
Socket Duplication: The parent process duplicates its listening sockets.&lt;br&gt;
Graceful Handover: Traffic is gradually transferred to the new process.&lt;/p&gt;

&lt;p&gt;Here's a simplified version of how Envoy handles socket passing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class HotRestartingChild {
public:
  void initialize(int argc, char** argv) {
    // Request parent's listen sockets
    std::vector&amp;lt;int&amp;gt; fds = parent_.retrieveListenSockets();

    // Initialize new server with inherited sockets
    for (int fd : fds) {
      Server::createListenerFromSocket(fd);
    }

    // Signal ready to parent
    parent_.sendReady();
  }
};
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="3"&gt;
&lt;li&gt;Connection Draining&lt;br&gt;
Existing connections are not moved to the new process. The old process stops accepting new ones and lets the existing ones finish before it exits:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;void HotRestartImpl::drainListeners() {
  // 1. Stop accepting new connections
  for (auto&amp;amp; listener : listeners_) {
    listener-&amp;gt;stopAcceptingConnections();
  }


  // 2. Wait for existing connections to complete
  while (hasActiveConnections()) {
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
  }

  // 3. Signal completion to new process
  notifyNewProcess();
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
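
&lt;p&gt;The loop above waits for as long as connections remain. In practice a drain is usually bounded by a deadline, and Envoy's &lt;code&gt;--drain-time-s&lt;/code&gt; and &lt;code&gt;--parent-shutdown-time-s&lt;/code&gt; command-line options exist for that purpose. The Python sketch below shows a bounded drain loop under that assumption. The function name, its parameters, and the injectable clock are illustrative and not part of Envoy.&lt;/p&gt;

```python
import time


def drain(active_connections, drain_timeout_s, poll_interval_s=0.1,
          now=time.monotonic, sleep=time.sleep):
    """Wait until active_connections() returns 0 or the deadline passes.

    Returns True on a clean drain and False when the deadline expired
    with connections still open.
    """
    deadline = now() + drain_timeout_s
    while active_connections() != 0:
        if now() >= deadline:
            return False
        sleep(poll_interval_s)
    return True
```

&lt;p&gt;A caller that gets a false result back can then decide whether to close the remaining connections or keep the old process alive a little longer.&lt;/p&gt;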



&lt;h2&gt;
  
  
  Key Implementation Challenges
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;File Descriptor Handling&lt;br&gt;
Envoy needs to carefully manage file descriptors to ensure they're properly transferred and not leaked:&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Uses SCM_RIGHTS to pass file descriptors between processes&lt;br&gt;
Maintains a registry of active file descriptors&lt;br&gt;
Implements careful cleanup mechanisms&lt;/p&gt;

&lt;ol start="2"&gt;
&lt;li&gt;Connection State Management&lt;br&gt;
The proxy must maintain connection state during the transition:&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;TCP connection parameters&lt;br&gt;
TLS session information&lt;br&gt;
Protocol-specific state (HTTP/2 streams, etc.)&lt;/p&gt;

&lt;ol start="3"&gt;
&lt;li&gt;Configuration Compatibility&lt;br&gt;
Envoy ensures that configuration changes are compatible with existing connections:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;bool HotRestartImpl::validateConfig(
  const envoy::config::bootstrap::v3::Bootstrap&amp;amp; new_config) {
  bool isCompatible = true;
  // Verify that critical fields haven't changed
  // Check listener compatibility
  // Validate cluster configurations
  // (each check above would clear isCompatible on failure)
  return isCompatible;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Diving into Envoy's hot restart implementation has been quite the journey! It's fascinating to see how they've tackled the challenge of swapping out a running proxy without dropping connections. The elegant dance between parent and child processes, the careful handling of file descriptors, and the intricate state management all come together to make this possible.&lt;/p&gt;

&lt;p&gt;What really stands out is how much thought went into making the system robust. It's not just about passing sockets around – it's about handling edge cases, ensuring configuration compatibility, and providing fallback options when things don't go as planned.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Note: This is a high-level overview based on Envoy's open-source implementation. For the most up-to-date and detailed information, please refer to the official Envoy documentation and source: &lt;a href="https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/arch_overview" rel="noopener noreferrer"&gt;https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/arch_overview&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
    </item>
  </channel>
</rss>
